
Create helm chart for Speechoid
Open, Needs Triage · Public · 16 Estimated Story Points

Description

https://wikitech.wikimedia.org/wiki/Deployment_pipeline/Migration/Tutorial#Creating_a_Helm_Chart

How do we define hostnames/IPs for dependencies? For example, wikispeech-server needs to be aware of the other Speechoid services. In the docker-compose setup on wmflabs this is fairly simple. Also, Speechoid might not start up until the dependent services are available. We will probably set up a pod that bundles the complete Speechoid package with all of its services.

See https://github.com/karlwettin/wikispeech-docker-compose/blob/master/docker-compose.yml
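As a first sketch (not the actual chart), each Speechoid component from the docker-compose file could get its own Kubernetes Service, which gives it a stable in-cluster DNS name that wikispeech-server can use instead of the docker-compose hostnames. The name, namespace and labels below are assumptions for illustration; the only concrete value is the MaryTTS default HTTP port (59125).

```yaml
# Illustrative only: one Service per Speechoid component, so that e.g.
# wikispeech-server can reach MaryTTS at
# http://mary-tts.speechoid.svc.cluster.local:59125
apiVersion: v1
kind: Service
metadata:
  name: mary-tts          # hypothetical name, mirroring the docker-compose service
  namespace: speechoid    # hypothetical namespace
spec:
  selector:
    app: mary-tts
  ports:
    - port: 59125         # MaryTTS default HTTP port
      targetPort: 59125
```

If we instead bundle everything into a single pod as suggested above, the containers share a network namespace and reach each other on localhost; the trade-off is that the whole bundle then scales as one unit.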

Dependent services are defined in the wikispeech-server configuration. We will need to modify this to point at the Kubernetes service hostnames instead. See https://github.com/karlwettin/wikispeech-docker-compose/blob/master/compose-files/mockup.conf
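One way to wire this up, sketched with hypothetical value names (the real keys depend on the chart layout and on the wikispeech-server config format shown in mockup.conf), is to template the service URLs through the chart's values.yaml:

```yaml
# Hypothetical values.yaml fragment; key names and ports (other than MaryTTS's
# 59125) are placeholders, not the real wikispeech-server configuration keys.
wikispeech_server:
  services:
    marytts: "http://mary-tts.speechoid.svc.cluster.local:59125"
    symbolset: "http://symbolset.speechoid.svc.cluster.local:8080"   # placeholder port
    pronlex: "http://pronlex.speechoid.svc.cluster.local:8080"       # placeholder port
```

A ConfigMap template in the chart would then render these values into the actual wikispeech-server configuration file.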

We currently have a HAProxy in front of MaryTTS, built into our Blubber setup, acting as a request queue (one request at a time) to avoid overloading the service, since each request maxes out the CPU. Can we configure Kubernetes to do this instead? Also consider what happens when multiple cores are available: MaryTTS behaves differently between environments; on my local Ubuntu machine it uses every available core and maxes them out, while on the wmflabs installation I have only seen it use a single core. This needs consideration prior to deployment.
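Kubernetes has no built-in request queue equivalent to that HAProxy, so the closest native knobs are CPU requests/limits and a controlled replica count; whether that is acceptable instead of a queue is exactly what needs to be discussed with SRE. A minimal sketch, with placeholder numbers and image reference:

```yaml
# Sketch only: cap MaryTTS at one CPU so a single synthesis request cannot
# take over the node, and keep a single replica until a concurrency strategy
# (queueing, readiness-based backpressure or autoscaling) is agreed on.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mary-tts
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mary-tts
  template:
    metadata:
      labels:
        app: mary-tts
    spec:
      containers:
        - name: mary-tts
          image: mary-tts:placeholder   # placeholder image reference
          resources:
            requests:
              cpu: "1"
              memory: 2Gi               # placeholder
            limits:
              cpu: "1"                  # prevents MaryTTS from grabbing every core
              memory: 2Gi               # placeholder
```

Note that a CPU limit only caps how much the container gets; it does not queue requests, so concurrent requests would still contend for that one CPU and simply get slower.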

Event Timeline

kalle updated the task description.

A bit of IRC log from #wikimedia-serviceops:

17:19 < kalle> Does adding dependencies in a Helm chart mean that those dependencies will be bundled as containers running within that same pod?
17:21 < kalle> If so, how would one go about setting up the helm so that some dependency might be running in other pods, potentially automatically scaling the service up and down? Or is it meant that we should scale the one specific pod up and down?
17:22 < kalle> This is regarding the Wikispeech extension. Our backend, speechoid, is a bundle of quite a few services with rather simple dependencies.
17:23 < kalle> But some of the services are rather heavy on CPU, e.g. the speech synthesis.
17:24 < kalle> I was thinking it would make sense to scale those services only, having k8s balance the requests.
17:26 < kalle> Also, we have installed a HAProxy in front of one of the services, to act as a request queue. Only letting in one request at a time since it will consume 100% of the available threads. It feels like we should let k8s handle that when there are multiple instances up and running.
17:27 < kalle> For reference:
17:27 < kalle> https://gerrit.wikimedia.org/r/admin/repos/q/filter:services+wikispeech
17:28 < kalle> https://www.mediawiki.org/wiki/Wikispeech
17:59 < effie> kalle: is there a task related to rolling out speechoid to kubernetes?
18:00 < effie> it would be lovely if we could discuss those details on a task
18:20 < _joe_> kalle: yeah also, new services architectures are usually discussed before getting to the deployment phase with the stakeholders (including SRE)
18:21 < _joe_> maybe that was done. In that case, can you point me to the people you spoke with?
18:21 < _joe_> so that I can get a better idea of how the release was planned
18:23 < _joe_> if not, we will need to take some time to advise you on how to proceed. Horizontal pod autoscaling is not a great way to spawn new workers on demand, unless we are ok with having a lot of latency for individual requests.
18:26 < _joe_> what you probably want to do is to return 503 to the readiness probe while your container is processing a request (or multiple requests if we decide to serve more than one thread from the same pod)
18:26 < _joe_> but again, that won't probably work if not with a small number of incoming requests
22:15 < kalle> effie: https://phabricator.wikimedia.org/T265280
22:16 < kalle> _joe_: We've only talked to releng at this point, making sure they accept how we blubbered things up. 
22:16 < kalle> So this is really the initial ops contact, working our way towards the beta-cluster release.
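To make _joe_'s readiness-probe suggestion above concrete: the pod would expose an endpoint (the hypothetical /ready below, which does not exist in any of the services today) that returns 200 when idle and 503 while a request is being processed, so Kubernetes temporarily removes the pod from the Service and stops routing new requests to it. A sketch of the container-spec fragment:

```yaml
# Hypothetical: assumes the service (or a small sidecar) exposes /ready,
# answering 200 when idle and 503 while a synthesis request is in flight.
readinessProbe:
  httpGet:
    path: /ready          # hypothetical endpoint, would have to be implemented
    port: 59125           # MaryTTS default HTTP port; adjust for a sidecar
  periodSeconds: 2        # probe often so back-pressure kicks in quickly
  failureThreshold: 1
  successThreshold: 1
```

As noted in the log, this only behaves well at low request volumes; beyond that a real queue or more replicas is needed.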
kalle set the point value for this task to 16. (Oct 15 2020, 9:46 AM)
kalle moved this task from 🥴 Backlog to 🤕 Watching on the User-kalle board.
kalle moved this task from Backlog to Blocked on the Wikispeech-Jobrunner (Sprint) board.
Sebastian_Berlin-WMSE changed the task status from Open to Stalled. (Oct 29 2020, 9:12 AM)
Sebastian_Berlin-WMSE changed the task status from Stalled to Open.
Aklapper subscribed.

@kalle: Removing task assignee as this open task has been assigned for more than two years - see the email sent to the task assignee on February 22nd, 2023.
Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome! :)
If this task has been resolved in the meantime, or should not be worked on by anybody ("declined"), please update its task status via "Add Action… 🡒 Change Status".
Also see https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips on how to best manage your individual work in Phabricator. Thanks!