Refactor ORES puppet for Kubernetes
Closed, Invalid (Public)

Description

Our goal is to define the quantum execution unit and allow kubernetes to autoscale.

This task is done when puppet can build a reasonable container for ORES to run from.

Event Timeline

Halfak updated the task description.

Change 396055 had a related patch set uploaded (by Awight; owner: Awight):
[operations/puppet@production] Refactor ORES uWSGI workers to use an absolute count

https://gerrit.wikimedia.org/r/396055

Right now, it seems like we want to have one uwsgi worker per celery worker, because a uwsgi worker will block while a celery worker generates a score.

We'll also want to be able to receive more requests than celery can handle with the current worker pool, so that we can fill up celery's work queue as well. This is our primary means of implementing backpressure. The celery_queue_size should generally be set to something that can be processed in ~10 seconds. Given that the 95th-percentile score generation time is about 2.5 seconds, each worker can clear about 4 queued scores in that window, so our queue should be roughly 4 * total_celery_workers.

total_uwsgi_workers should be at least total_celery_workers + celery_queue_size + 1.
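
To make the arithmetic above concrete, here is a minimal sketch of the sizing rule. The function name, its parameters, and the example worker count of 16 are hypothetical and chosen only for illustration; only the ~10 second drain target, the 2.5 second 95th-percentile figure, and the `+ 1` rule come from the comment above.

```python
import math


def size_ores_pools(total_celery_workers,
                    p95_score_seconds=2.5,
                    queue_drain_seconds=10.0):
    """Sketch of the sizing rule described in this comment (names are hypothetical).

    The queue is sized so that the worker pool can drain it in roughly
    `queue_drain_seconds`, assuming each score takes `p95_score_seconds`.
    """
    # Scores one celery worker can finish within the drain window (10 / 2.5 = 4).
    per_worker_backlog = math.ceil(queue_drain_seconds / p95_score_seconds)

    # Roughly 4 * total_celery_workers with the numbers quoted above.
    celery_queue_size = per_worker_backlog * total_celery_workers

    # One uwsgi worker per in-flight score, one per queued score, plus one
    # spare so that an over-capacity request can still be received and refused.
    min_uwsgi_workers = total_celery_workers + celery_queue_size + 1

    return {
        "celery_queue_size": celery_queue_size,
        "min_uwsgi_workers": min_uwsgi_workers,
    }


if __name__ == "__main__":
    # Example with an assumed pool of 16 celery workers.
    print(size_ores_pools(16))
```

With the assumed 16 celery workers this gives a queue of 64 and a minimum of 81 uwsgi workers; the real values would depend on the actual pool size and measured score times.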

Change 396055 abandoned by Awight:
Refactor ORES uWSGI workers to use an absolute count

https://gerrit.wikimedia.org/r/396055

Change 475487 had a related patch set uploaded (by Ladsgroup; owner: Awight):
[operations/puppet@production] Refactor ORES uWSGI workers to use an absolute count

https://gerrit.wikimedia.org/r/475487

Change 475487 abandoned by Ladsgroup:
Refactor ORES uWSGI workers to use an absolute count

Reason:
It's already done: https://github.com/wikimedia/puppet/blob/production/modules/ores/manifests/web.pp#L4

https://gerrit.wikimedia.org/r/475487

Ladsgroup triaged this task as Medium priority. Nov 26 2018, 5:19 PM

What I don't understand is this: there won't be any puppet inside the containers, so none of these modules will be applied. Is this really needed? @akosiaris, should we call this invalid?

There will definitely not be ANY puppet in the containers. Nor will puppet be used to build any of the containers.

That being said, of the 4 bullet points in the task description the only one that is kubernetes specific is:

  • Define quantum execution unit

This is a different way of saying (it's my terminology, actually) that we need to figure out what our kubernetes pods are going to be (structure, size, interactions if any, etc.). This is definitely still debatable/debated for the ORES environment. It is being handled in T210268 and T210267.

The other 3 are more about generic puppet hygiene. They are definitely not related to kubernetes and probably not worth pursuing, since all that work will be dropped in the kubernetes environment.

Given that the one item in the task that is worth pursuing is being handled in the 2 tasks mentioned above, I'd say that calling this invalid is correct.