
Reduce number of published docker images for revscoring models
Closed, Resolved (Public)

Description

At the moment we publish 6 Docker images every time we make a new release.
Since the recent refactoring, each revscoring model is an instance of the RevscoringModel class.
This could allow us to ship a single image (with a merged requirements.txt) that contains all the models, with each deployment running a different command for its corresponding model.
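
A minimal sketch of the "one image, different command per deployment" idea, for illustration only. Everything in it is an assumption rather than the actual inference-services code: the `RevscoringModel` stand-in below is a simplified placeholder for the real class, and the `--model-kind` / `--model-name` flags, the `build_model` helper, and the listed model kinds are hypothetical names chosen for this example.

```python
import argparse
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RevscoringModel:
    """Simplified, hypothetical stand-in for the real RevscoringModel class."""

    name: str
    model_kind: str

    def start(self) -> str:
        # A real server would load the model binary and start serving here;
        # this sketch only reports what it would do.
        return f"serving {self.name} as {self.model_kind}"


# Illustrative list of model kinds bundled in the shared image (assumption).
SUPPORTED_KINDS = ("articlequality", "draftquality", "editquality")


def build_model(argv: Sequence[str]) -> RevscoringModel:
    """Pick the model to serve from the command the deployment passes in."""
    parser = argparse.ArgumentParser(description="Shared revscoring entrypoint")
    parser.add_argument("--model-kind", required=True, choices=SUPPORTED_KINDS)
    parser.add_argument("--model-name", required=True)
    args = parser.parse_args(argv)
    return RevscoringModel(name=args.model_name, model_kind=args.model_kind)


# Each deployment would run the same image with a different argument list.
model = build_model(["--model-kind", "editquality", "--model-name", "enwiki-goodfaith"])
print(model.start())
```

With this shape, the image is built once and only the command arguments differ between deployments, so a new model kind would not require a new image.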

Event Timeline

Change 865670 had a related patch set uploaded (by Ilias Sarantopoulos; author: Ilias Sarantopoulos):

[machinelearning/liftwing/inference-services@main] blubber: create universal revscoring image

https://gerrit.wikimedia.org/r/865670

Change 866560 had a related patch set uploaded (by Ilias Sarantopoulos; author: Ilias Sarantopoulos):

[integration/config@master] inference-services: add revscoring pipelines

https://gerrit.wikimedia.org/r/866560

Change 866570 had a related patch set uploaded (by Ilias Sarantopoulos; author: Ilias Sarantopoulos):

[integration/config@master] inference-services: add revscoring pipelines

https://gerrit.wikimedia.org/r/866570

Change 866571 had a related patch set uploaded (by Ilias Sarantopoulos; author: Ilias Sarantopoulos):

[integration/config@master] inference-services: add revscoring pipelines

https://gerrit.wikimedia.org/r/866571

Change 866560 abandoned by Ilias Sarantopoulos:

[integration/config@master] inference-services: add revscoring pipelines

Reason:

duplicate

https://gerrit.wikimedia.org/r/866560

Change 866571 abandoned by Ilias Sarantopoulos:

[integration/config@master] inference-services: add revscoring pipelines

Reason:

duplicate

https://gerrit.wikimedia.org/r/866571

Change 866591 had a related patch set uploaded (by Ilias Sarantopoulos; author: Ilias Sarantopoulos):

[machinelearning/liftwing/inference-services@main] blubber: create universal revscoring image

https://gerrit.wikimedia.org/r/866591

Change 866591 abandoned by Ilias Sarantopoulos:

[machinelearning/liftwing/inference-services@main] blubber: create universal revscoring image

Reason:

duplicate

https://gerrit.wikimedia.org/r/866591

So far I have created a single image for all revscoring models and managed to run inference through it. We now build one image of approx. 1.5GB instead of 4 images, which should speed up our CI/CD process and simplify it a bit.
The patch contains a lot of changes, so it requires extensive QA on our side.
Remaining tasks:

  • merge the patch in the integration/config repo for the new deployment pipeline
  • update deployment charts to use the same image for all revscoring models

My suggestion to proceed would be the following:

  • introduce the new image, then deploy and test it wherever we want
  • deprecate the old files and pipelines.

Change 866570 merged by jenkins-bot:

[integration/config@master] inference-services: add revscoring pipelines

https://gerrit.wikimedia.org/r/866570

Change 865670 merged by jenkins-bot:

[machinelearning/liftwing/inference-services@main] blubber: create universal revscoring image

https://gerrit.wikimedia.org/r/865670

Change 868407 had a related patch set uploaded (by Ilias Sarantopoulos; author: Ilias Sarantopoulos):

[operations/deployment-charts@master] ml-services: use the same image for revscoring models

https://gerrit.wikimedia.org/r/868407

Change 868407 merged by Elukey:

[operations/deployment-charts@master] ml-services: use the same image for revscoring models

https://gerrit.wikimedia.org/r/868407

Change 868433 had a related patch set uploaded (by Ilias Sarantopoulos; author: Ilias Sarantopoulos):

[machinelearning/liftwing/inference-services@main] revscoring: delete individual revscoring images

https://gerrit.wikimedia.org/r/868433

Change 868437 had a related patch set uploaded (by Ilias Sarantopoulos; author: Ilias Sarantopoulos):

[integration/config@master] inference-services: remove old revscoring pipelines

https://gerrit.wikimedia.org/r/868437

Change 868437 merged by jenkins-bot:

[integration/config@master] inference-services: remove old revscoring pipelines

https://gerrit.wikimedia.org/r/868437

Mentioned in SAL (#wikimedia-releng) [2022-12-15T16:51:41Z] <hashar> Reloading Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/868437/ "inference-services: remove old revscoring pipelines" | T323586

Change 868433 merged by jenkins-bot:

[machinelearning/liftwing/inference-services@main] revscoring: delete individual revscoring images

https://gerrit.wikimedia.org/r/868433