
Move ML docker images to multi-stage build
Closed, ResolvedPublic

Description

In https://gerrit.wikimedia.org/r/c/machinelearning/liftwing/inference-services/+/715918 we moved the editquality docker image to a multi-stage build (see the commit message for details), and its size went from 1.56GB to 1.12GB (-440MB). We should apply the same pattern to the other model types and find out whether we can create a base image to reuse across them to avoid repeating configs (a nice-to-have if it turns out to be possible, not a strict requirement for this task).

Event Timeline

elukey triaged this task as Medium priority.

Nice job @elukey! Confirming that the 2021-09-01-140944-production version of the editquality model server is now ~1.12GB.

I'll try something similar with the articlequality model server and see how it goes.

Change 716629 had a related patch set uploaded (by Accraze; author: Accraze):

[machinelearning/liftwing/inference-services@main] articlequality: move blubber config to multi-stage

https://gerrit.wikimedia.org/r/716629

Change 717083 had a related patch set uploaded (by Kevin Bazira; author: Kevin Bazira):

[machinelearning/liftwing/inference-services@main] refactor draftquality blubber file to reduce image sizes

https://gerrit.wikimedia.org/r/717083

Thank you for proposing this approach @elukey. I have implemented it for the draftquality images, and their sizes have decreased:

IMAGE         OLD       NEW
production    1.38GB    796MB
test          756MB     176MB

Change 717083 merged by jenkins-bot:

[machinelearning/liftwing/inference-services@main] refactor draftquality blubber file to reduce image sizes

https://gerrit.wikimedia.org/r/717083

Change 719164 had a related patch set uploaded (by Kevin Bazira; author: Kevin Bazira):

[machinelearning/liftwing/inference-services@main] refactor topic to reduce image sizes

https://gerrit.wikimedia.org/r/719164

Change 716629 merged by jenkins-bot:

[machinelearning/liftwing/inference-services@main] articlequality: move blubber config to multi-stage

https://gerrit.wikimedia.org/r/716629

Change 719164 merged by Accraze:

[machinelearning/liftwing/inference-services@main] refactor topic to reduce image sizes

https://gerrit.wikimedia.org/r/719164

ACraze claimed this task.

Marking this as RESOLVED since all revscoring production images have been moved to multi-stage builds. Nice job everyone!

editquality: 1.56 GB -> 1.12 GB
articlequality: 1.2 GB -> 800 MB
draftquality: 1.38 GB -> 796 MB
topic: 1.38 GB -> 916 MB
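
For anyone who wants to compare the savings across images, here is a minimal sketch that works out the reduction for each one from the figures above. It assumes 1 GB = 1000 MB, which matches the -440MB quoted for editquality in the description; the names `SIZES_MB`, `savings`, and `report` are made up for this example and are not part of the inference-services repo.

```python
# Image sizes in MB as (old, new), copied from the summary above.
# Assumption: 1 GB = 1000 MB, consistent with the "-440MB" in the description.
SIZES_MB = {
    "editquality": (1560, 1120),
    "articlequality": (1200, 800),
    "draftquality": (1380, 796),
    "topic": (1380, 916),
}


def savings(old_mb, new_mb):
    """Return (MB saved, percent saved rounded to one decimal place)."""
    saved = old_mb - new_mb
    return saved, round(100 * saved / old_mb, 1)


def report(sizes=SIZES_MB):
    """Build one human-readable line per image."""
    lines = []
    for name, (old, new) in sizes.items():
        saved, pct = savings(old, new)
        lines.append(f"{name}: -{saved} MB ({pct}%)")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

The percentages are relative to the old image size, so they are only as precise as the rounded figures reported in this task.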