Match model-server dockerfiles with blubber files
Closed, Declined (Public)

Description

Previously, the model-server dockerfiles were turnkey. While working on the blubber files, we kept changing items that the dockerfiles depend on, for example the requirements.txt file. As a result, the dockerfiles currently do not run without a few adjustments.

We should update dockerfiles to match blubber files so that they can run right out of the box.

Event Timeline

@kevinbazira - I think you made a great point earlier today: it will be easier for community members to contribute to and/or evaluate our model servers using the Dockerfiles than to generate one using Blubber. The only potential downside I see is the additional work if we end up with a much larger number of model-servers in the future. I'm not too concerned about this in the near-to-medium term, as I don't think the model-servers will change much once they have been deployed to production.

TLDR: Good call on keeping the dockerfiles around. I don't think it will be too much work to keep them in sync with the blubber files for now. We can reevaluate once we have a larger number of model classes.

Partially related, but adding a note here as well: the SRE team suggested that we may want to factor out a base image for ores/revscoring/etc. in the production-images repository (that is where SRE keeps all the system-related and base images that we use). The repo is managed via Dockerfiles plus the templating offered by https://doc.wikimedia.org/docker-pkg, and a shared base image may help us keep the blubber files smaller. Happy to discuss my experience with production-images during our weekly meeting so we can decide what's best!

Adding another thought - it would be great if CI alerted us when the Dockerfiles are not in sync with the blubber specs. If the community starts using the Dockerfiles directly, this is probably going to happen:

  1. We update the blubber file and forget to sync the Dockerfile
  2. A user from the community reports a problem running the Dockerfile
  3. We update the Dockerfile

Maybe a simple bash script stored in the repo that invokes blubber to generate the latest version of the Dockerfiles for all directories could be enough?

Was thinking about this a bit yesterday. I really like the idea of having CI alert us when things are not in sync. I'm wondering whether this should be an additional stage within each of our pipelines, or rather its own pipeline that runs whenever any blubber file is updated. I think a bash script would make sense; we could just use diff or cmp to check whether the generated and committed files match.
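
To make the sync check discussed above a bit more concrete, here is a minimal sketch in Python rather than bash (the logic would be the same with diff or cmp). It is not an existing script in the repo. It assumes that `blubber <config> <variant>` prints the generated Dockerfile to stdout, and the function names and the (blubber file, variant, Dockerfile) layout are hypothetical. The rendering step is passed in as a function so the comparison can be tried out without blubber installed.

```python
import difflib
import subprocess
from pathlib import Path


def render_with_blubber(blubber_file, variant):
    """Render a Dockerfile with the blubber CLI.

    Assumption: `blubber <config> <variant>` writes the Dockerfile to stdout.
    """
    result = subprocess.run(
        ["blubber", str(blubber_file), variant],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def find_out_of_sync(pairs, render=render_with_blubber):
    """Return (dockerfile_path, diff_text) for every stale Dockerfile.

    `pairs` is an iterable of (blubber_file, variant, dockerfile) tuples;
    this layout is hypothetical and would need to match the repo.
    """
    stale = []
    for blubber_file, variant, dockerfile in pairs:
        expected = render(Path(blubber_file), variant).splitlines()
        actual = Path(dockerfile).read_text().splitlines()
        # Compare the committed Dockerfile against the freshly rendered one.
        diff = list(
            difflib.unified_diff(
                actual,
                expected,
                fromfile=str(dockerfile),
                tofile=f"blubber:{variant}",
                lineterm="",
            )
        )
        if diff:
            stale.append((Path(dockerfile), "\n".join(diff)))
    return stale
```

A CI stage could call `find_out_of_sync` and exit non-zero when the returned list is not empty, printing each diff so it is clear what needs to be synced. If blubber's output contains lines that legitimately vary between runs, those would need to be filtered out before comparing.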

With the advent of a new syntax directive that enables users to run blubber files using the docker build command, we will not need to update dockerfiles to match blubber files.

We experimented with the new syntax directive and added it to all blubber files in T322006.
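
Since the directive has to be present in every blubber file for `docker build` to work on it, a small check like the one below could catch a file that is missing it. This is only a sketch: it assumes the directive is a `# syntax=<image>` comment on the first line of the file (the usual form of a Docker parser directive), and the function name is hypothetical.

```python
import re
from pathlib import Path

# Assumption: the directive is written as a `# syntax=<image>` comment
# on the very first line of the blubber file.
SYNTAX_DIRECTIVE = re.compile(r"^#\s*syntax\s*=\s*\S+", re.IGNORECASE)


def missing_syntax_directive(paths):
    """Return the files whose first line is not a syntax directive."""
    missing = []
    for path in paths:
        lines = Path(path).read_text().splitlines()
        first_line = lines[0] if lines else ""
        if not SYNTAX_DIRECTIVE.match(first_line):
            missing.append(Path(path))
    return missing
```

Running this over all blubber files in the repo returns an empty list when every file carries the directive.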