To deploy the draftquality models on LiftWing, we need to upload the model files to storage so that our Inference Services can download the binaries and mount them into the pod.
We are currently using Thanos Swift for production storage and have configured s3cmd to work on the statboxes.
The model_upload script is available on the statboxes and should be used to push our models to the wmf-ml-models bucket, using the path format determined in T280467.
The script can be used as follows:
model_upload model.bz2 <model-type> <model-lang> wmf-ml-models
Once the models are uploaded to storage, we can reference them in our Inference Service specs with the STORAGE_URI environment variable:
- name: STORAGE_URI
  value: "s3://wmf-ml-models/draftquality/enwiki/202107141649/"
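To make the mapping between the upload arguments and the resulting URI concrete, here is a minimal sketch of the destination layout, assuming the T280467 format matches the STORAGE_URI example above (bucket/model-type/model-lang/timestamp/). The timestamp value is just an example; model_upload generates the real one.

```shell
# Sketch of the destination layout implied by the STORAGE_URI example
# (s3://<bucket>/<model-type>/<model-lang>/<timestamp>/); example values only.
bucket="wmf-ml-models"
model_type="draftquality"
model_lang="enwiki"
timestamp="202107141649"   # model_upload generates the real timestamp
uri="s3://${bucket}/${model_type}/${model_lang}/${timestamp}/"
echo "${uri}"
```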
Workflow:
- Go through the draftquality/models directory
- For each model:
  - Determine model-lang and model-type (draftquality for all of these models)
  - Rename the model file to model.bz2
  - Upload the model to storage with model_upload
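The workflow above can be sketched as a small batch script. This is a hypothetical illustration, not the actual tooling: it assumes model filenames begin with the wiki code (e.g. `enwiki.…`), and it stages each file as model.bz2 in a temporary directory rather than renaming in place, so the originals keep their names and successive models do not clobber each other. The `run` parameter is injectable for testing.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def upload_models(models_dir, run=subprocess.run):
    """Stage each model as model.bz2 and push it with model_upload.

    Hypothetical sketch: assumes filenames begin with the wiki code
    (e.g. 'enwiki.…'); `run` is injectable so the loop can be tested
    without the real model_upload script.
    """
    uploaded = []
    for model in sorted(Path(models_dir).iterdir()):
        model_lang = model.name.split(".")[0]      # e.g. 'enwiki'
        with tempfile.TemporaryDirectory() as tmp:
            staged = Path(tmp) / "model.bz2"       # required upload name
            shutil.copy(model, staged)
            run(["model_upload", str(staged), "draftquality",
                 model_lang, "wmf-ml-models"], check=True)
        uploaded.append(model_lang)
    return uploaded
```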