User story: As a product owner, I want the machine learning model that supports my product/feature to be hosted on Lift Wing.
Description
Status | Subtype | Assigned | Task
---|---|---|---
Open | | kevinbazira | T348156 Goal: Increase the number of models hosted on Lift Wing
Open | | None | T348850 Establish a standard load testing procedure
Open | | None | T351939 Document load test results
Open | | isarantopoulos | T355394 Investigate way of comparing load test results
Event Timeline
Update: Language ID model deployed. Diff blog posted. Recommendation API work continues.
1. Isaac from the research team tested the deployed rec-api and shared 2 edge cases:
- the rec-api wasn't returning results besides the 'spec' param. We investigated this in T347475, found envoy proxy constraints, and fixed them in T348607.
- the rec-api was returning empty results when a query was made without the 'seed' param. We discovered that the pageviews envoy settings weren't correct and fixed them. We later updated them based on the wp-analytics team's notice in T348607#9283681.
2. Started working on migrating the machine-generated article descriptions model from Toolforge to Lift Wing.
Working on migrating the machine-generated article descriptions model from Toolforge to Lift Wing:
- added the article-descriptions model-server to the Lift Wing inference services repo.
- added CI pipeline jobs to test and publish the article-descriptions model-server image to the Wikimedia docker registry.
- uploaded the article-descriptions model files to Swift under the mbart-large-cc25 and bert-base-multilingual-uncased paths.
- added the article-descriptions inference service to the experimental namespace on Lift Wing.
- fixed the model-server to pass the local_files_only parameter, so the pretrained PyTorch tokenizer is instantiated from local files without downloading from huggingface.co.
- in T351940#9359437, fixed the AsyncSession host header issue reported in T351940#9358303.
- currently fixing access to the Wikipedia API summary endpoint, as we have to use a k8s internal endpoint to reach it.
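The local_files_only fix listed above can be sketched roughly as follows. This is a minimal sketch, not the model-server's actual code: the helper name `load_local_tokenizer` is hypothetical, and it assumes the Hugging Face `transformers` package is installed and that the tokenizer files have already been copied to a directory on local disk.

```python
from pathlib import Path


def load_local_tokenizer(model_dir: str):
    """Load a pretrained tokenizer strictly from files on local disk.

    `load_local_tokenizer` is a hypothetical helper used for illustration.
    """
    path = Path(model_dir)
    if not path.is_dir():
        # Fail early with a clear error if the model files are missing.
        raise FileNotFoundError(f"model directory not found: {path}")

    # Imported lazily so the path check above also works in environments
    # where transformers is not installed.
    from transformers import AutoTokenizer

    # local_files_only=True tells from_pretrained to use only local files
    # and not to download anything from huggingface.co.
    return AutoTokenizer.from_pretrained(str(path), local_files_only=True)
```

Passing a directory path together with `local_files_only=True` keeps the pod from depending on outbound access to huggingface.co at startup.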
Working on migrating the machine-generated article descriptions model from Toolforge to Lift Wing:
- fixed access to the Wikipedia API summary endpoint. The model-server now reaches it through the k8s internal endpoint, e.g. http://rest-gateway.discovery.wmnet:4111/en.wikipedia.org/v1/page/summary/Clandonald.
- built an image, tested the article-descriptions local-run setup, then reviewed its patch.
- ran load tests for the article-descriptions isvc currently hosted in the experimental namespace on Lift Wing.
- asked Isaac and Seddon to test the model-server before we move it out of the experimental namespace. So far, a prediction bug and latency issues have been reported.
- investigating the prediction bug in the article-descriptions model-server.
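For reference, the internal summary URL mentioned in this update could be assembled as in the sketch below. The gateway host, port, and path layout are copied from the example URL in the update. The function name `summary_url` is hypothetical, and the assumption that other wikis and titles follow the same path pattern has not been verified against the model-server code.

```python
from urllib.parse import quote

# Host and port taken from the example URL in the update.
REST_GATEWAY = "http://rest-gateway.discovery.wmnet:4111"


def summary_url(wiki_domain: str, title: str) -> str:
    """Build the k8s-internal page summary URL (hypothetical helper)."""
    # MediaWiki titles use underscores in place of spaces.
    normalized = title.replace(" ", "_")
    # Percent-encode the title, including "/", so it stays one path segment.
    encoded = quote(normalized, safe="")
    return f"{REST_GATEWAY}/{wiki_domain}/v1/page/summary/{encoded}"


print(summary_url("en.wikipedia.org", "Clandonald"))
```

With the arguments shown, the call reproduces the URL quoted in the update.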