Mon, Jun 14
Fri, Jun 11
Wed, Jun 9
Alright, looks like all our current specs have been rewritten to reflect the v1beta1 api and have been confirmed to run well on KFServing 0.5.X.
All new services should use this as well.
Ok, great work @kevinbazira, closing this task. I have captured our next steps in the following tasks:
Tue, Jun 8
@kevinbazira Just adding notes here after digging into all the different model classes (articlequality, drafttopic, etc..) today:
- As you mentioned earlier today, articlequality models only support up to Python 3.7 and also use an older version of revscoring, which means we will need a separate model-server image to load those types of revscoring models.
- The editquality models (damaging/goodfaith/reverted/etc.) seem to run well with our current image, although we should think about loading them all into inference services to see if there are any issues like old revscoring dependencies.
- The drafttopic/articletopic models use additional word embeddings that we would either need to package inside a container, or inject via storage. Let's hold off on migrating these to KFServing for now, as there is talk about the language-agnostic Outlink topic model replacing these types of models.
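If we do end up needing that separate image for articlequality, I imagine it would look roughly like the sketch below. To be clear, the base image, file layout, and entrypoint here are all guesses (not tested), and the exact revscoring pin is still TBD:

```dockerfile
# Rough sketch only: a model-server image pinned to Python 3.7 for the
# older revscoring that articlequality models depend on.
# Placeholder base image until we sort out a WMF-registry Python 3.7 base.
FROM python:3.7-slim
WORKDIR /app
COPY model-server/ /app/
# requirements.txt would pin the older revscoring version these models
# were trained against, plus the KFServing server bits.
RUN pip install --no-cache-dir -r requirements.txt
ENTRYPOINT ["python", "model.py"]
```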
After talking with @elukey, it seems the conversion webhook is not patched by the self-signed-ca.sh script (see T280661) in KFServing v0.5.x. When the KFServing webhook automatically converts from v1alpha2 to v1beta1, the converted service is not patched with the custom CA. The issue is cleared up when the service is deployed using the v1beta1 API directly, which bypasses the conversion webhook.
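For reference, a v1beta1 spec shaped roughly like the ones we migrated to looks like this (service name, image, and storage path below are placeholders, not our actual values):

```yaml
apiVersion: serving.kubeflow.org/v1beta1
kind: InferenceService
metadata:
  name: enwiki-goodfaith          # placeholder name
spec:
  predictor:
    containers:
      - name: kfserving-container
        image: example/revscoring-model-server:latest   # placeholder image
        env:
          - name: STORAGE_URI
            value: s3://wmf-ml-models/goodfaith/enwiki/  # placeholder path
```

Deploying specs like this directly avoids the v1alpha2-to-v1beta1 conversion webhook entirely.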
Fri, Jun 4
Thu, Jun 3
I ran into some issues upgrading the revscoring inference service base image to bullseye (mostly because scipy & numpy still have some issues with Python 3.9), so I went with the WMF buster image instead. Things seem to work well so far.
Wed, Jun 2
@kevinbazira I've been thinking about how to structure our repo with the generic revscoring image and all the service config files. There are still some unknowns around deployment and following the SRE guidelines with respect to our inference services, so it is highly likely the codebase structure will change a bit in the near future.
I was talking with @elukey today and he mentioned that we should begin using base images from the WMF docker registry where we can.
This means the production version of our generic revscoring image should use the WMF Bullseye image instead of Ubuntu (if possible). I will do some testing today using the Bullseye image and will report back.
Tue, Jun 1
Thu, May 27
I uploaded two more models to the public bucket for testing:
I have been able to create two inference services for two models (enwiki.goodfaith and enwiki.damaging) using one generic container.
@kevinbazira this is very exciting! It seems like the generic revscoring image is working well. I have fixed the permissions issue on the gerrit repo and now you should be able to push your code up.
Wed, May 26
@kevinbazira this is great news! glad to hear the generic container approach is working so far. I have uploaded the enwiki-damaging model to our public wmf-ml-models bucket, so you should be able to inject that model into a separate container now. Let me know if you run into any issues.
Tue, May 25
Mon, May 24
Found a really helpful github issue today related to using STORAGE_URI with custom inference services:
It seems mostly related to model servers for various providers, but I have no idea if we need them now or not. Can you shed some light? :D
@elukey -- mostly echoing @Theofpa: I think all we need right now for the MVP is controller & agent from KFServing. The ORES models will be a custom image that we are still finishing and the other model we are working on is the Outlinks topic model, which is another custom image that runs a fastText model. We also might need to do the storage-init as well, but that depends on the outcome of T282802: Implement model storage for enwiki-goodfaith inference service
Going to mark this task as resolved, since we now have three members of the team running the enwiki-goodfaith model as a custom inference service.
May 21 2021
Also confirming that @elukey was able to run a prediction with enwiki-goodfaith on his own minikube instance using some of our own images today (istio etc.).
@kevinbazira excellent work on this! Confirming I am able to use the updated infer.sh script to generate a new session cookie and retrieve a prediction. This is going to save us so much time while developing in the sandbox clusters. Thank you!!
May 20 2021
@kevinbazira awesome! I'm glad you were able to deploy the custom inference service on the sandbox cluster. In response to your thoughts:
May 19 2021
Quick update: I've been doing some testing over the past couple of days and have noticed a timeout issue when testing high-throughput loads (like 50-100 calls per second). I traced it down to where we retrieve all the outlinks via mwapi.Session. After ~100 calls, the outlinks eventually get returned as None: https://github.com/wikimedia/machinelearning-liftwing-inference-services/blob/main/outlink-topic-model/model-server/model.py#L105
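One way we could paper over this while we dig into the root cause is a small retry wrapper around the outlink fetch. A sketch below: `fetch_outlinks` stands in for the mwapi.Session-based call in model.py, and the retry count and backoff values are assumptions, not tuned numbers:

```python
import time

def get_outlinks_with_retry(fetch_outlinks, title, retries=3, backoff=0.5):
    """Call fetch_outlinks(title), retrying when it returns None.

    fetch_outlinks is a stand-in for the mwapi.Session-based call in
    model.py; under heavy load it can start returning None, so we retry
    with a small backoff instead of failing the prediction outright.
    """
    for attempt in range(retries):
        outlinks = fetch_outlinks(title)
        if outlinks is not None:
            return outlinks
        time.sleep(backoff * (attempt + 1))  # simple linear backoff
    raise RuntimeError(
        f"no outlinks returned for {title!r} after {retries} tries"
    )
```

This only masks the symptom, of course; if the API is throttling us under load we probably also want client-side rate limiting.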
@kevinbazira I reviewed your deployed inference service on the KFv1.1 sandbox. So far great progress :)
This is how I am testing our custom KFServing inference services on the KFv1.1 sandbox cluster. The session cookie is required since we are using Dex/Istio authentication. Also you must disable Istio sidecar injection in the CRD yaml in order to reach the service.
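Roughly what the infer.sh flow boils down to, as I understand it (the ingress IP, Host header, model name, and cookie value below are all placeholders for the sandbox values):

```sh
# Hypothetical sketch of hitting a service behind Dex/Istio auth.
CLUSTER_IP=<istio-ingressgateway-ip>
SESSION=<authservice_session cookie obtained from the Dex login flow>
curl -s "http://${CLUSTER_IP}/v1/models/enwiki-goodfaith:predict" \
  -H "Host: enwiki-goodfaith.kubeflow-user.example.com" \
  -H "Cookie: authservice_session=${SESSION}" \
  -d '{"rev_id": 123456}'
```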
May 17 2021
@Legoktm yes I was able to fix this a bit earlier today, forgot to mark this as resolved. Also, can you point me towards some docs for setting up jenkins-bot CI??
Awesome, thank you both! Next I'm going to wire up the storage_uri, but I'll need to set up a custom s3 endpoint. Do either of you know what the Thanos Swift url is? edit: nm I found it https://thanos-swift.discovery.wmnet
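If I'm reading the KFServing docs right, pointing the storage initializer at a custom S3 endpoint is done with an annotated Secret referenced by a ServiceAccount; a rough sketch (all names and credential values are placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: thanos-swift-secret            # placeholder name
  annotations:
    serving.kubeflow.org/s3-endpoint: thanos-swift.discovery.wmnet
    serving.kubeflow.org/s3-usehttps: "1"
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: <access-key>      # placeholder
  AWS_SECRET_ACCESS_KEY: <secret-key>  # placeholder
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: thanos-swift-sa                # placeholder name
secrets:
  - name: thanos-swift-secret
```

The inference service would then reference that ServiceAccount so storage-init can pull the model binary from Swift via its S3 API.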
Spun up a new VM running Kubeflow v1.3:
May 13 2021
May 12 2021
Thanks @Isaac, I see that reflected in the code now, but I didn't have the threshold documented with the other params. I've added a patch for that in gerrit.
May 11 2021
As mentioned in T279004, we have successfully deployed the enwiki-goodfaith model as a custom KFServing inference service based on @kevinbazira's revscoring container image.
From initial testing it seems that we can use the revscoring image as a 'base' image and then inject the model binaries into individual inference services.
May 10 2021
Confirming that the Outlinks topic model can indeed be loaded as a custom KFServing inference service to be used by Lift-Wing.
I was able to package and deploy the model inside our Kubeflow sandbox today.
May 7 2021
@kevinbazira: good news! Your revscoring container is now running on the Kubeflow sandbox! The enwiki-goodfaith model is running as a custom inference service via KFServing using the Dockerfile you created.
I am able to hit the service and retrieve a prediction.
Apr 27 2021
@klausman I think most of our models should be ok with minute-granularity. The only case I can think of where it could be problematic is if we train two slightly different versions of the same model in parallel (and they finish within ~60 secs of each other), but I don't see that happening anytime soon. We could also just limit the pipelines to only running one at a time to make sure we avoid it.
Talked about this with @calbon today and determined that toolforge is probably a better place to direct folks to for deploying community models at this time. This has been captured in T281317
Closing this ticket for now, feel free to reopen if we want to re-evaluate cloudvps.
Apr 23 2021
Thanks @Theofpa, this is really helpful right now as I'm working with a sandbox KF install to test some of our models.