
Install KFServing standalone
Open, Needs Triage, Public

Description

We need to install KFServing standalone on our ml-serve k8s cluster.

Requirements:

  • k8s cluster with at least 4 CPUs and 8Gi of memory
  • Istio service mesh
  • Knative Serving (and Eventing if we want transformers/explainers)
  • Cert Manager / LetsEncrypt

Install docs: https://github.com/kubeflow/kfserving#standalone-kfserving-installation
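
Roughly, the standalone install boils down to applying the upstream release manifests for each requirement in order. A sketch is below; all versions, URLs, and paths are placeholders/assumptions, the README above is authoritative:

  # 1. Cert Manager (needed for the KFServing webhook certificates)
  kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.0.4/cert-manager.yaml

  # 2. Istio (service mesh / ingress for Knative)
  istioctl install --set profile=default

  # 3. Knative Serving (CRDs + core) and its Istio networking layer
  kubectl apply -f https://github.com/knative/serving/releases/download/v0.18.0/serving-crds.yaml
  kubectl apply -f https://github.com/knative/serving/releases/download/v0.18.0/serving-core.yaml
  kubectl apply -f https://github.com/knative/net-istio/releases/download/v0.18.0/release.yaml

  # 4. KFServing itself (release manifest from the kubeflow/kfserving repo)
  kubectl apply -f https://raw.githubusercontent.com/kubeflow/kfserving/master/install/v0.5.0/kfserving.yaml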

Event Timeline

Here are my notes from installing KFServing on GKE last quarter: https://etherpad.wikimedia.org/p/kfserving-standalone
TL;DR: there is a weird version mismatch issue between Knative & Istio that can potentially prevent us from using the transformer (pre/post-processing) and explainability features in KFServing.

@calbon mentioned in chat:

KFServing's explainability feature (i.e. an API endpoint for each model that provides some insight into how the model came to a conclusion) is a nice-to-have. However, transformers (i.e. a feature that allows API requests to be pre-processed before being submitted to a model for a prediction) are critical because they are part of the current ORES feature set.

For now, let's just try to follow the directions in the KFServing README and set up Istio to handle cluster-internal traffic so we can use transformers.
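
For reference, a minimal sketch of what a transformer-backed InferenceService could look like. The name, image, and storage path below are made up; v1beta1 is the KFServing 0.5 API (older releases use v1alpha2), and whether the cluster-local label alone is enough for internal-only traffic is an assumption to verify:

  apiVersion: serving.kubeflow.org/v1beta1
  kind: InferenceService
  metadata:
    name: enwiki-damaging                                # hypothetical model name
    labels:
      serving.knative.dev/visibility: cluster-local      # keep traffic cluster-internal (to verify)
  spec:
    transformer:
      containers:
        - name: kfserving-container
          image: example/ores-feature-transformer:latest # hypothetical pre-processing image
    predictor:
      sklearn:
        storageUri: gs://example-bucket/enwiki-damaging/model   # hypothetical model artifact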

You might also find the quick install script interesting; it has a more recent version of Istio & Knative (1.6 & 0.18) than run-e2e-tests.sh (1.3 & 0.17). I'm currently rewriting run-e2e-tests.sh to migrate the tests to a Tekton pipeline and will keep you posted. I'm targeting Istio 1.7 and Knative 0.20. One more thing: the Helm installation method is nice, but both Istio and Knative also have operators for installation nowadays, which can make future upgrades easier.
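
For reference, the operator-based installs mentioned above reduce to applying one custom resource per component. A minimal sketch follows; the version pin is a placeholder, and the operator controllers themselves (or istioctl for the Istio case) have to be installed first:

  # Istio, declared as an IstioOperator resource (consumed by istioctl or the Istio operator controller)
  apiVersion: install.istio.io/v1alpha1
  kind: IstioOperator
  spec:
    profile: default
  ---
  # Knative Serving, declared for the Knative operator
  apiVersion: operator.knative.dev/v1alpha1
  kind: KnativeServing
  metadata:
    name: knative-serving
    namespace: knative-serving
  spec:
    version: "0.20"    # placeholder version pin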

I've started a WIP PR which supports Knative 0.20 and Istio 1.7.1 here.

Thanks @Theofpa! We are considering Helm as it is part of the SRE stack used across the Foundation; however, I can see the operators being very beneficial for long-term use.
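
For the pieces that already ship Helm charts, the install would follow the usual pattern; e.g. cert-manager from the public jetstack repo (chart version left to whatever we end up pinning):

  helm repo add jetstack https://charts.jetstack.io
  helm repo update
  helm install cert-manager jetstack/cert-manager \
    --namespace cert-manager --create-namespace \
    --set installCRDs=true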