Install certmanager on ml-serve cluster (if needed)
Closed, Declined (Public)

Description

The cert-manager component is used by Kubeflow to handle TLS certificates. We should review whether we need it and, if so, whether it can be integrated with our infrastructure (either the Puppet CA or the new PKI infrastructure).

Event Timeline

Updates:

  • it is still not super clear how/if cert-manager is used/needed, but we can start reviewing https://github.com/kubeflow/kfserving/blob/master/hack/self-signed-ca.sh to see what webhooks are modified and how.
  • Ideally we could manually create TLS certs and provide them as secrets in our k8s cluster, but more research is needed.
  • If cert-manager is needed we'll need to create its Dockerfile for our registry plus import the CRD config to our helm repositories as well.

Change 693826 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/docker-images/production-images@master] Add Jetstack's cert-manager base go images.

https://gerrit.wikimedia.org/r/693826

it is still not super clear how/if cert-manager is used/needed, but we can start reviewing https://github.com/kubeflow/kfserving/blob/master/hack/self-signed-ca.sh to see what webhooks are modified and how.

I took a quick look at this script and I think it should be fairly easy for us to issue an intermediate certificate and use it with cert-manager's CA Issuer.

After some reading and help from the Kfserving community, I think that cert-manager is not needed for our use case. I am going to add my understanding of what we need so that people can chime in and comment :)

The TLS certificates are needed for three use cases:

  1. ingress-gateway - this is something that we can provision manually following what the other k8s services do, namely creating the cert via cergen and deploying it as a cluster secret via helm.
  2. istio sidecar injection - if we wanted to encrypt and authenticate traffic from kfserving pods, we could issue certs via cert-manager and inject them into istio so that the envoy sidecars could use them. This does not apply to us, since we are not going to use istio sidecars (no cross-dc traffic).
  3. kfserving webhook TLS encryption - After reading this link I finally understood what upstream told me several times on Slack, and what self-signed-ca.sh does behind the scenes. The kfserving webhook server is an HTTPS endpoint implementing a REST API that the k8s API calls on certain occasions, when it needs extra validation steps from kfserving's logic. Since the k8s API is HTTPS-only, it requires every API that it calls to follow the same rule. To trust the kfserving webhook HTTPS endpoint, the k8s API needs to know which CA to trust. The self-signed-ca.sh script a) creates a self-signed CA and b) creates a TLS certificate for the webhook server endpoint, signed by that CA. It then deploys both to k8s, overriding the previous config (see quick_install.sh).
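To make point 3 more concrete, here is a rough Python sketch of the "deploy both to k8s" step, assuming the CA and the webhook cert/key already exist as PEM text (for example issued by cergen or cfssl). It only builds the two objects involved: a kubernetes.io/tls Secret for the webhook server, and a JSON Patch that sets clientConfig.caBundle on each webhook entry so the k8s API knows which CA to trust. The secret name, the namespace, and the placeholder PEM strings are hypothetical, and this is not a transcription of what self-signed-ca.sh actually runs.

```python
import base64
import json


def b64(pem: str) -> str:
    """Base64-encode PEM text, as Kubernetes expects for Secret data and caBundle."""
    return base64.b64encode(pem.encode("ascii")).decode("ascii")


def tls_secret(name: str, namespace: str, cert_pem: str, key_pem: str) -> dict:
    """Build a kubernetes.io/tls Secret holding the webhook server cert and key."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "type": "kubernetes.io/tls",
        "metadata": {"name": name, "namespace": namespace},
        "data": {"tls.crt": b64(cert_pem), "tls.key": b64(key_pem)},
    }


def ca_bundle_patch(webhook_count: int, ca_pem: str) -> list:
    """Build JSON Patch operations setting caBundle on each webhook entry.

    "add" is used rather than "replace" so the patch also works when the
    caBundle field is not present yet.
    """
    return [
        {
            "op": "add",
            "path": f"/webhooks/{i}/clientConfig/caBundle",
            "value": b64(ca_pem),
        }
        for i in range(webhook_count)
    ]


if __name__ == "__main__":
    # Placeholder PEM strings and hypothetical names, for illustration only.
    secret = tls_secret(
        "example-webhook-server-cert", "example-namespace",
        "-----BEGIN CERTIFICATE-----\n...", "-----BEGIN PRIVATE KEY-----\n...",
    )
    print(json.dumps(secret, indent=2))
    print(json.dumps(ca_bundle_patch(1, "-----BEGIN CERTIFICATE-----\n..."), indent=2))
```

The Secret could be rendered by a helm chart instead of being built in code; the point is only that the webhook server needs the cert and key, while the webhook configuration needs the CA that signed them.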

We care about 1) and 3), so cert-manager can be avoided in favor of a simpler in-house solution. I had a chat with John and we could create some automation that creates/renews certs in cfssl, and maybe deploy the automation on our k8s master nodes. The self-signed CA trick could be used as well for 3), but then we'd need to store the self-signed CA and certs somewhere (like puppet private). Both can be done easily; it is just a matter of figuring out what we prefer to do. In any case, I'd avoid deploying cert-manager entirely (that allows us to skip a 7k+ line yaml config file plus the docker images).
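As a very rough idea of what the create/renew automation could look like, here is a Python sketch with two pieces: a check that decides whether a cert is close enough to expiry to be renewed, and a helper that renders a CSR request in the JSON layout that cfssl reads (CN, hosts, key). The 30-day renewal window, the ECDSA P-256 key choice, and the hostname used in the example are assumptions made for illustration, not decisions taken in this task.

```python
import json
from datetime import datetime, timedelta, timezone


def needs_renewal(
    not_after: datetime,
    now: datetime,
    window: timedelta = timedelta(days=30),  # assumed renewal window
) -> bool:
    """Return True when the cert expires within `window` of `now`, or already has."""
    return not_after - now <= window


def cfssl_csr(common_name: str, hosts: list) -> str:
    """Render a CSR request as JSON with the CN/hosts/key fields cfssl reads."""
    return json.dumps(
        {
            "CN": common_name,
            "hosts": hosts,
            "key": {"algo": "ecdsa", "size": 256},  # assumed key type
        },
        indent=2,
    )


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    expires = now + timedelta(days=10)  # pretend the current cert expires soon
    if needs_renewal(expires, now):
        # Hypothetical webhook service name, for illustration only.
        name = "example-webhook.example-namespace.svc"
        print(cfssl_csr(name, [name]))
```

The same check would work for both the ingress-gateway cert and the webhook cert; only the hosts list and the place where the resulting secret is deployed would differ.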

Change 693826 merged by JMeybohm:

[operations/docker-images/production-images@master] Add Jetstack's cert-manager (v1.5.4) images

https://gerrit.wikimedia.org/r/693826