Automate issuing of TLS certificates in kubernetes clusters
Closed, Resolved, Public

Description

With more automation and self-service coming via Istio-Ingressgateway (T209066), we should also improve how we issue and deploy TLS certificates to services running on our Kubernetes clusters.

The current process is described here and involves quite a few manual steps. It also requires an SRE, as root access is needed.

With the cfssl-based PKI now available, we should aim for an integration with cert-manager, the de facto standard in the Kubernetes world.
ML also took a look at cert-manager (T280661) and decided not to use it for now (but they issue far fewer certificates than we do).

There are basically two ways we could integrate cert-manager:

1. CA issuer

https://cert-manager.io/docs/configuration/ca/

This is part of the standard implementation of cert-manager and would require us to create an intermediate (dedicated to Kubernetes clusters) using our PKI and provide that to the Kubernetes clusters. The clusters' cert-manager instances could then issue certificates based on that.
The obvious downside is that we would create a kind of "split brain CA", as each Kubernetes cluster would issue certificates with the same intermediate without knowing about the others. We would also have to manage the intermediate ourselves (renewal etc.).

2. CFSSL (API) issuer

cert-manager supports external Issuers, and one of those could be used to have certificates issued directly via pki.discovery.wmnet.

This would allow us to rely on the PKI infrastructure and have certificates issued with the "discovery" intermediate that is already managed there. It would also retain a single source of truth regarding which certificates have been issued and are valid, and relieve us of the burden of managing an intermediate.

While this seems more like "the right way" to do it, there is currently only one implementation (I could find) of a cfssl Issuer: https://github.com/OpenSource-THG/cfssl-issuer

2.1 OpenSource-THG/cfssl-issuer

After talking to @jbond about this, it seems we're okay with calling the CFSSL API of our PKI directly, with some sort of authentication applied. I then took a closer look at OpenSource-THG/cfssl-issuer to verify whether it could work for us.
Unfortunately the issuer does not support any kind of authentication towards the CFSSL API, so I decided to hack that in for an initial test.
That work revealed that the issuer is not in ideal shape: it does not seem to follow the standards of cert-manager (anymore?) and there is a lot of duplicate code. While it was able to issue the right API calls and receive the certificate from CFSSL, I wasn't able to make it actually reconcile the Certificate/Secret objects in Kubernetes correctly. I ultimately stopped debugging it to write this task first.
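
For illustration, here is a minimal sketch of what adding authentication to a sign call could look like. It assumes the PKI uses CFSSL's standard HMAC auth provider (hex-encoded shared key, HMAC-SHA256 over the inner request, envelope posted to the authsign endpoint). The function name and the sample key are made up for this example; this is not the code that was hacked into the issuer.

```python
import base64
import hashlib
import hmac
import json


def build_authsign_payload(csr_pem: str, hex_key: str, profile: str = "") -> dict:
    """Wrap a CFSSL sign request in an HMAC-authenticated envelope.

    Assumption: the server uses CFSSL's "standard" auth provider, where the
    shared key is a hex string and the token is HMAC-SHA256 over the raw
    bytes of the inner JSON request.
    """
    # Inner request, as accepted by the CFSSL sign endpoints.
    inner = {"certificate_request": csr_pem}
    if profile:
        inner["profile"] = profile
    inner_bytes = json.dumps(inner).encode("utf-8")

    # Decode the hex key and compute the token over the exact request bytes.
    key = bytes.fromhex(hex_key)
    token = hmac.new(key, inner_bytes, hashlib.sha256).digest()

    # Both fields are sent base64-encoded inside the outer JSON document.
    return {
        "token": base64.b64encode(token).decode("ascii"),
        "request": base64.b64encode(inner_bytes).decode("ascii"),
    }


if __name__ == "__main__":
    # Hypothetical key and CSR placeholder, for demonstration only.
    payload = build_authsign_payload("<CSR PEM here>", "0123456789abcdef", "server")
    print(json.dumps(payload, indent=2))
```

The resulting document would be sent as the POST body to `/api/v1/cfssl/authsign`. The token must be computed over the same bytes that end up in `request`, which is why the inner request is serialized only once.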

To continue with this, we could:

  • Try to update/fix the OpenSource-THG/cfssl-issuer (maybe with help of the initial developers, although the project does not seem very active)
  • Start our own implementation from scratch. Might sound weird at first, but cert-manager provides kind of an SDK/scaffold around this which is regularly updated and our use case (calling an external API) is not that complex after all.

2.2 own cfssl-issuer implementation

I went ahead and created our own cfssl-issuer implementation, because it is not very hard to do and cleaning up and fixing the existing codebase seemed harder.
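
To give an idea of why the core of such an issuer is small, here is a rough sketch of the signing step: take the CSR from a certificate request, send it to the CFSSL sign endpoint, and return the certificate or raise on error. It assumes the usual CFSSL response envelope (`success`, `result`, `errors`). The `post` callable is a hypothetical stand-in for the HTTP client, and `sign_csr` is an illustrative name; the actual implementation is written against cert-manager's issuer scaffold and lives in operations/software/cfssl-issuer.

```python
import json
from typing import Callable


def sign_csr(csr_pem: str, post: Callable[[str, bytes], bytes], profile: str = "") -> str:
    """Send a CSR to a CFSSL sign endpoint and return the signed certificate.

    `post(path, body)` is an injected transport (hypothetical) that performs
    the HTTP POST and returns the raw response body.
    """
    body = {"certificate_request": csr_pem}
    if profile:
        body["profile"] = profile

    raw = post("/api/v1/cfssl/sign", json.dumps(body).encode("utf-8"))
    reply = json.loads(raw)

    # CFSSL wraps every reply in an envelope with a success flag and an
    # error list; surface the error messages instead of a bare failure.
    if not reply.get("success"):
        errors = "; ".join(e.get("message", "") for e in reply.get("errors") or [])
        raise RuntimeError(f"CFSSL signing failed: {errors}")

    return reply["result"]["certificate"]


if __name__ == "__main__":
    # Fake transport so the sketch runs without a CFSSL server.
    def fake_post(path: str, body: bytes) -> bytes:
        return json.dumps(
            {"success": True, "result": {"certificate": "<PEM>"}, "errors": [], "messages": []}
        ).encode("utf-8")

    print(sign_csr("<CSR PEM here>", fake_post))
```

Everything around this step (watching CertificateRequest objects, checking approval, writing the result back to the status) is what the cert-manager scaffold is meant to help with.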

Follow up things to do:

  • Build cert-manager docker images (v1.5.4, last one compatible/tested with k8s v1.16)
  • Import cert-manager helm chart
  • Build cfssl-issuer docker images
  • Write cfssl-issuer helm chart
  • Write admin_ng helmfile to install cert-manager & cfssl-issuer
  • Come up with a proper idea of how to provision the certificate objects in k8s (as the resulting secrets need to be in istio-system namespace) T295385
  • Write some docs https://wikitech.wikimedia.org/wiki/Kubernetes/cert-manager
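
Regarding the item about provisioning certificate objects: the following sketch shows roughly what a cert-manager Certificate referencing an external issuer could look like, built as a plain dictionary. The `cert-manager.io/v1` fields are standard; the issuer name, the issuer API group, the hostname and the secret naming scheme are placeholders invented for this example and not the values used in production.

```python
import json
from typing import Iterable


def certificate_manifest(
    name: str,
    dns_names: Iterable[str],
    issuer_name: str,
    issuer_group: str,
    namespace: str = "istio-system",
) -> dict:
    """Build a cert-manager Certificate object that points at an external issuer."""
    return {
        "apiVersion": "cert-manager.io/v1",
        "kind": "Certificate",
        # The resulting Secret is created in the same namespace as the
        # Certificate, hence istio-system as the default here.
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "secretName": f"{name}-tls-certificate",  # hypothetical naming scheme
            "dnsNames": list(dns_names),
            "issuerRef": {
                "name": issuer_name,
                "kind": "ClusterIssuer",
                # External issuers are selected via their own API group.
                "group": issuer_group,
            },
        },
    }


if __name__ == "__main__":
    manifest = certificate_manifest(
        name="example-service",
        dns_names=["example-service.discovery.wmnet"],  # placeholder hostname
        issuer_name="discovery",                        # placeholder issuer name
        issuer_group="cfssl-issuer.example.org",        # placeholder API group
    )
    print(json.dumps(manifest, indent=2))
```

Since JSON is valid YAML, the printed output could be applied with kubectl for a quick test in a staging cluster.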

Details

Repo                                        Branch      Lines +/-
operations/deployment-charts                master      +1 -0
operations/deployment-charts                master      +52 -26
operations/docker-images/production-images  master      +8 -1
operations/software/cfssl-issuer            main        +8 K -66
operations/software/cfssl-issuer            main        +1 -1
operations/deployment-charts                master      +45 -2
operations/software/cfssl-issuer            main        +12 -16
operations/software/cfssl-issuer            main        +16 -5
operations/deployment-charts                master      +22 -1
operations/deployment-charts                master      +23 -0
operations/deployment-charts                master      +2 -2
operations/deployment-charts                master      +40 -17
operations/deployment-charts                master      +11 -10
operations/deployment-charts                master      +1 -3
operations/deployment-charts                master      +2 -2
operations/deployment-charts                master      +62 -2
operations/deployment-charts                master      +32 -6
operations/deployment-charts                master      +3 -3
operations/deployment-charts                master      +3 -0
operations/docker-images/production-images  master      +27 -3
operations/deployment-charts                master      +1 -1
operations/deployment-charts                master      +96 -0
operations/docker-images/production-images  master      +1 -1
operations/docker-images/production-images  master      +26 -0
operations/software/cfssl-issuer            main        +251 -8
operations/software/cfssl-issuer            main        +1 K -969
operations/deployment-charts                master      +84 -2
operations/deployment-charts                master      +596 -0
operations/docker-images/production-images  master      +3 -3
operations/puppet                           production  +18 -1
operations/deployment-charts                master      +18 K -0
operations/software/cfssl-issuer            main        +230 -299
operations/docker-images/production-images  master      +81 -1

Event Timeline

Change 736807 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/software/cfssl-issuer@main] Rename everything to cfssl-issuer, ensure e2e completed

https://gerrit.wikimedia.org/r/736807

Change 736808 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/software/cfssl-issuer@main] Implement CFSSL API signer

https://gerrit.wikimedia.org/r/736808

Change 736809 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/software/cfssl-issuer@main] Add simple-cfssl image for development and e2e tests

https://gerrit.wikimedia.org/r/736809

Change 693826 had a related patch set uploaded (by JMeybohm; author: Elukey):

[operations/docker-images/production-images@master] Add Jetstack's cert-manager (v1.5.4) images

https://gerrit.wikimedia.org/r/693826

Change 737167 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] Import chart cert-manager v1.5.4

https://gerrit.wikimedia.org/r/737167

Change 737169 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] Add cfssl-issuer and cfssl-issuer-crds chart

https://gerrit.wikimedia.org/r/737169

Change 693826 merged by JMeybohm:

[operations/docker-images/production-images@master] Add Jetstack's cert-manager (v1.5.4) images

https://gerrit.wikimedia.org/r/693826

Change 737329 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/docker-images/production-images@master] Add cfss-issuer docker image

https://gerrit.wikimedia.org/r/737329

Change 737335 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] profile::docker::builder: Add building cert-manager images

https://gerrit.wikimedia.org/r/737335

Change 736807 merged by JMeybohm:

[operations/software/cfssl-issuer@main] Rename everything to cfssl-issuer, ensure e2e completed

https://gerrit.wikimedia.org/r/736807

Change 737167 merged by jenkins-bot:

[operations/deployment-charts@master] Import chart cert-manager v1.5.4

https://gerrit.wikimedia.org/r/737167

Change 737335 merged by JMeybohm:

[operations/puppet@production] profile::docker::builder: Add building cert-manager images

https://gerrit.wikimedia.org/r/737335

Change 737412 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/docker-images/production-images@master] Fix cert-manager build image name

https://gerrit.wikimedia.org/r/737412

Change 737412 merged by JMeybohm:

[operations/docker-images/production-images@master] Fix cert-manager build image name

https://gerrit.wikimedia.org/r/737412

Change 737169 merged by jenkins-bot:

[operations/deployment-charts@master] Add cfssl-issuer and cfssl-issuer-crds chart

https://gerrit.wikimedia.org/r/737169

Change 737939 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Add helmfile for cert-manager and cfssl-issuer

https://gerrit.wikimedia.org/r/737939

Change 738182 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] cfssl-issuer: Allow to install issuers via chart

https://gerrit.wikimedia.org/r/738182

Change 738182 merged by jenkins-bot:

[operations/deployment-charts@master] cfssl-issuer: Allow to install issuers via chart

https://gerrit.wikimedia.org/r/738182

Change 736809 merged by JMeybohm:

[operations/software/cfssl-issuer@main] Add simple-cfssl image for development and e2e tests

https://gerrit.wikimedia.org/r/736809

Change 736808 merged by JMeybohm:

[operations/software/cfssl-issuer@main] Implement CFSSL API signer

https://gerrit.wikimedia.org/r/736808

Change 737329 merged by JMeybohm:

[operations/docker-images/production-images@master] Add cfssl-issuer docker image

https://gerrit.wikimedia.org/r/737329

Change 738368 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/docker-images/production-images@master] cfssl-issuer: Fix path typo in dockerfile

https://gerrit.wikimedia.org/r/738368

Change 738368 merged by JMeybohm:

[operations/docker-images/production-images@master] cfssl-issuer: Fix path typo in dockerfile

https://gerrit.wikimedia.org/r/738368

Change 737939 merged by JMeybohm:

[operations/deployment-charts@master] admin_ng: Add helmfile for cert-manager and cfssl-issuer

https://gerrit.wikimedia.org/r/737939

Change 745757 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] Deploy cert-manager after namespace creation

https://gerrit.wikimedia.org/r/745757

Change 745757 merged by JMeybohm:

[operations/deployment-charts@master] Deploy cert-manager after namespace creation

https://gerrit.wikimedia.org/r/745757

Change 745760 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Explicitly set dependency to namespaces in cert-manager

https://gerrit.wikimedia.org/r/745760

Change 745761 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/docker-images/production-images@master] cert-manager: Use numeric UID for nobody

https://gerrit.wikimedia.org/r/745761

Change 745763 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Bump cert-manager image version

https://gerrit.wikimedia.org/r/745763

Change 745761 merged by JMeybohm:

[operations/docker-images/production-images@master] cert-manager: Use numeric UID for nobody

https://gerrit.wikimedia.org/r/745761

Change 745760 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: Explicitly set dependency to namespaces in cert-manager

https://gerrit.wikimedia.org/r/745760

Change 745763 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: Bump cert-manager image version

https://gerrit.wikimedia.org/r/745763

Change 745893 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Add cert-manager networkpolicies

https://gerrit.wikimedia.org/r/745893

Change 745894 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] cert-manager/cfssl allow override of KUBERNETES_SERVICE envs

https://gerrit.wikimedia.org/r/745894

Change 745893 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: Add cert-manager networkpolicies

https://gerrit.wikimedia.org/r/745893

Change 745894 merged by jenkins-bot:

[operations/deployment-charts@master] cert-manager/cfssl allow override of KUBERNETES_SERVICE envs

https://gerrit.wikimedia.org/r/745894

Change 745907 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] cfssl-issuer: Fix image tag

https://gerrit.wikimedia.org/r/745907

Change 745907 merged by JMeybohm:

[operations/deployment-charts@master] cfssl-issuer: Fix image tag

https://gerrit.wikimedia.org/r/745907

Change 745912 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] cfssl-issuer: Rely on docker entrypoint rather than command in chart

https://gerrit.wikimedia.org/r/745912

Change 745912 merged by JMeybohm:

[operations/deployment-charts@master] cfssl-issuer: Rely on docker entrypoint rather than command in chart

https://gerrit.wikimedia.org/r/745912

Change 745923 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] cfssl-issuer: Fix metrics listen address and probes

https://gerrit.wikimedia.org/r/745923

Change 745923 merged by jenkins-bot:

[operations/deployment-charts@master] cfssl-issuer: Update to new cfss-issuer version

https://gerrit.wikimedia.org/r/745923

Change 746874 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] Allow cfssl-issuer to connect to pki.discovery.wmnet

https://gerrit.wikimedia.org/r/746874

Change 746874 merged by jenkins-bot:

[operations/deployment-charts@master] Allow cfssl-issuer to connect to pki.discovery.wmnet

https://gerrit.wikimedia.org/r/746874

Change 746878 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] cfssl-issuer: Bump chart version

https://gerrit.wikimedia.org/r/746878

Change 746878 merged by JMeybohm:

[operations/deployment-charts@master] cfssl-issuer: Bump chart version

https://gerrit.wikimedia.org/r/746878

Change 747124 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] cert-manager: Allow ingress to webhook from k8s master and nodes

https://gerrit.wikimedia.org/r/747124

Change 747124 merged by jenkins-bot:

[operations/deployment-charts@master] cert-manager: Allow ingress to webhook from k8s master and nodes

https://gerrit.wikimedia.org/r/747124

Change 747517 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] cert-manager: Define resources for all deployments

https://gerrit.wikimedia.org/r/747517

Change 747517 merged by jenkins-bot:

[operations/deployment-charts@master] cert-manager: Define resources for all deployments

https://gerrit.wikimedia.org/r/747517

Change 748141 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/software/cfssl-issuer@main] Upgrade simple-cfssl to forked version wmf-dev

https://gerrit.wikimedia.org/r/748141

Change 748142 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/software/cfssl-issuer@main] Use vendored dependencies for docker builds from source tree

https://gerrit.wikimedia.org/r/748142

Change 748143 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/software/cfssl-issuer@main] Add support for returning bundles instead of certs from sign calls

https://gerrit.wikimedia.org/r/748143

Change 748141 merged by JMeybohm:

[operations/software/cfssl-issuer@main] Upgrade simple-cfssl to forked version wmf-dev

https://gerrit.wikimedia.org/r/748141

Change 748142 merged by JMeybohm:

[operations/software/cfssl-issuer@main] Use vendored dependencies for docker builds from source tree

https://gerrit.wikimedia.org/r/748142

Change 748703 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] Add generic probes/metrics networkpolicy to cert-manager/cfssl

https://gerrit.wikimedia.org/r/748703

Change 748703 merged by jenkins-bot:

[operations/deployment-charts@master] Add generic probes/metrics networkpolicy to cert-manager/cfssl

https://gerrit.wikimedia.org/r/748703

Change 749151 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/software/cfssl-issuer@main] Update simple-cfssl to wmf branch

https://gerrit.wikimedia.org/r/749151

Change 749151 merged by JMeybohm:

[operations/software/cfssl-issuer@main] Update simple-cfssl to wmf branch

https://gerrit.wikimedia.org/r/749151

Change 749689 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] cfssl-issuer: Update to version 0.2.0

https://gerrit.wikimedia.org/r/749689

Change 749690 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/docker-images/production-images@master] cfssl-issuer: Update to v0.2.0

https://gerrit.wikimedia.org/r/749690

Change 748143 merged by JMeybohm:

[operations/software/cfssl-issuer@main] Add support for returning bundles instead of certs from sign calls

https://gerrit.wikimedia.org/r/748143

Change 749690 merged by JMeybohm:

[operations/docker-images/production-images@master] cfssl-issuer: Update to v0.2.0

https://gerrit.wikimedia.org/r/749690

Mentioned in SAL (#wikimedia-operations) [2022-01-03T14:46:33Z] <jayme> published image docker-registry.discovery.wmnet/cfssl-issuer:0.2.0-1 - T294560

Change 749689 merged by jenkins-bot:

[operations/deployment-charts@master] cfssl-issuer: Update to version 0.2.0

https://gerrit.wikimedia.org/r/749689

Change 751760 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Make the cfssl-issuer return bundles

https://gerrit.wikimedia.org/r/751760

Change 751760 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: Make the cfssl-issuer return bundles

https://gerrit.wikimedia.org/r/751760

JMeybohm updated the task description.

This is deployed to staging-codfw. Docs and links to all relevant repos can be found at https://wikitech.wikimedia.org/wiki/Kubernetes/cert-manager

Thanks @jbond for all the support with this!