We currently issue TLS certificates for the service-proxy (envoy) manually via cergen (https://wikitech.wikimedia.org/wiki/Kubernetes/Enabling_TLS#Create_and_place_certificates).
When cert-manager (T294560) is deployed to all clusters, we should move the service-proxy certificate creation away from cergen and into the helm charts. We should ensure that we issue those certificates for cluster-internal names as well as the "usual" external names.
When that is done, the validation of SANs in istio-ingressgateway should be limited to the internal FQDN of a service (https://gerrit.wikimedia.org/g/operations/deployment-charts/+/refs/changes/74/732374/15/common_templates/0.4/_ingress_helpers.tpl#180)
todos:
- add module code to support cert-manager certificates
- deploy at least one chart for >2 days (cert rotation in staging) including ingress
- enable cert-manager by default in helmfile globals
- update all charts and switch all services to cert-manager
- apertium
- api-gateway https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/972404
- api-gateway
- rest-gateway
- benthos https://gerrit.wikimedia.org/r/c/969366
- blubberoid
- calculator-service https://gerrit.wikimedia.org/r/c/967403
- cassandra-http-gateway
- image-suggestion
- changeprop https://gerrit.wikimedia.org/r/c/958484
- chromium-render https://gerrit.wikimedia.org/r/c/958473
- citoid
- cxserver
- datahub https://gerrit.wikimedia.org/r/c/969345
- developer-portal https://gerrit.wikimedia.org/r/c/958479
- eventgate https://gerrit.wikimedia.org/r/c/959181
- eventstreams https://gerrit.wikimedia.org/r/c/967402
- fastapi-app (replaced by python-webapp)
- flink-app
- flink-session-cluster https://gerrit.wikimedia.org/r/c/969343
- function-evaluator
- function-orchestrator
- ipoid
- linkrecommendation
- machinetranslation https://gerrit.wikimedia.org/r/c/960625
- mathoid https://gerrit.wikimedia.org/r/c/953261
- mediawiki
- miscweb https://gerrit.wikimedia.org/r/c/958476
- mobileapps https://gerrit.wikimedia.org/r/c/967405
- push-notifications https://gerrit.wikimedia.org/r/961078
- python-webapp
- recommendation-api https://gerrit.wikimedia.org/r/c/967406
- shellbox https://gerrit.wikimedia.org/r/c/967410
- shellbox (score)
- shellbox-constraints
- shellbox-media
- shellbox-syntaxhighlight
- shellbox-timeline
- similar-users https://gerrit.wikimedia.org/r/c/967473
- tegola-vector-tiles https://gerrit.wikimedia.org/r/961077
- termbox https://gerrit.wikimedia.org/r/c/967412
- thumbor
- toolhub
- wikifeeds https://gerrit.wikimedia.org/r/c/967414
- zotero https://gerrit.wikimedia.org/r/c/967415
- remove cergen code from helm mesh module, scaffolding etc. as well as its values from all the fixtures
- remove trust for the various subjectAltNames from cergen in ingress module (see: ingress.istio.destinationrule)
- drop the certs from private puppet
To update a chart:
Check cergen config for additional alt names in:
/srv/private/modules/secret/secrets/certificates/certificate.manifests.d/kube_services.certs.yaml
Note: The following domain names can be omitted as they are included by default:
- discovery.wmnet
- svc.codfw.wmnet
- svc.eqiad.wmnet
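The check above can be sketched as a small shell filter. This is only an illustration: `filter_extra_fqdns` is a hypothetical helper, the sample names (including `apertium.example.org`) are made up, and it assumes that "included by default" means any name ending in one of the three domains listed above. Names that survive the filter are candidates for mesh.certmanager.extraFQDNs.

```shell
# Hypothetical helper (not part of any existing tooling): reads candidate
# alt names on stdin, one per line, and prints only those that do NOT end
# in one of the default domains.
# Assumption: a name is "included by default" if it ends in
# .discovery.wmnet, .svc.codfw.wmnet or .svc.eqiad.wmnet.
filter_extra_fqdns() {
  # grep exits with status 1 when nothing is left; treat that as success.
  grep -Ev '\.(discovery\.wmnet|svc\.codfw\.wmnet|svc\.eqiad\.wmnet)$' || true
}

# Example with made-up names; only the last one is outside the defaults.
printf '%s\n' \
  'apertium.discovery.wmnet' \
  'apertium.svc.codfw.wmnet' \
  'apertium.svc.eqiad.wmnet' \
  'apertium.example.org' | filter_extra_fqdns
```

The alt names themselves still have to be read from the cergen manifest by hand; the filter only removes the ones that no longer need to be listed.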
Update the chart and deployment config
CHART=apertium
sextant update charts/${CHART}/ mesh.deployment
git add charts/${CHART}/
# If additional alt names are required, they must be added as a list under mesh.certmanager.extraFQDNs
vim helmfile.d/services/${CHART}/values.yaml
git add helmfile.d/services/${CHART}/values.yaml
git commit -m "Update ${CHART} to use certmanager certs" -m "This updates the chart to use a certmanager instead of cergen certificates" -m "Bug: T300033"