Migrate purged away from cergen-issued certificate
Open, Needs Triage (Public)

Description

cergen is our legacy tooling to manage/generate TLS certificates (https://wikitech.wikimedia.org/wiki/Cergen). It has been replaced by an installation of cfssl (https://wikitech.wikimedia.org/wiki/PKI), which the majority of services now use.

Our cergen installation is co-hosted on one of the Puppet 5 master frontends (puppetmaster1001), which runs Buster. cergen is based on legacy libraries: it uses networkx v1, which is incompatible with current networkx releases (networkx 2 was released in 2017). Even when the puppetmasters were moved to Buster, this needed a hack to build a co-installable legacy package in a component (T235405).

Instead of forward-porting it yet again to the new installation, we'll use the Puppet 5 -> Puppet 7 migration to also phase out cergen and only use cfssl.

Most of these certs are used by Envoy, and our Puppet integration makes the switch relatively straightforward: set the profile::tlsproxy::envoy::ssl_provider Hiera flag to "cfssl" and specify the SNI names via profile::tlsproxy::envoy::cfssl_options/hosts.

Some examples for this can be found at
https://github.com/wikimedia/operations-puppet/commit/66fbddeac3a4b2dfa1d8e19a49cc649dcb745f18
https://github.com/wikimedia/operations-puppet/commit/a00d0441b4509e736d8abd6ff63f25224e306239
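
As a rough illustration of the Envoy switch described above, the Hiera change might look like the following sketch. The two keys are the ones named in this task. The file the snippet would live in and the SNI host names are hypothetical placeholders, not taken from the linked commits.

```yaml
# Sketch only: hieradata for a hypothetical Envoy-fronted role.
# Switch the TLS provider from cergen to cfssl.
profile::tlsproxy::envoy::ssl_provider: cfssl
# SNI names to include in the issued certificate.
profile::tlsproxy::envoy::cfssl_options:
  hosts:
    - example-service.discovery.wmnet   # hypothetical name
    - example-service.svc.eqiad.wmnet   # hypothetical name
```

The linked commits remain the authoritative reference for the exact layout used in operations-puppet.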

For use cases outside of Envoy, the profile::pki::get_cert define provides a convenient method to request certificates. An example of how the gradual migration was implemented for the Ganeti RAPI endpoint can be found at https://github.com/wikimedia/operations-puppet/commit/98350d2dff51bb9bf57263fe50f409374892ae1d

For Traffic, only a single cert among the certificate YAML specs in /srv/private/modules/secret/secrets/certificates/certificate.manifests.d needs to be moved to PKI/cfssl: the certificate used by purged.

Event Timeline

CR: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1019866

Description of changes:

  • Add a feature flag profile::cache::purged::use_pki to control whether to use cfssl or cergen
  • Retrieve certificate from cfssl's discovery endpoint if use_pki is true
  • Set the feature flag profile::cache::purged::use_pki to true
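
A gradual rollout with this flag could be expressed as a per-host or per-site Hiera override, roughly as sketched below. The key name comes from the change description above. The file path in the comment is an assumption about where such an override would live, not something stated in the CR.

```yaml
# Sketch only: hypothetical override file, e.g. hieradata/hosts/cp6001.yaml
# Opt this host into cfssl-issued certificates for purged.
profile::cache::purged::use_pki: true
```

Hosts without the override would presumably keep using the cergen-issued certificate until the flag is enabled for them.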

Change #1019866 had a related patch set uploaded (by CDobbins; author: CDobbins):

[operations/puppet@production] purged: add PKI cert handling

https://gerrit.wikimedia.org/r/1019866

Change #1019866 merged by CDobbins:

[operations/puppet@production] purged: add PKI cert handling

https://gerrit.wikimedia.org/r/1019866

Change #1032106 had a related patch set uploaded (by CDobbins; author: CDobbins):

[operations/puppet@production] purged: add Puppet overrides to use cfssl for certs in ulsfo

https://gerrit.wikimedia.org/r/1032106

Change #1035538 had a related patch set uploaded (by CDobbins; author: CDobbins):

[operations/puppet@production] purged: set use_pki to true for drmrs

https://gerrit.wikimedia.org/r/1035538

Change #1035538 merged by CDobbins:

[operations/puppet@production] purged: set use_pki to true for cp6001 in drmrs

https://gerrit.wikimedia.org/r/1035538