
Swift TLS certificates will expire soon (14 April)
Closed, Resolved (Public)

Description

The Swift TLS certificates (signed by the Puppet CA) will expire on 14 April; they last 5 years.

They need renewing before then. The renewal process isn't in the Swift docs, although a manual process is noted on wikitech.

I don't think the sometimes-mooted move to cfssl (T356412) would avoid the need for a manual rotation at some point in the future.

Event Timeline

I have updated the docs for the renewal use case. I don't think we need to change anything in the cert's manifest for a renewal. I have also added a comment about the cert's revocation in the Puppet CA: it shouldn't cause any harm to running clients, since we don't yet have anything in production that checks revoked certs via the Puppet CA.

@MatthewVernon we could do something like the following:

  • Clean the cert in the Puppet CA. At this point the cert is revoked in the Puppet CA, but clients will not notice, since nothing in our internal infra checks cert revocation (other SREs confirmed).
  • Manually remove the current cergen-generated files in the puppet private repo, then re-run the command to recreate them. At this point cergen will issue a new cert request to the Puppet CA, which will comply since no active cert has swift.discovery.wmnet as CN/SAN (there is no option to simply renew).
  • We disable puppet on swift frontends
  • We copy the new public cert to puppet public, copy the private key to the right spot in puppet private (if needed; I don't recall, maybe not) and file a puppet change.
  • We merge the change for the public repo.
  • Then we depool one node, run puppet and see what envoy does (auto-reloads, needs a restart, etc.).
  • We test the new cert and if everything looks good, we repool.
  • After some other sanity checks we can proceed with the rest.

How does that sound? I can help/assist if needed!
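For the "test the new cert" step in the plan above, a minimal Python sketch of one possible check follows. It is not the procedure used on this task. It assumes the depooled frontend serves TLS on port 443, that the CA bundle passed as `cafile` contains the Puppet CA, and that an exact SAN match on swift.discovery.wmnet is enough (wildcards are not handled). The function names `fetch_peer_cert` and `summarize_cert` are hypothetical.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_peer_cert(address, server_name, port=443, cafile=None, timeout=5.0):
    """Connect to one frontend at `address`, validating the cert against `server_name`.

    `address` is the depooled node (hypothetical); `server_name` is the
    service name expected in the cert, e.g. swift.discovery.wmnet.
    """
    ctx = ssl.create_default_context(cafile=cafile)
    with socket.create_connection((address, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=server_name) as tls:
            # Returns a dict with keys such as 'notAfter' and 'subjectAltName'.
            return tls.getpeercert()


def summarize_cert(cert, expected_name, now=None):
    """Return (days_left, name_ok) for a dict shaped like getpeercert() output."""
    now = now or datetime.now(timezone.utc)
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(cert["notAfter"]), tz=timezone.utc
    )
    days_left = (expires - now).days
    # Exact match only: this sketch does not expand wildcard SAN entries.
    dns_names = [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
    return days_left, expected_name in dns_names


# Usage (not run here, needs network access to the depooled node):
#   cert = fetch_peer_cert("<frontend host>", "swift.discovery.wmnet", cafile="<CA bundle>")
#   print(summarize_cert(cert, "swift.discovery.wmnet"))
```

Keeping the network call separate from the summary makes the summary easy to exercise against a saved certificate dict before any node is depooled.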

Mentioned in SAL (#wikimedia-operations) [2024-04-09T07:54:02Z] <Emperor> puppet cert clean swift_codfw T361844

Change #1018190 had a related patch set uploaded (by MVernon; author: MVernon):

[operations/puppet@production] SSL: update swift_codfw TLS cert

https://gerrit.wikimedia.org/r/1018190

Change #1018190 merged by MVernon:

[operations/puppet@production] SSL: update swift_codfw TLS cert

https://gerrit.wikimedia.org/r/1018190

codfw done OK, cert now says Not After : Apr 8 08:00:23 2029 GMT.

Mentioned in SAL (#wikimedia-operations) [2024-04-09T10:02:58Z] <Emperor> puppet cert clean swift_eqiad T361844

Change #1018227 had a related patch set uploaded (by MVernon; author: MVernon):

[operations/puppet@production] SSL: update swift_eqiad TLS cert

https://gerrit.wikimedia.org/r/1018227

Change #1018227 merged by MVernon:

[operations/puppet@production] SSL: update swift_eqiad TLS cert

https://gerrit.wikimedia.org/r/1018227

eqiad done, Not After : Apr 8 10:04:14 2029 GMT
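The two "Not After" values quoted for codfw and eqiad can be compared mechanically. The sketch below is an illustration only: it assumes the lines have the shape printed in this task, with English month abbreviations and a trailing GMT, and the helper name `parse_not_after` is hypothetical.

```python
import re
from datetime import datetime, timezone

# Matches e.g. "Not After : Apr 8 08:00:23 2029 GMT" anywhere in a line.
NOT_AFTER_RE = re.compile(r"Not After\s*:\s*(.+?)\s+GMT")


def parse_not_after(line):
    """Extract the expiry timestamp from a 'Not After' line as an aware UTC datetime."""
    match = NOT_AFTER_RE.search(line)
    if match is None:
        raise ValueError(f"no 'Not After' field found in: {line!r}")
    # Collapse repeated spaces (single-digit days may be space-padded).
    stamp = " ".join(match.group(1).split())
    # %b is locale-dependent; this assumes an English/C locale.
    parsed = datetime.strptime(stamp, "%b %d %H:%M:%S %Y")
    return parsed.replace(tzinfo=timezone.utc)


codfw = parse_not_after("codfw done OK, cert now says Not After : Apr 8 08:00:23 2029 GMT.")
eqiad = parse_not_after("eqiad done, Not After : Apr 8 10:04:14 2029 GMT")

# Gap between the two expiry times, i.e. how far apart the next renewals fall due.
print(eqiad - codfw)
```

Both certificates expire on the same day in April 2029, so the next rotation for the two sites can be planned together.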

MatthewVernon claimed this task.

I've added a new section to Swift/How_To that documents this process (and links to the Cergen docs). I've also arranged for https://wikitech.wikimedia.org/wiki/TLS/Runbook#swift-https:443, which is the link that appears in the alert, to point to something useful.