Puppet certificate missing subjectAltName
Open, Medium, Public

Description

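The generated Puppet certificates are missing the subjectAltName (SAN) extension. RFC 2818 makes the SAN the preferred way to identify a server and deprecates relying on the Common Name (CN), which matters when these certificates are used for HTTPS traffic, for example for the PuppetDB API on nitrogen.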

Recent libraries like Python's urllib3 emit a warning if the certificate does not have a subjectAltName, and will remove support for falling back to the CN in future releases.

/usr/lib/python2.7/dist-packages/urllib3/connection.py:337: SubjectAltNameWarning: Certificate for nitrogen.eqiad.wmnet has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning

We should consider adding the subjectAltName with Puppet option dns_alt_names or use different certificates, see also T150822.
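
One way to audit which hosts are affected is to inspect the decoded certificate for DNS entries in subjectAltName. The sketch below is only an illustration: it assumes the certificate is already available as the dictionary that Python's `ssl.SSLSocket.getpeercert()` returns, and the helper names (`san_dns_names`, `common_names`, `needs_reissue`) and the sample certificates are hypothetical.

```python
from typing import Any, Dict, List


def san_dns_names(cert: Dict[str, Any]) -> List[str]:
    """Return the DNS entries of the subjectAltName extension, if any."""
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]


def common_names(cert: Dict[str, Any]) -> List[str]:
    """Return every commonName found in the certificate subject."""
    return [
        value
        for rdn in cert.get("subject", ())
        for key, value in rdn
        if key == "commonName"
    ]


def needs_reissue(cert: Dict[str, Any]) -> bool:
    """True when the certificate names its host only through the CN."""
    return not san_dns_names(cert) and bool(common_names(cert))


# Hypothetical certificates, shaped like ssl.SSLSocket.getpeercert() output.
cn_only = {"subject": ((("commonName", "nitrogen.eqiad.wmnet"),),)}
with_san = {
    "subject": ((("commonName", "nitrogen.eqiad.wmnet"),),),
    "subjectAltName": (("DNS", "nitrogen.eqiad.wmnet"),),
}

print(needs_reissue(cn_only))   # True: only the CN identifies the host
print(needs_reissue(with_san))  # False: a DNS SAN entry is present
```

A certificate flagged by `needs_reissue` is one that would trigger the SubjectAltNameWarning shown above.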

Event Timeline

ema triaged this task as Medium priority. Mar 1 2017, 9:50 AM

Change 340107 had a related patch set uploaded (by Volans; owner: Giuseppe Lavagetto):
[operations/puppet] utils: add create_ecdsa_cert

https://gerrit.wikimedia.org/r/340107

Change 340107 merged by Giuseppe Lavagetto:
[operations/puppet@production] utils: add create_ecdsa_cert

https://gerrit.wikimedia.org/r/340107

Looks like this is still the case for certs issued by the Puppet CA. The dns_alt_names documentation lists the steps to turn this on, though: https://docs.puppet.com/puppet/latest/configuration.html#dnsaltnames

The create_ecdsa_cert script was patched already, though our puppet server doesn't use dns_alt_names. I think it'd be ok in this case to just regenerate the service certificate and leave the puppet cert alone. @Volans @Joe ?

Change 446789 had a related patch set uploaded (by Volans; owner: Volans):
[operations/puppet@production] cumin: remove disable warning for urllib3

https://gerrit.wikimedia.org/r/446789

Change 446789 merged by Volans:
[operations/puppet@production] cumin: remove disable warning for urllib3

https://gerrit.wikimedia.org/r/446789

FYI, urllib3 version 2, released in April 2023, removed the fallback from subjectAltName to commonName, so it will not be able to connect to internal servers whose certificates lack a subjectAltName.

This has already started to be a problem (T345309, for example) and will only increase over time as more systems and packages pull in the newer version. It's theoretically possible to restore the old behavior via configuration, but this might be difficult to do through layers of dependencies.

So the sooner we actually fix this, the better 😁
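
To make the change in behavior concrete, here is a simplified model of the hostname check with and without the commonName fallback. It is not urllib3's actual code: it only does exact, case-insensitive matching (no wildcards or IP addresses), and `accepts`, `allow_cn_fallback` and the sample certificate are hypothetical names chosen for illustration.

```python
from typing import Any, Dict


def accepts(cert: Dict[str, Any], hostname: str, allow_cn_fallback: bool) -> bool:
    """Simplified hostname check over a getpeercert()-style dictionary."""
    wanted = hostname.lower()
    san_dns = [
        value.lower()
        for kind, value in cert.get("subjectAltName", ())
        if kind == "DNS"
    ]
    if san_dns:
        # When DNS SAN entries exist, they are the only identity considered.
        return wanted in san_dns
    if allow_cn_fallback:
        # Legacy behavior: no DNS SAN entries, so compare against the CN.
        for rdn in cert.get("subject", ()):
            for key, value in rdn:
                if key == "commonName" and value.lower() == wanted:
                    return True
    return False


# Hypothetical CN-only certificate, like the ones issued by the Puppet CA.
cn_only = {"subject": ((("commonName", "nitrogen.eqiad.wmnet"),),)}

print(accepts(cn_only, "nitrogen.eqiad.wmnet", allow_cn_fallback=True))   # True
print(accepts(cn_only, "nitrogen.eqiad.wmnet", allow_cn_fallback=False))  # False
```

With `allow_cn_fallback=False` the CN-only certificate is rejected, which corresponds to the failure described above for clients that pull in urllib3 2.x.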

It's worth noting that once services have been migrated to the new Puppet 7 infrastructure, agent certificates will have a correct SAN entry. However, the correct fix for this is to migrate any services relying on the Puppet agent certificates for TLS to the PKI infrastructure.

It's worth noting that once services have been migrated to the new Puppet 7 infrastructure, agent certificates will have a correct SAN entry.

Thank you, good to know!

However, the correct fix for this is to migrate any services relying on the Puppet agent certificates for TLS to the PKI infrastructure.

Could you help me understand what this would look like for us? I've looked at the PKI docs on Wikitech but am still pretty confused.

We have a Python package that uses the Presto Python client to connect to the analytics Presto server, and that ultimately relies on urllib3. There is also Kerberos authentication involved, but I don't think it's the source of the problem. You can see the code at https://github.com/wikimedia/wmfdata-python/blob/main/wmfdata/presto.py#L34.

This is something that will need fixing on the server side. The good news is that @BTullis seems to have already started work on this as part of T273642, and I can see that an-test-presto1001.eqiad.wmnet has already been updated to use the new PKI infrastructure, so you should already be able to test against that. I'd recommend reaching out to Ben for an update on when this might make it to production.