
[cloudinfra] puppet CA cert expired
Closed, Resolved (Public)

Description

The puppet CA certificate for cloudinfra expired:

root@enc-2:~# openssl s_client -connect cloudinfra-internal-puppetserver-1.cloudinfra.eqiad1.wikimedia.cloud
 140608962757952:error:0200206F:system library:connect:Connection refused:../crypto/bio/b_sock2.c:110:
140608962757952:error:2008A067:BIO routines:BIO_connect:connect error:../crypto/bio/b_sock2.c:111:
connect:errno=111
root@enc-2:~# openssl s_client -connect -showcerts cloudinfra-internal-puppetserver-1.cloudinfra.eqiad1.wikimedia.cloud 8140
s_client: Use -help for summary.
root@enc-2:~# openssl s_client -connect -showcerts cloudinfra-internal-puppetserver-1.cloudinfra.eqiad1.wikimedia.cloud:8140
s_client: must not provide both -connect option and target parameter
s_client: Use -help for summary.
root@enc-2:~# openssl s_client -showcerts -connect cloudinfra-internal-puppetserver-1.cloudinfra.eqiad1.wikimedia.cloud 8140
s_client: must not provide both -connect option and target parameter
s_client: Use -help for summary.
root@enc-2:~# openssl s_client -connect cloudinfra-internal-puppetserver-1.cloudinfra.eqiad1.wikimedia.cloud 8140 -showcerts
s_client: Use -help for summary.
root@enc-2:~# openssl s_client -connect cloudinfra-internal-puppetserver-1.cloudinfra.eqiad1.wikimedia.cloud 8140 
s_client: must not provide both -connect option and target parameter
s_client: Use -help for summary.
root@enc-2:~# openssl s_client -connect cloudinfra-internal-puppetserver-1.cloudinfra.eqiad1.wikimedia.cloud:8140 
CONNECTED(00000003)
depth=1 CN = Puppet CA: cloudinfra-internal-puppetmaster01.cloudinfra.eqiad.wmflabs
verify error:num=10:certificate has expired
notAfter=Mar 31 20:35:10 2024 GMT
verify return:1
depth=1 CN = Puppet CA: cloudinfra-internal-puppetmaster01.cloudinfra.eqiad.wmflabs
notAfter=Mar 31 20:35:10 2024 GMT
verify return:1
depth=0 CN = cloudinfra-internal-puppetserver-1.cloudinfra.eqiad1.wikimedia.cloud
notAfter=Mar 17 20:05:11 2029 GMT
verify return:1
---
Certificate chain
 0 s:CN = cloudinfra-internal-puppetserver-1.cloudinfra.eqiad1.wikimedia.cloud
   i:CN = Puppet CA: cloudinfra-internal-puppetmaster01.cloudinfra.eqiad.wmflabs
 1 s:CN = Puppet CA: cloudinfra-internal-puppetmaster01.cloudinfra.eqiad.wmflabs
   i:CN = Puppet CA: cloudinfra-internal-puppetmaster01.cloudinfra.eqiad.wmflabs

The CA cert needs to be renewed. The current CA cert renewal docs only cover puppet 5, so they might work as-is or might not.

https://wikitech.wikimedia.org/wiki/Help:Project_puppetserver#Renewing_puppetserver_CA_certificate
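As the transcript above shows, `-connect` takes a single `host:port` argument. A minimal sketch for listing the expiry date of every certificate in a presented chain, so that an expired CA at depth 1 is easy to spot. The function names `print_chain_dates` and `remote_chain_dates` are hypothetical helpers, not part of any existing tooling:

```shell
# Read a PEM bundle on stdin and print subject and notAfter for each
# certificate in it. Each certificate is piped to its own openssl call,
# because "openssl x509" only reads the first certificate of its input.
print_chain_dates() {
    awk '
        /-----BEGIN CERTIFICATE-----/ { in_cert = 1; pem = "" }
        in_cert { pem = pem $0 "\n" }
        /-----END CERTIFICATE-----/ {
            cmd = "openssl x509 -noout -subject -enddate"
            printf "%s", pem | cmd
            close(cmd)
            in_cert = 0
        }
    '
}

# Fetch the chain presented by host:port and print the dates of each cert.
# </dev/null makes s_client exit once the handshake is done.
remote_chain_dates() {
    openssl s_client -connect "$1:$2" -showcerts </dev/null 2>/dev/null \
        | print_chain_dates
}
```

For this task the call would be `remote_chain_dates cloudinfra-internal-puppetserver-1.cloudinfra.eqiad1.wikimedia.cloud 8140`.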

Event Timeline

dcaro triaged this task as High priority. Apr 2 2024, 8:55 AM
dcaro created this task.

The expired cert is:

root@enc-2:~# openssl x509 -in /etc/ssl/certs/Puppet_Internal_CA.pem -text
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 1 (0x1)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN = Puppet CA: cloudinfra-internal-puppetmaster01.cloudinfra.eqiad.wmflabs
        Validity
            Not Before: Apr  1 20:35:10 2019 GMT
            Not After : Mar 31 20:35:10 2024 GMT

It is probably distributed from puppet secrets.

The problem seems to be the cached certificates. Moving the cached CA file aside forced the host to fetch the newer one:

root@enc-2:~# mv /var/lib/puppet/ssl/certs/ca.pem{,.old}
root@enc-2:~# puppet agent --test 
Info: Caching certificate for ca
Info: Caching certificate for ca
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for enc-2.cloudinfra.eqiad1.wikimedia.cloud
Info: Applying configuration version '(49feca7f20) Marostegui - Revert "db1197: Disable notifications"'
Notice: /Stage[main]/Sslcert::Trusted_ca/Concat[/etc/ssl/certs/wmf-ca-certificates.crt]/File[/etc/ssl/certs/wmf-ca-certificates.crt]/content:
...


root@enc-2:~# openssl x509 -in /etc/ssl/certs/Puppet_Internal_CA.pem -text
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 1 (0x1)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN = Puppet CA: cloudinfra-internal-puppetmaster01.cloudinfra.eqiad.wmflabs
        Validity
            Not Before: Mar  6 15:29:15 2024 GMT
            Not After : Mar  4 15:29:15 2034 GMT

Will remove the cached file from all the hosts using cumin.
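A more conservative variant of the `mv` step could set the cached file aside only when it is actually expired or about to expire. This is a sketch under that assumption; `set_aside_if_expiring` is a hypothetical helper, and the path in the usage note is the one from the transcript above:

```shell
# Move a cached CA cert to <file>.old if it expires within the given
# number of seconds (default 0, i.e. already expired).
# Returns 2 if the file cannot be parsed as a certificate.
set_aside_if_expiring() {
    pem=$1
    within=${2:-0}

    # Refuse to touch anything openssl cannot parse as a certificate.
    openssl x509 -in "$pem" -noout >/dev/null 2>&1 || return 2

    # -checkend exits 0 when the cert is still valid after $within seconds.
    if openssl x509 -in "$pem" -noout -checkend "$within" >/dev/null 2>&1; then
        echo "still valid, keeping $pem"
        return 0
    fi

    mv "$pem" "$pem.old"
    echo "moved $pem to $pem.old"
}
```

Usage would be `set_aside_if_expiring /var/lib/puppet/ssl/certs/ca.pem` followed by a puppet agent run, which re-downloads the CA cert as seen in the output above.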

After re-arming the keyholder, I ran the command with cumin, and puppet is back to running as usual:

root@cloud-cumin-04:~# keyholder arm  # see the cumin-openstack-master-key-passphrase file in pw


root@cloud-cumin-04:~# cumin 'D{cloud-cumin-04.cloudinfra.eqiad1.wikimedia.cloud,cloudinfra-db[03-04].cloudinfra.eqiad1.wikimedia.cloud,cloudinfra-idp-1.cloudinfra.eqiad1.wikimedia.cloud,enc-1.cloudinfra.eqiad1.wikimedia.cloud,syslog-server-audit[01-02].cloudinfra.eqiad1.wikimedia.cloud}' 'mv /var/lib/puppet/ssl/certs/ca.pem{,.old}&& run-puppet-agent'
100.0% (7/7) success ratio (>= 100.0% threshold) of nodes successfully executed all commands.
dcaro claimed this task.
taavi subscribed.

Something is still broken somewhere:

taavi@cloudinfra-internal-puppetserver-1:~$ sudo puppet node clean mx-out03.cloudinfra.eqiad1.wikimedia.cloud
Error: Failed connecting to https://cloudinfra-internal-puppetserver-1.cloudinfra.eqiad1.wikimedia.cloud:8140/puppet-ca/v1/certificate_status/
  Root cause: SSL_connect returned=1 errno=0 peeraddr=172.16.2.88:8140 state=error: certificate verify failed (certificate has expired)
Error: Try 'puppet help node clean' for usage
root@cloudinfra-internal-puppetserver-1:/srv/puppet/server/ssl# mv certs/ca.pem /root/ca_old2.pem
root@cloudinfra-internal-puppetserver-1:/srv/puppet/server/ssl# cp ca/ca_crt.pem certs/ca.pem
root@cloudinfra-internal-puppetserver-1:~# puppet node clean mx-out03.cloudinfra.eqiad1.wikimedia.cloud 
Notice: Certificate for mx-out03.cloudinfra.eqiad1.wikimedia.cloud has been revoked
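The fix above suggests that the server's own `certs/ca.pem` had drifted from `ca/ca_crt.pem`. A minimal sketch for checking that two PEM files hold the same certificate by comparing SHA-256 fingerprints; `cert_fingerprint` and `same_cert` are hypothetical helper names:

```shell
# Print the SHA-256 fingerprint of the first certificate in a PEM file.
cert_fingerprint() {
    openssl x509 -in "$1" -noout -fingerprint -sha256
}

# Exit 0 if both files contain the same certificate, 1 if they differ,
# 2 if either file cannot be read as a certificate.
same_cert() {
    fp_a=$(cert_fingerprint "$1" 2>/dev/null) || return 2
    fp_b=$(cert_fingerprint "$2" 2>/dev/null) || return 2
    [ "$fp_a" = "$fp_b" ]
}
```

Run from `/srv/puppet/server/ssl`, `same_cert certs/ca.pem ca/ca_crt.pem || echo "CA copies differ"` would flag the mismatch before `puppet node clean` fails on it.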