After migrating the PKI infrastructure to Puppet 7, we started to see issues with certificate renewal. The main error seen was:
2023/10/31 10:10:37 [INFO] generate received request
2023/10/31 10:10:37 [INFO] received CSR
2023/10/31 10:10:37 [INFO] generating key: ecdsa-256
2023/10/31 10:10:37 [INFO] encoded CSR
2023/10/31 10:10:37 [INFO] Using client auth with mutual-tls-cert: /etc/cfssl/mutual_tls_client_cert.pem and mutual-tls-key: /var/lib/puppet/ssl/private_keys/pki2002.codfw.wmnet.pem
2023/10/31 10:10:37 [INFO] Using trusted CA from tls-remote-ca: /etc/ssl/certs/wmf-ca-certificates.crt
{"code":7400,"message":"failed POST to https://pki.discovery.wmnet:443/api/v1/cfssl/authsign: Post \"https://pki.discovery.wmnet:443/api/v1/cfssl/authsign\": x509: issuer name does not match subject from issuing certificate"}
which was initially attributed to the strict certificate processing in Go combined with the lax SSL implementation in Puppet. Correction: it seems it is actually Go that is in the wrong here.
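For context on where this error comes from: as far as I understand it, Go's crypto/x509 compares the DER-encoded issuer of a certificate byte-for-byte against the DER-encoded subject of the candidate issuing certificate during chain validation, so two names that print identically can still fail to match if they are encoded differently. The following is a minimal sketch of that comparison, not the actual cfssl or Go verifier code. The helper names (issuerMatchesSubject, newCert) and the "Example" certificate names are made up for illustration, and the certificates are generated in memory rather than taken from our PKI.

```go
package main

import (
	"bytes"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// issuerMatchesSubject reports whether the child's DER-encoded issuer is
// byte-for-byte identical to the parent's DER-encoded subject.
// (Hypothetical helper, written for this example.)
func issuerMatchesSubject(child, parent *x509.Certificate) bool {
	return bytes.Equal(child.RawIssuer, parent.RawSubject)
}

// newCert creates a throwaway ECDSA P-256 certificate. If parent is nil the
// certificate is self-signed, otherwise it is signed by parent/parentKey.
// (Hypothetical helper, written for this example.)
func newCert(cn string, serial int64, isCA bool, parent *x509.Certificate, parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	usage := x509.KeyUsageDigitalSignature
	if isCA {
		usage |= x509.KeyUsageCertSign
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(serial),
		Subject:               pkix.Name{CommonName: cn},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(time.Hour),
		IsCA:                  isCA,
		BasicConstraintsValid: true,
		KeyUsage:              usage,
	}
	signer, signerKey := tmpl, key // self-signed by default
	if parent != nil {
		signer, signerKey = parent, parentKey
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, signer, &key.PublicKey, signerKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

func main() {
	caA, caAKey, err := newCert("Example CA A", 1, true, nil, nil)
	if err != nil {
		panic(err)
	}
	caB, _, err := newCert("Example CA B", 2, true, nil, nil)
	if err != nil {
		panic(err)
	}
	leaf, _, err := newCert("leaf.example", 3, false, caA, caAKey)
	if err != nil {
		panic(err)
	}
	fmt.Println("leaf issuer matches CA A subject:", issuerMatchesSubject(leaf, caA))
	fmt.Println("leaf issuer matches CA B subject:", issuerMatchesSubject(leaf, caB))
}
```

Running the same kind of byte comparison on the real intermediate and the certificate it issued (parsed from their PEM files) might help confirm whether the two names differ only in encoding.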
It was also noticed that the OCSP refresh process was failing with:
ERROR:root:debmonitor issue with SQL query: (2003, "Can't connect to MySQL server on 'm1-master.eqiad.wmnet' ([SSL:CERTIFICATE_VERIFY_FAILED] certificate veri>
which was fixed by updating the CA trust bundle.
To fix the issue, I have now de-pooled pki2002, which is still using Puppet 7 so we can debug, and rolled back pki1001 to Puppet 5.
The first issue in codfw seems to have occurred at 00:05:33, and the first in eqiad at 00:23:53.