Project | Branch | Lines +/- | Subject |
---|---|---|---|
operations/puppet | production | +2 -2 | switch esams & eqsin to lets-encrypt; globalsign OCSP unhappy |
Details
Event Timeline
Change 579459 had a related patch set uploaded (by CDanis; owner: CDanis):
[operations/puppet@production] switch esams & eqsin to lets-encrypt; globalsign OCSP unhappy
Sample error log:
Mar 12 05:42:01 cp3050 CRON[9853]: (root) CMD (/usr/local/sbin/update-ocsp-all 2>&1 | logger -t update-ocsp-all)
[...]
Mar 12 05:42:46 cp3050 update-ocsp-all[9863]: Traceback (most recent call last):
Mar 12 05:42:46 cp3050 update-ocsp-all[9863]: File "/usr/local/sbin/update-ocsp", line 290, in <module>
Mar 12 05:42:46 cp3050 update-ocsp-all[9863]: main()
Mar 12 05:42:46 cp3050 update-ocsp-all[9863]: File "/usr/local/sbin/update-ocsp", line 283, in main
Mar 12 05:42:46 cp3050 update-ocsp-all[9863]: certs_fetch_ocsp(out_tempfile, args)
Mar 12 05:42:46 cp3050 update-ocsp-all[9863]: File "/usr/local/sbin/update-ocsp", line 208, in certs_fetch_ocsp
Mar 12 05:42:46 cp3050 update-ocsp-all[9863]: (ocsp_text, ocsp_err) = check_output_errtext(cmd)
Mar 12 05:42:46 cp3050 update-ocsp-all[9863]: File "/usr/local/sbin/update-ocsp", line 101, in check_output_errtext
Mar 12 05:42:46 cp3050 update-ocsp-all[9863]: (" ".join(args), p.returncode, p_err))
Mar 12 05:42:46 cp3050 update-ocsp-all[9863]: Exception: Command openssl ocsp -resp_text -respout /var/cache/ocsp/update-ocsp-TpDAUp.tmp/globalsign-2019-ecdsa-unified.ocsp -issuer /etc/ssl/certs/7cfeae01.0 -verify_other /etc/ssl/certs/7cfeae01.0 -path http://ocsp.globalsign.com/gseccovsslca2018 -host webproxy.esams.wmnet:8080 -cert /etc/ssl/localcerts/globalsign-2019-ecdsa-unified.crt failed with exit code 1, stderr:
Mar 12 05:42:46 cp3050 update-ocsp-all[9863]: Error querying OCSP responder
Mar 12 05:42:46 cp3050 update-ocsp-all[9863]: 139996330263680:error:27076072:OCSP routines:parse_http_line1:server response error:../crypto/ocsp/ocsp_ht.c:260:Code=503,Reason=Service Unavailable
Mar 12 05:42:46 cp3050 update-ocsp-all[9863]:
Mar 12 05:42:46 cp3050 update-ocsp-all[9863]: OCSP update failed for /etc/update-ocsp.d/globalsign-2019-ecdsa-unified.conf
Mar 12 05:42:48 cp3050 update-ocsp-all[9863]: Traceback (most recent call last):
Mar 12 05:42:48 cp3050 update-ocsp-all[9863]: File "/usr/local/sbin/update-ocsp", line 290, in <module>
Mar 12 05:42:48 cp3050 update-ocsp-all[9863]: main()
Mar 12 05:42:48 cp3050 update-ocsp-all[9863]: File "/usr/local/sbin/update-ocsp", line 283, in main
Mar 12 05:42:48 cp3050 update-ocsp-all[9863]: certs_fetch_ocsp(out_tempfile, args)
Mar 12 05:42:48 cp3050 update-ocsp-all[9863]: File "/usr/local/sbin/update-ocsp", line 208, in certs_fetch_ocsp
Mar 12 05:42:48 cp3050 update-ocsp-all[9863]: (ocsp_text, ocsp_err) = check_output_errtext(cmd)
Mar 12 05:42:48 cp3050 update-ocsp-all[9863]: File "/usr/local/sbin/update-ocsp", line 101, in check_output_errtext
Mar 12 05:42:48 cp3050 update-ocsp-all[9863]: (" ".join(args), p.returncode, p_err))
Mar 12 05:42:48 cp3050 update-ocsp-all[9863]: Exception: Command openssl ocsp -resp_text -respout /var/cache/ocsp/update-ocsp-uWwQjw.tmp/globalsign-2019-rsa-unified.ocsp -issuer /etc/ssl/certs/036624bb.0 -verify_other /etc/ssl/certs/036624bb.0 -path http://ocsp.globalsign.com/gsrsaovsslca2018 -host webproxy.esams.wmnet:8080 -cert /etc/ssl/localcerts/globalsign-2019-rsa-unified.crt failed with exit code 1, stderr:
Mar 12 05:42:48 cp3050 update-ocsp-all[9863]: Error querying OCSP responder
Mar 12 05:42:48 cp3050 update-ocsp-all[9863]: 140218955576448:error:27076072:OCSP routines:parse_http_line1:server response error:../crypto/ocsp/ocsp_ht.c:260:Code=503,Reason=Service Unavailable
Mar 12 05:42:48 cp3050 update-ocsp-all[9863]:
Mar 12 05:42:48 cp3050 update-ocsp-all[9863]: OCSP update failed for /etc/update-ocsp.d/globalsign-2019-rsa-unified.conf
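When the same failure repeats across many hosts, it can help to condense a log like the one above before reading it. The following is a minimal Python sketch and not part of update-ocsp: the function name `summarize_ocsp_failures` and both regular expressions are assumptions based only on the message shapes visible in the sample.

```python
import re
import sys
from collections import Counter

# Patterns inferred from the sample log only; other message shapes are not handled.
FAILED_CONF = re.compile(r"OCSP update failed for (\S+)")
HTTP_ERROR = re.compile(r"Code=(\d{3}),Reason=")


def summarize_ocsp_failures(log_text):
    """Return (failed config paths, Counter of HTTP status codes) found in log_text."""
    failed = FAILED_CONF.findall(log_text)
    codes = Counter(int(code) for code in HTTP_ERROR.findall(log_text))
    return failed, codes


if __name__ == "__main__":
    # Read the log from standard input and print a short summary.
    failed, codes = summarize_ocsp_failures(sys.stdin.read())
    for conf in failed:
        print("failed:", conf)
    for code, count in sorted(codes.items()):
        print(f"HTTP {code}: {count}")
```

Because it searches the whole text rather than line by line, the sketch works whether the log arrives one entry per line or flattened into a single line.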
These failures seem to be triggered by the outage that GlobalSign reported at https://www.globalsign.com/en/status:
Updated 12 March 2020, 5:25 pm EDT
We are still working on recovery measures for the outages reported yesterday. We will share and publish an incident report as soon as we have completed our investigation. We deeply apologize for any inconvenience this may cause to our customers. Please do not hesitate to contact us with any questions.
Impacted services:
- Certificate application and issue
- Certificate revocation procedure
- Confirmation of certificate revocation information by OCSP / CRL is intermittent
- Time stamping
I'm going to trigger OCSP response updates in esams & eqsin first; this should get rid of the issue.
Mentioned in SAL (#wikimedia-operations) [2020-03-13T06:05:45Z] <vgutierrez> triggering OCSP response updates in esams - T247584
Mentioned in SAL (#wikimedia-operations) [2020-03-13T06:12:45Z] <vgutierrez> triggering OCSP response updates in eqsin - T247584
Mentioned in SAL (#wikimedia-operations) [2020-03-13T06:16:17Z] <vgutierrez> triggering OCSP response updates in eqiad,codfw and ulsfo - T247584
Change 579459 abandoned by Vgutierrez:
switch esams & eqsin to lets-encrypt; globalsign OCSP unhappy
For future reference, an OCSP response update can be triggered like this:
sudo -i cumin -b1 'A:cp-eqiad' "/usr/local/sbin/update-ocsp-all 2>&1 | logger -t update-ocsp-all"
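The same run can be repeated for several sites. Below is a minimal Python sketch that only builds the command lines and prints them for review. `A:cp-eqiad` and the remote command come from the comment above; the other `A:cp-<site>` aliases and the helper name `build_update_commands` are assumptions that should be checked against the actual cumin configuration.

```python
import shlex

# Remote command copied from the comment above.
REMOTE_CMD = "/usr/local/sbin/update-ocsp-all 2>&1 | logger -t update-ocsp-all"


def build_update_commands(sites, batch_size=1):
    """Return one cumin argv list per site.

    Assumes every site has an alias of the form A:cp-<site>; only
    A:cp-eqiad appears in the original comment.
    """
    return [
        ["sudo", "-i", "cumin", f"-b{batch_size}", f"A:cp-{site}", REMOTE_CMD]
        for site in sites
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for argv in build_update_commands(["esams", "eqsin", "eqiad", "codfw", "ulsfo"]):
        print(shlex.join(argv))
```

Keeping the batch size at 1, as in the original command, means hosts are updated one at a time, which seems the safer default while the responder is only intermittently available.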
Thanks Valentin! I also made some edits on wikitech: https://wikitech.wikimedia.org/w/index.php?title=HTTPS%2FUnified_Certificates&type=revision&diff=1860120&oldid=1773414