certspotter failures on alert1001
Closed, Resolved, Public

Description

I noticed today that certspotter.service fails regularly on alert1001 (see below). The failures are a mix of 429 and 500 responses from https://yeti2023.ct.digicert.com. For the 500s there's probably not much we can do (maybe retry more before failing?), and for the 429s maybe there is a mechanism to ask for more quota? What do you think, @ssingh?

-- Logs begin at Wed 2022-09-28 11:52:17 UTC, end at Thu 2022-09-29 09:22:13 UTC. --
Sep 29 00:07:26 alert1001 systemd[1]: certspotter.service: Succeeded.
Sep 29 00:37:44 alert1001 systemd[1]: Started Run certspotter periodically to monitor for issuance of certificates.
Sep 29 01:01:43 alert1001 certspotter[8987]: 2022/09/29 00:54:50 https://yeti2023.ct.digicert.com/log/: Problem fetching entries 105945412 to 105946406 from log: GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 01:01:43 alert1001 certspotter[8987]: 2022/09/29 00:54:50 https://yeti2023.ct.digicert.com/log/: Error scanning log (if this error persists, it should be construed as misbehavior by the log): GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 01:01:43 alert1001 systemd[1]: certspotter.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 01:01:43 alert1001 systemd[1]: certspotter.service: Failed with result 'exit-code'.
Sep 29 01:32:02 alert1001 systemd[1]: Started Run certspotter periodically to monitor for issuance of certificates.
Sep 29 01:57:41 alert1001 certspotter[31997]: 2022/09/29 01:49:08 https://yeti2023.ct.digicert.com/log/: Problem fetching entries 105945412 to 105946406 from log: GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 01:57:41 alert1001 certspotter[31997]: 2022/09/29 01:49:08 https://yeti2023.ct.digicert.com/log/: Error scanning log (if this error persists, it should be construed as misbehavior by the log): GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 01:57:41 alert1001 systemd[1]: certspotter.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 01:57:41 alert1001 systemd[1]: certspotter.service: Failed with result 'exit-code'.
Sep 29 01:59:25 alert1001 systemd[1]: Started Run certspotter periodically to monitor for issuance of certificates.
Sep 29 02:16:32 alert1001 certspotter[9407]: 2022/09/29 02:16:32 https://yeti2023.ct.digicert.com/log/: Problem fetching entries 105945412 to 105946406 from log: GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 02:16:32 alert1001 certspotter[9407]: 2022/09/29 02:16:32 https://yeti2023.ct.digicert.com/log/: Error scanning log (if this error persists, it should be construed as misbehavior by the log): GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 02:16:32 alert1001 systemd[1]: certspotter.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 02:16:32 alert1001 systemd[1]: certspotter.service: Failed with result 'exit-code'.
Sep 29 02:28:41 alert1001 systemd[1]: Started Run certspotter periodically to monitor for issuance of certificates.
Sep 29 02:45:48 alert1001 certspotter[2012]: 2022/09/29 02:45:48 https://yeti2023.ct.digicert.com/log/: Problem fetching entries 105945412 to 105946406 from log: GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 429 Too Many Requests (<html>
Sep 29 02:45:48 alert1001 certspotter[2012]: <head><title>429 Too Many Requests</title></head>
Sep 29 02:45:48 alert1001 certspotter[2012]: <body>
Sep 29 02:45:48 alert1001 certspotter[2012]: <center><h1>429 Too Many Requests</h1></center>
Sep 29 02:45:48 alert1001 certspotter[2012]: <hr><center>nginx</center>
Sep 29 02:45:48 alert1001 certspotter[2012]: </body>
Sep 29 02:45:48 alert1001 certspotter[2012]: </html>
Sep 29 02:45:48 alert1001 certspotter[2012]: )
Sep 29 02:45:48 alert1001 certspotter[2012]: 2022/09/29 02:45:48 https://yeti2023.ct.digicert.com/log/: Error scanning log (if this error persists, it should be construed as misbehavior by the log): GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 429 Too Many Requests (<html>
Sep 29 02:45:48 alert1001 certspotter[2012]: <head><title>429 Too Many Requests</title></head>
Sep 29 02:45:48 alert1001 certspotter[2012]: <body>
Sep 29 02:45:48 alert1001 certspotter[2012]: <center><h1>429 Too Many Requests</h1></center>
Sep 29 02:45:48 alert1001 certspotter[2012]: <hr><center>nginx</center>
Sep 29 02:45:48 alert1001 certspotter[2012]: </body>
Sep 29 02:45:48 alert1001 certspotter[2012]: </html>
Sep 29 02:45:48 alert1001 certspotter[2012]: )
Sep 29 02:45:48 alert1001 systemd[1]: certspotter.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 02:45:48 alert1001 systemd[1]: certspotter.service: Failed with result 'exit-code'.
Sep 29 02:58:37 alert1001 systemd[1]: Started Run certspotter periodically to monitor for issuance of certificates.
Sep 29 03:16:14 alert1001 certspotter[10759]: 2022/09/29 03:16:14 https://yeti2023.ct.digicert.com/log/: Problem fetching entries 105945412 to 105946406 from log: GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 03:16:14 alert1001 certspotter[10759]: 2022/09/29 03:16:14 https://yeti2023.ct.digicert.com/log/: Error scanning log (if this error persists, it should be construed as misbehavior by the log): GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 03:16:14 alert1001 systemd[1]: certspotter.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 03:16:14 alert1001 systemd[1]: certspotter.service: Failed with result 'exit-code'.
Sep 29 03:29:34 alert1001 systemd[1]: Started Run certspotter periodically to monitor for issuance of certificates.
Sep 29 03:46:40 alert1001 certspotter[14283]: 2022/09/29 03:46:40 https://yeti2023.ct.digicert.com/log/: Problem fetching entries 105945412 to 105946406 from log: GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 429 Too Many Requests (<html>
Sep 29 03:46:40 alert1001 certspotter[14283]: <head><title>429 Too Many Requests</title></head>
Sep 29 03:46:40 alert1001 certspotter[14283]: <body>
Sep 29 03:46:40 alert1001 certspotter[14283]: <center><h1>429 Too Many Requests</h1></center>
Sep 29 03:46:40 alert1001 certspotter[14283]: <hr><center>nginx</center>
Sep 29 03:46:40 alert1001 certspotter[14283]: </body>
Sep 29 03:46:40 alert1001 certspotter[14283]: </html>
Sep 29 03:46:40 alert1001 certspotter[14283]: )
Sep 29 03:46:40 alert1001 certspotter[14283]: 2022/09/29 03:46:40 https://yeti2023.ct.digicert.com/log/: Error scanning log (if this error persists, it should be construed as misbehavior by the log): GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 429 Too Many Requests (<html>
Sep 29 03:46:40 alert1001 certspotter[14283]: <head><title>429 Too Many Requests</title></head>
Sep 29 03:46:40 alert1001 certspotter[14283]: <body>
Sep 29 03:46:40 alert1001 certspotter[14283]: <center><h1>429 Too Many Requests</h1></center>
Sep 29 03:46:40 alert1001 certspotter[14283]: <hr><center>nginx</center>
Sep 29 03:46:40 alert1001 certspotter[14283]: </body>
Sep 29 03:46:40 alert1001 certspotter[14283]: </html>
Sep 29 03:46:40 alert1001 certspotter[14283]: )
Sep 29 03:46:40 alert1001 systemd[1]: certspotter.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 03:46:40 alert1001 systemd[1]: certspotter.service: Failed with result 'exit-code'.
Sep 29 03:58:39 alert1001 systemd[1]: Started Run certspotter periodically to monitor for issuance of certificates.
Sep 29 04:15:45 alert1001 certspotter[32243]: 2022/09/29 04:15:45 https://yeti2023.ct.digicert.com/log/: Problem fetching entries 105945412 to 105946406 from log: GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 04:15:45 alert1001 certspotter[32243]: 2022/09/29 04:15:45 https://yeti2023.ct.digicert.com/log/: Error scanning log (if this error persists, it should be construed as misbehavior by the log): GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 04:15:45 alert1001 systemd[1]: certspotter.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 04:15:45 alert1001 systemd[1]: certspotter.service: Failed with result 'exit-code'.
Sep 29 04:28:33 alert1001 systemd[1]: Started Run certspotter periodically to monitor for issuance of certificates.
Sep 29 04:45:46 alert1001 certspotter[4751]: 2022/09/29 04:45:46 https://yeti2023.ct.digicert.com/log/: Problem fetching entries 105945412 to 105946406 from log: GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 04:45:46 alert1001 certspotter[4751]: 2022/09/29 04:45:46 https://yeti2023.ct.digicert.com/log/: Error scanning log (if this error persists, it should be construed as misbehavior by the log): GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 04:45:46 alert1001 systemd[1]: certspotter.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 04:45:46 alert1001 systemd[1]: certspotter.service: Failed with result 'exit-code'.
Sep 29 05:16:06 alert1001 systemd[1]: Started Run certspotter periodically to monitor for issuance of certificates.
Sep 29 05:37:14 alert1001 certspotter[31323]: 2022/09/29 05:33:27 https://yeti2023.ct.digicert.com/log/: Problem fetching entries 105945412 to 105946406 from log: GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 429 Too Many Requests (<html>
Sep 29 05:37:14 alert1001 certspotter[31323]: <head><title>429 Too Many Requests</title></head>
Sep 29 05:37:14 alert1001 certspotter[31323]: <body>
Sep 29 05:37:14 alert1001 certspotter[31323]: <center><h1>429 Too Many Requests</h1></center>
Sep 29 05:37:14 alert1001 certspotter[31323]: <hr><center>nginx</center>
Sep 29 05:37:14 alert1001 certspotter[31323]: </body>
Sep 29 05:37:14 alert1001 certspotter[31323]: </html>
Sep 29 05:37:14 alert1001 certspotter[31323]: )
Sep 29 05:37:14 alert1001 certspotter[31323]: 2022/09/29 05:33:27 https://yeti2023.ct.digicert.com/log/: Error scanning log (if this error persists, it should be construed as misbehavior by the log): GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 429 Too Many Requests (<html>
Sep 29 05:37:14 alert1001 certspotter[31323]: <head><title>429 Too Many Requests</title></head>
Sep 29 05:37:14 alert1001 certspotter[31323]: <body>
Sep 29 05:37:14 alert1001 certspotter[31323]: <center><h1>429 Too Many Requests</h1></center>
Sep 29 05:37:14 alert1001 certspotter[31323]: <hr><center>nginx</center>
Sep 29 05:37:14 alert1001 certspotter[31323]: </body>
Sep 29 05:37:14 alert1001 certspotter[31323]: </html>
Sep 29 05:37:14 alert1001 certspotter[31323]: )
Sep 29 05:37:14 alert1001 systemd[1]: certspotter.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 05:37:14 alert1001 systemd[1]: certspotter.service: Failed with result 'exit-code'.
Sep 29 05:59:11 alert1001 systemd[1]: Started Run certspotter periodically to monitor for issuance of certificates.
Sep 29 06:19:02 alert1001 certspotter[7405]: 2022/09/29 06:16:20 https://yeti2023.ct.digicert.com/log/: Problem fetching entries 105945412 to 105946406 from log: GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 06:19:02 alert1001 certspotter[7405]: 2022/09/29 06:16:20 https://yeti2023.ct.digicert.com/log/: Error scanning log (if this error persists, it should be construed as misbehavior by the log): GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 06:19:02 alert1001 systemd[1]: certspotter.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 06:19:02 alert1001 systemd[1]: certspotter.service: Failed with result 'exit-code'.
Sep 29 06:29:08 alert1001 systemd[1]: Started Run certspotter periodically to monitor for issuance of certificates.
Sep 29 06:46:14 alert1001 certspotter[4819]: 2022/09/29 06:46:14 https://yeti2023.ct.digicert.com/log/: Problem fetching entries 105945412 to 105946406 from log: GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 429 Too Many Requests (<html>
Sep 29 06:46:14 alert1001 certspotter[4819]: <head><title>429 Too Many Requests</title></head>
Sep 29 06:46:14 alert1001 certspotter[4819]: <body>
Sep 29 06:46:14 alert1001 certspotter[4819]: <center><h1>429 Too Many Requests</h1></center>
Sep 29 06:46:14 alert1001 certspotter[4819]: <hr><center>nginx</center>
Sep 29 06:46:14 alert1001 certspotter[4819]: </body>
Sep 29 06:46:14 alert1001 certspotter[4819]: </html>
Sep 29 06:46:14 alert1001 certspotter[4819]: )
Sep 29 06:46:14 alert1001 certspotter[4819]: 2022/09/29 06:46:14 https://yeti2023.ct.digicert.com/log/: Error scanning log (if this error persists, it should be construed as misbehavior by the log): GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 429 Too Many Requests (<html>
Sep 29 06:46:14 alert1001 certspotter[4819]: <head><title>429 Too Many Requests</title></head>
Sep 29 06:46:14 alert1001 certspotter[4819]: <body>
Sep 29 06:46:14 alert1001 certspotter[4819]: <center><h1>429 Too Many Requests</h1></center>
Sep 29 06:46:14 alert1001 certspotter[4819]: <hr><center>nginx</center>
Sep 29 06:46:14 alert1001 certspotter[4819]: </body>
Sep 29 06:46:14 alert1001 certspotter[4819]: </html>
Sep 29 06:46:14 alert1001 certspotter[4819]: )
Sep 29 06:46:14 alert1001 systemd[1]: certspotter.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 06:46:14 alert1001 systemd[1]: certspotter.service: Failed with result 'exit-code'.
Sep 29 06:59:29 alert1001 systemd[1]: Started Run certspotter periodically to monitor for issuance of certificates.
Sep 29 07:16:35 alert1001 certspotter[27318]: 2022/09/29 07:16:35 https://yeti2023.ct.digicert.com/log/: Problem fetching entries 105945412 to 105946406 from log: GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 07:16:35 alert1001 certspotter[27318]: 2022/09/29 07:16:35 https://yeti2023.ct.digicert.com/log/: Error scanning log (if this error persists, it should be construed as misbehavior by the log): GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 07:16:35 alert1001 systemd[1]: certspotter.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 07:16:35 alert1001 systemd[1]: certspotter.service: Failed with result 'exit-code'.
Sep 29 07:28:54 alert1001 systemd[1]: Started Run certspotter periodically to monitor for issuance of certificates.
Sep 29 07:46:00 alert1001 certspotter[21413]: 2022/09/29 07:46:00 https://yeti2023.ct.digicert.com/log/: Problem fetching entries 105945412 to 105946406 from log: GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 429 Too Many Requests (<html>
Sep 29 07:46:00 alert1001 certspotter[21413]: <head><title>429 Too Many Requests</title></head>
Sep 29 07:46:00 alert1001 certspotter[21413]: <body>
Sep 29 07:46:00 alert1001 certspotter[21413]: <center><h1>429 Too Many Requests</h1></center>
Sep 29 07:46:00 alert1001 certspotter[21413]: <hr><center>nginx</center>
Sep 29 07:46:00 alert1001 certspotter[21413]: </body>
Sep 29 07:46:00 alert1001 certspotter[21413]: </html>
Sep 29 07:46:00 alert1001 certspotter[21413]: )
Sep 29 07:46:00 alert1001 certspotter[21413]: 2022/09/29 07:46:00 https://yeti2023.ct.digicert.com/log/: Error scanning log (if this error persists, it should be construed as misbehavior by the log): GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 429 Too Many Requests (<html>
Sep 29 07:46:00 alert1001 certspotter[21413]: <head><title>429 Too Many Requests</title></head>
Sep 29 07:46:00 alert1001 certspotter[21413]: <body>
Sep 29 07:46:00 alert1001 certspotter[21413]: <center><h1>429 Too Many Requests</h1></center>
Sep 29 07:46:00 alert1001 certspotter[21413]: <hr><center>nginx</center>
Sep 29 07:46:00 alert1001 certspotter[21413]: </body>
Sep 29 07:46:00 alert1001 certspotter[21413]: </html>
Sep 29 07:46:00 alert1001 certspotter[21413]: )
Sep 29 07:46:00 alert1001 systemd[1]: certspotter.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 07:46:00 alert1001 systemd[1]: certspotter.service: Failed with result 'exit-code'.
Sep 29 07:58:44 alert1001 systemd[1]: Started Run certspotter periodically to monitor for issuance of certificates.
Sep 29 08:15:51 alert1001 certspotter[25977]: 2022/09/29 08:15:51 https://yeti2023.ct.digicert.com/log/: Problem fetching entries 105945412 to 105946406 from log: GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 429 Too Many Requests (<html>
Sep 29 08:15:51 alert1001 certspotter[25977]: <head><title>429 Too Many Requests</title></head>
Sep 29 08:15:51 alert1001 certspotter[25977]: <body>
Sep 29 08:15:51 alert1001 certspotter[25977]: <center><h1>429 Too Many Requests</h1></center>
Sep 29 08:15:51 alert1001 certspotter[25977]: <hr><center>nginx</center>
Sep 29 08:15:51 alert1001 certspotter[25977]: </body>
Sep 29 08:15:51 alert1001 certspotter[25977]: </html>
Sep 29 08:15:51 alert1001 certspotter[25977]: )
Sep 29 08:15:51 alert1001 certspotter[25977]: 2022/09/29 08:15:51 https://yeti2023.ct.digicert.com/log/: Error scanning log (if this error persists, it should be construed as misbehavior by the log): GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 429 Too Many Requests (<html>
Sep 29 08:15:51 alert1001 certspotter[25977]: <head><title>429 Too Many Requests</title></head>
Sep 29 08:15:51 alert1001 certspotter[25977]: <body>
Sep 29 08:15:51 alert1001 certspotter[25977]: <center><h1>429 Too Many Requests</h1></center>
Sep 29 08:15:51 alert1001 certspotter[25977]: <hr><center>nginx</center>
Sep 29 08:15:51 alert1001 certspotter[25977]: </body>
Sep 29 08:15:51 alert1001 certspotter[25977]: </html>
Sep 29 08:15:51 alert1001 certspotter[25977]: )
Sep 29 08:15:51 alert1001 systemd[1]: certspotter.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 08:15:51 alert1001 systemd[1]: certspotter.service: Failed with result 'exit-code'.
Sep 29 08:29:08 alert1001 systemd[1]: Started Run certspotter periodically to monitor for issuance of certificates.
Sep 29 08:47:08 alert1001 certspotter[650]: 4d849c8866969504c3b4b501ca1485feac0ac5dfcd2237f07c8833a9a8b1a66c:
Sep 29 08:47:08 alert1001 certspotter[650]: DNS Name = idp-test.wikimedia.org
Sep 29 08:47:08 alert1001 certspotter[650]: DNS Name = idp-test1002.wikimedia.org
Sep 29 08:47:08 alert1001 certspotter[650]: DNS Name = idp-test2002.wikimedia.org
Sep 29 08:47:08 alert1001 certspotter[650]: Pubkey = 5c287e2b8f0eb3afe28a725b213ded3f280c8ec988e87fbf8100c2e2945cdd5b
Sep 29 08:47:08 alert1001 certspotter[650]: Issuer = C=US, O=Let's Encrypt, CN=R3
Sep 29 08:47:08 alert1001 certspotter[650]: Not Before = 2022-09-29 07:20:37 +0000 UTC
Sep 29 08:47:08 alert1001 certspotter[650]: Not After = 2022-12-28 07:20:36 +0000 UTC
Sep 29 08:47:08 alert1001 certspotter[650]: Log Entry = 483373712 @ https://oak.ct.letsencrypt.org/2022/ (Pre-certificate)
Sep 29 08:47:08 alert1001 certspotter[650]: crt.sh = https://crt.sh/?sha256=4d849c8866969504c3b4b501ca1485feac0ac5dfcd2237f07c8833a9a8b1a66c
Sep 29 08:47:08 alert1001 certspotter[650]: Filename = /var/lib/certspotter/state/certs/4d/4d849c8866969504c3b4b501ca1485feac0ac5dfcd2237f07c8833a9a8b1a66c.precert.pem
Sep 29 08:47:08 alert1001 certspotter[650]: 6f952be28881df4df7647f719e44fee910efa4d0ce3e24d40da7ddb391b10afc:
Sep 29 08:47:08 alert1001 certspotter[650]: DNS Name = idp-test.wikimedia.org
Sep 29 08:47:08 alert1001 certspotter[650]: DNS Name = idp-test1002.wikimedia.org
Sep 29 08:47:08 alert1001 certspotter[650]: DNS Name = idp-test2002.wikimedia.org
Sep 29 08:47:08 alert1001 certspotter[650]: Pubkey = 5c287e2b8f0eb3afe28a725b213ded3f280c8ec988e87fbf8100c2e2945cdd5b
Sep 29 08:47:08 alert1001 certspotter[650]: Issuer = C=US, O=Let's Encrypt, CN=R3
Sep 29 08:47:08 alert1001 certspotter[650]: Not Before = 2022-09-29 07:20:37 +0000 UTC
Sep 29 08:47:08 alert1001 certspotter[650]: Not After = 2022-12-28 07:20:36 +0000 UTC
Sep 29 08:47:08 alert1001 certspotter[650]: Log Entry = 1581015614 @ https://ct.googleapis.com/logs/argon2022/ (Certificate)
Sep 29 08:47:08 alert1001 certspotter[650]: crt.sh = https://crt.sh/?sha256=6f952be28881df4df7647f719e44fee910efa4d0ce3e24d40da7ddb391b10afc
Sep 29 08:47:08 alert1001 certspotter[650]: Filename = /var/lib/certspotter/state/certs/6f/6f952be28881df4df7647f719e44fee910efa4d0ce3e24d40da7ddb391b10afc.cert.pem
Sep 29 08:47:08 alert1001 certspotter[650]: 0620f18e8a9b26213a6a73fdbabd7636f8d6c85d2f83f928337d7f2233f678ad:
Sep 29 08:47:08 alert1001 certspotter[650]: DNS Name = idp-test.wikimedia.org
Sep 29 08:47:08 alert1001 certspotter[650]: DNS Name = idp-test1002.wikimedia.org
Sep 29 08:47:08 alert1001 certspotter[650]: DNS Name = idp-test2002.wikimedia.org
Sep 29 08:47:08 alert1001 certspotter[650]: Pubkey = 1d05561ed39064b4873fa1ef0b745d24989acf9a8d2ba4d93c58b38243c2c432
Sep 29 08:47:08 alert1001 certspotter[650]: Issuer = C=US, O=Let's Encrypt, CN=R3
Sep 29 08:47:08 alert1001 certspotter[650]: Not Before = 2022-09-29 07:20:29 +0000 UTC
Sep 29 08:47:08 alert1001 certspotter[650]: Not After = 2022-12-28 07:20:28 +0000 UTC
Sep 29 08:47:08 alert1001 certspotter[650]: Log Entry = 1581016930 @ https://ct.googleapis.com/logs/argon2022/ (Pre-certificate)
Sep 29 08:47:08 alert1001 certspotter[650]: crt.sh = https://crt.sh/?sha256=0620f18e8a9b26213a6a73fdbabd7636f8d6c85d2f83f928337d7f2233f678ad
Sep 29 08:47:08 alert1001 certspotter[650]: Filename = /var/lib/certspotter/state/certs/06/0620f18e8a9b26213a6a73fdbabd7636f8d6c85d2f83f928337d7f2233f678ad.precert.pem
Sep 29 08:47:08 alert1001 certspotter[650]: bf97e995ec5b87070c01ed9d35b74c08a512f85d898f816188c4dfd82e2f2522:
Sep 29 08:47:08 alert1001 certspotter[650]: DNS Name = idp-test.wikimedia.org
Sep 29 08:47:08 alert1001 certspotter[650]: DNS Name = idp-test1002.wikimedia.org
Sep 29 08:47:08 alert1001 certspotter[650]: DNS Name = idp-test2002.wikimedia.org
Sep 29 08:47:08 alert1001 certspotter[650]: Pubkey = 1d05561ed39064b4873fa1ef0b745d24989acf9a8d2ba4d93c58b38243c2c432
Sep 29 08:47:08 alert1001 certspotter[650]: Issuer = C=US, O=Let's Encrypt, CN=R3
Sep 29 08:47:08 alert1001 certspotter[650]: Not Before = 2022-09-29 07:20:29 +0000 UTC
Sep 29 08:47:08 alert1001 certspotter[650]: Not After = 2022-12-28 07:20:28 +0000 UTC
Sep 29 08:47:08 alert1001 certspotter[650]: Log Entry = 1581017687 @ https://ct.googleapis.com/logs/argon2022/ (Certificate)
Sep 29 08:47:08 alert1001 certspotter[650]: crt.sh = https://crt.sh/?sha256=bf97e995ec5b87070c01ed9d35b74c08a512f85d898f816188c4dfd82e2f2522
Sep 29 08:47:08 alert1001 certspotter[650]: Filename = /var/lib/certspotter/state/certs/bf/bf97e995ec5b87070c01ed9d35b74c08a512f85d898f816188c4dfd82e2f2522.cert.pem
Sep 29 08:47:08 alert1001 certspotter[650]: 2022/09/29 08:46:17 https://yeti2023.ct.digicert.com/log/: Problem fetching entries 105945412 to 105946406 from log: GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 08:47:08 alert1001 certspotter[650]: 2022/09/29 08:46:17 https://yeti2023.ct.digicert.com/log/: Error scanning log (if this error persists, it should be construed as misbehavior by the log): GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 08:47:08 alert1001 systemd[1]: certspotter.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 08:47:08 alert1001 systemd[1]: certspotter.service: Failed with result 'exit-code'.
Sep 29 08:58:51 alert1001 systemd[1]: Started Run certspotter periodically to monitor for issuance of certificates.
Sep 29 09:15:56 alert1001 certspotter[21750]: 2022/09/29 09:15:56 https://yeti2023.ct.digicert.com/log/: Problem fetching entries 105945412 to 105946406 from log: GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 09:15:56 alert1001 certspotter[21750]: 2022/09/29 09:15:56 https://yeti2023.ct.digicert.com/log/: Error scanning log (if this error persists, it should be construed as misbehavior by the log): GET https://yeti2023.ct.digicert.com/log/ct/v1/get-entries?start=105945412&end=105946406: 500 INTERNAL SERVER ERROR ()
Sep 29 09:15:56 alert1001 systemd[1]: certspotter.service: Main process exited, code=exited, status=1/FAILURE
Sep 29 09:15:56 alert1001 systemd[1]: certspotter.service: Failed with result 'exit-code'.
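
To see how the failures split between 429 and 500 without reading the journal by eye, the excerpt can be tallied with a short script. This is only a sketch: the regular expression is inferred from the "Problem fetching entries" lines above, and `tally_fetch_errors` is a hypothetical helper name, not part of certspotter.

```python
import re
from collections import Counter

# Pattern inferred from the journal excerpt above (an assumption, not a
# documented certspotter format). Only the "Problem fetching entries" line is
# matched, because each failure is logged a second time as "Error scanning log".
FETCH_ERROR = re.compile(
    r"(?P<log>https://\S+?): Problem fetching entries \d+ to \d+ from log: "
    r"GET \S+: (?P<status>\d{3}) "
)


def tally_fetch_errors(lines):
    """Count fetch failures per (CT log URL, HTTP status) pair."""
    counts = Counter()
    for line in lines:
        match = FETCH_ERROR.search(line)
        if match:
            counts[(match.group("log"), int(match.group("status")))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Example: journalctl -u certspotter.service | python3 tally.py
    for (log_url, status), n in sorted(tally_fetch_errors(sys.stdin).items()):
        print(f"{log_url} {status}: {n}")
```

Because the pattern uses `search`, it does not matter whether the journal prefix (timestamp, host, unit) is present on each line.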

Event Timeline

Thanks very much for creating this task @fgiunchedi! We were recently discussing certspotter in the team as well and the various issues with it. On one hand, it's an important service that we need to keep running. On the other, there are some issues that actually prevent it from being useful:

  • The 500s are from CT logs being down or removed entirely (see 04ee08339). For these, we can't do much other than retry and fail silently, because the log might have been shut down (which we have to discover manually) or be temporarily down (for anywhere from hours to days).
  • The 429s are somewhat on us (certspotter): it fetches the logs in parallel, and even though we run it only once every thirty minutes, there is a lot of data to fetch and many repeated, concurrent requests. I remember our previous discussion on fixing certspotter so that we can limit concurrency (see 771610), and that's one of the things we are considering.
  • Another big issue is the false positives. Right now, we are unable to exclude certificates that we generate (acme-chief, etc.) from certspotter's output because it fetches any and all certificates for the domains we provide it. I think this is the biggest issue because there is real alert fatigue right now.
    • Theoretically, certspotter should only alert if a fraudulent certificate was issued by someone other than us, because in that case an email alert means things have gone wrong and we need to act. Right now we are getting alerts for all issued certificates and I am not even sure who is going over them carefully. (I only check the issuer and DNS name, but that doesn't scale.)
  • I don't think we can ask the CT logs for more quota, but it's worth trying. Judging by the certspotter issue tracker, some CT logs seem willing to increase their rate limits, but I think the fix should come from certspotter itself.
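
As a rough illustration of the two client-side mitigations mentioned above (retrying transient 429/5xx responses and capping the number of concurrent requests), here is a minimal Python sketch. It is not certspotter's code (certspotter is a separate program) and not the change proposed in 771610; `fetch_with_backoff`, `fetch_all`, the retryable status set, and all default values are assumptions chosen for the example.

```python
import time
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Statuses treated as transient in this sketch (an assumption, tune as needed).
RETRYABLE = {429, 500, 502, 503, 504}


def fetch_with_backoff(url, opener=urllib.request.urlopen, attempts=4,
                       base_delay=1.0, sleep=time.sleep):
    """GET url, retrying transient HTTP errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            with opener(url) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            is_last = attempt == attempts - 1
            if err.code not in RETRYABLE or is_last:
                raise
            # Prefer a numeric Retry-After header if the server sent one,
            # otherwise wait base_delay, 2*base_delay, 4*base_delay, ...
            retry_after = err.headers.get("Retry-After") if err.headers else None
            if retry_after and retry_after.isdigit():
                delay = float(retry_after)
            else:
                delay = base_delay * 2 ** attempt
            sleep(delay)


def fetch_all(urls, max_workers=2, **kwargs):
    """Fetch urls with at most max_workers requests in flight at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda u: fetch_with_backoff(u, **kwargs), urls))
```

The `opener` and `sleep` parameters exist only so the behaviour can be exercised without network access or real waiting. A small `max_workers` trades scan time for fewer simultaneous requests against any single log.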

So to summarize, a short-term fix can be to delete the misbehaving CT log (https://yeti2023.ct.digicert.com/log/). But a long-term fix needs to include all of the above for this service to be really effective, and now I am wondering if we should just pause it completely, given all these issues!
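
On the false-positive point specifically, one possible direction is to post-process certspotter's report and drop entries whose `Pubkey` value matches a key we generated ourselves. The sketch below assumes the report layout seen in the journal excerpt (a 64-hex-digit header line followed by `Key = value` fields, with the journal prefix already stripped) and assumes a set of known `Pubkey` values can be exported from our own issuance tooling; `parse_reports` and `unexpected` are hypothetical helpers, not existing certspotter features.

```python
import re

# Layout inferred from the report lines in the journal excerpt (an assumption).
HEADER = re.compile(r"^(?P<fp>[0-9a-f]{64}):$")
FIELD = re.compile(r"^(?P<key>[A-Za-z. ]+?) = (?P<value>.*)$")


def parse_reports(lines):
    """Group certspotter report lines into one dict per certificate."""
    reports, current = [], None
    for raw in lines:
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = {"fingerprint": header.group("fp"), "DNS Name": []}
            reports.append(current)
            continue
        field = FIELD.match(line)
        if field and current is not None:
            key, value = field.group("key"), field.group("value")
            if key == "DNS Name":
                # A certificate can list several names, so collect them all.
                current["DNS Name"].append(value)
            else:
                current[key] = value
    return reports


def unexpected(reports, known_pubkeys):
    """Keep only reports whose Pubkey is not in the set of keys we generated."""
    return [r for r in reports if r.get("Pubkey") not in known_pubkeys]
```

With a filter like this, an alert would only be raised for certificates whose key we do not recognise. Whether matching on `Pubkey` alone is a sufficient check is a policy question this sketch does not settle.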

(Adding @BCornwall to the ticket as he and I might work on it from the Traffic side at some stage.)

Change 836851 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] certspotter: remove rate-limiting CT log

https://gerrit.wikimedia.org/r/836851

Change 836851 merged by Ssingh:

[operations/puppet@production] certspotter: remove rate-limiting CT log

https://gerrit.wikimedia.org/r/836851

Thank you @ssingh for the extensive explanation and context (and the quick fix/band-aid), it all makes sense to me! My two cents: I peered at certspotter's issues and commits and the project didn't seem very active, at least lately. It is understandable to re-evaluate it, especially because of the alert fatigue you pointed out.

Change 866499 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] certspotter: temporarily disable certspotter (and the systemd timer)

https://gerrit.wikimedia.org/r/866499

Change 866499 merged by Ssingh:

[operations/puppet@production] certspotter: temporarily disable certspotter (and the systemd timer)

https://gerrit.wikimedia.org/r/866499

herron claimed this task.
herron subscribed.

I'm reviewing the backlog today (almost exactly one year since the last update!) and I think we're OK to close this, since the certspotter failures were addressed and we can re-evaluate if/when ready in a new task. Please reopen if I'm wrong about that.