Update certspotter
Open, Needs Triage · Public

Description

In January 2017 we started using certspotter (T155807: Monitor Certificate Transparency (CT) logs) to monitor CT logs for things being issued for our domains.
At some point, CT servers started failing a lot and this generated cronspam (T162327: certspotter on einsteinium has issues talking to external, T159137: certspotter: Error retrieving STH from log).
Eventually the cron was disabled in https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/428367/
Unfortunately the list of CT servers was hardcoded. Newer certspotter versions have apparently fixed this, so we should update to a new version that doesn't have the issue and re-enable.
(opened per discussion in #wikimedia-traffic earlier)

Event Timeline

Restricted Application added a subscriber: Aklapper.

Adding the Debian maintainer :-) This seems fixed in 0.9-1 so updating stretch-backports to 0.9 could fix this.

This is now done :)

So now we just pin the certspotter package to release a=stretch-backports?

Actually, it looks like it wasn't in stretch, so stretch-backports has the highest priority anyway. So the host just needs the package upgraded?

einsteinium and tegmen still run jessie and I didn't build a version for jessie-wikimedia. I believe they're being migrated to stretch as we speak, so maybe we should just wait for that.

The Icinga servers in production are now running 0.9-1~bpo9+1, but the cron job still needs to be reinstated.

Change 475453 had a related patch set uploaded (by Alex Monk; owner: Alex Monk):
[operations/puppet@production] Revert "certspotter: temporarily disable cron job"

https://gerrit.wikimedia.org/r/475453

What is the status of this nowadays? I ran across it in a different matter while looking for absented crons and found the TODO and link in code that points over here and to the pending patch above. It's been some time since November 2018. Can we just re-enable it or is there more to it?

Change 475453 had a related patch set uploaded (by Dzahn; owner: Alex Monk):
[operations/puppet@production] Revert "certspotter: temporarily disable cron job"

https://gerrit.wikimedia.org/r/475453

Change 475453 merged by Dzahn:
[operations/puppet@production] Revert "certspotter: temporarily disable cron job"

https://gerrit.wikimedia.org/r/475453

on icinga1001, alert1001: crons reactivated:

Notice: /Stage[main]/Certspotter/Cron[certspotter]/ensure: created
Dzahn claimed this task.

What I know here is:

  • meanwhile icinga servers are on buster
  • reviewers approved reverting the "disable cron job"
  • I merged it, crons are enabled again and announced it on IRC
  • on icinga1001 there is certspotter 0.9-1~bpo9+1
  • this ticket is from 2018

So based on that I am now assuming we can call it resolved.

After seeing quite a bit of cronspam from this, I reverted the change again and am reopening this.

It seems the list of servers needs to be updated. It contains "misbehaving" and non-existent servers, e.g.:

Get https://ct.ws.symantec.com/ct/v1/get-sth: dial tcp: lookup ct.ws.symantec.com on 10.3.0.1:53: server misbehaving

Get https://ctlog-gen2.api.venafi.com/ct/v1/get-sth: dial tcp: lookup ctlog-gen2.api.venafi.com on 10.3.0.1:53: server misbehaving

Get https://ct2.digicert-ct.com/log/ct/v1/get-sth: dial tcp: lookup ct2.digicert-ct.com on 10.3.0.1:53: no such host

See details in today's root mail from icinga1001 (Nov 18th 2020).
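
To make it easier to see which log hosts are affected, the cronspam could be grouped by failure reason. The following is only a minimal sketch: the regular expression is inferred from the three sample lines quoted above, not from certspotter's source, and `group_failures` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Pattern inferred from the three error lines quoted in this task only;
# other certspotter error formats will simply not match.
ERROR_RE = re.compile(
    r"Get (?P<url>https://(?P<host>[^/\s]+)\S*): "
    r"dial tcp: lookup \S+ on \S+: (?P<reason>.+)$"
)


def group_failures(lines):
    """Map each DNS failure reason to the sorted list of log hosts that hit it."""
    by_reason = defaultdict(set)
    for line in lines:
        match = ERROR_RE.search(line.strip())
        if match:
            by_reason[match.group("reason")].add(match.group("host"))
    return {reason: sorted(hosts) for reason, hosts in by_reason.items()}


SAMPLE = [
    "Get https://ct.ws.symantec.com/ct/v1/get-sth: dial tcp: lookup "
    "ct.ws.symantec.com on 10.3.0.1:53: server misbehaving",
    "Get https://ctlog-gen2.api.venafi.com/ct/v1/get-sth: dial tcp: lookup "
    "ctlog-gen2.api.venafi.com on 10.3.0.1:53: server misbehaving",
    "Get https://ct2.digicert-ct.com/log/ct/v1/get-sth: dial tcp: lookup "
    "ct2.digicert-ct.com on 10.3.0.1:53: no such host",
]

if __name__ == "__main__":
    for reason, hosts in group_failures(SAMPLE).items():
        print(f"{reason}: {', '.join(hosts)}")
```

Hosts listed under "no such host" are likely candidates for removal from the list, while "server misbehaving" may also point at a resolver problem, so those would need a manual check.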

Dzahn removed Dzahn as the assignee of this task. Nov 19 2020, 12:00 AM
BBlack subscribed.

The swap of Traffic for Traffic-Icebox in this ticket's set of tags was based on a bulk action for all such tickets that haven't been updated in 6 months or more. This does not imply any human judgement about the validity or importance of the task, and is simply the first step in a larger task cleanup effort. Further manual triage and/or requests for updates will happen this month for all such tickets. For more detail, have a look at the extended explanation on the main page of Traffic-Icebox. Thank you!

Change 768065 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] certspotter: update package and replace cron with systemd timer

https://gerrit.wikimedia.org/r/768065

Mentioned in SAL (#wikimedia-operations) [2022-03-10T15:33:23Z] <sukhe> upload certspotter 0.10-1wm1 to apt.wm.o - T204993

Change 768065 merged by Ssingh:

[operations/puppet@production] certspotter: update package and replace cron with systemd timer

https://gerrit.wikimedia.org/r/768065

Change 776217 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] certspotter: switch to a local CT logs list

https://gerrit.wikimedia.org/r/776217

Change 776217 merged by Ssingh:

[operations/puppet@production] certspotter: switch to a local CT logs list

https://gerrit.wikimedia.org/r/776217
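
The contents of the local CT logs list are not shown in this task. As a rough illustration of how such a list could be kept free of retired logs, here is a sketch that assumes the file follows an `operators` / `logs` / `state` JSON layout similar to the publicly distributed CT log lists. The layout, the helper name `usable_log_urls`, and the example.org URLs are all assumptions for illustration, not taken from the Puppet change.

```python
import json


def usable_log_urls(log_list_json):
    """Return the URLs of logs whose state is marked 'usable'.

    Assumes a layout of {"operators": [{"logs": [{"url": ..., "state": {...}}]}]};
    logs with any other state (or no state at all) are skipped.
    """
    data = json.loads(log_list_json)
    urls = []
    for operator in data.get("operators", []):
        for log in operator.get("logs", []):
            if "usable" in log.get("state", {}):
                urls.append(log["url"])
    return urls


# Hypothetical input; the hosts below are placeholders, not real CT logs.
EXAMPLE = json.dumps({
    "operators": [
        {
            "name": "Example Operator",
            "logs": [
                {"url": "https://ct-a.example.org/", "state": {"usable": {}}},
                {"url": "https://ct-b.example.org/", "state": {"retired": {}}},
            ],
        }
    ]
})

if __name__ == "__main__":
    print(usable_log_urls(EXAMPLE))
```

Filtering on state rather than hardcoding hosts would avoid the stale-server errors seen earlier, provided the local list itself is refreshed periodically.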

BCornwall raised the priority of this task from Medium to Needs Triage. Mar 30 2023, 8:58 PM
BCornwall removed a project: SRE.
ssingh subscribed.