syslog::centralserver: TLS cert only valid for centrallog1002 but centrallog1001 is checked
Closed, Resolved (Public)

Description

Noticed in logstash when looking at other monitoring that fails:

Within the last 2 days there were about 45k errors where monitoring attempts to check centrallog1001 but gets a TLS cert that is only valid for centrallog1002.

x509: certificate is valid for centrallog1002.eqiad.wmnet, not centrallog1001.eqiad.wmnet

So either the cert should have all the names or monitoring should stop checking the old host, I suppose.
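The two options can be illustrated with a small sketch. It assumes the check is done by a Go client, since the message above has the shape Go's crypto/x509 produces for a hostname mismatch. The helpers newCert and checkHost are hypothetical names made up for this illustration, and the certificate is a throwaway self-signed one, not the real centrallog cert.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newCert (hypothetical helper) builds a throwaway self-signed certificate
// whose subjectAltName DNS entries are exactly dnsNames.
func newCert(dnsNames []string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "illustration-only"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		DNSNames:     dnsNames,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// checkHost (hypothetical helper) reports whether a certificate carrying
// dnsNames would be accepted for host. It only checks the name, not the chain.
func checkHost(dnsNames []string, host string) error {
	cert, err := newCert(dnsNames)
	if err != nil {
		return err
	}
	return cert.VerifyHostname(host)
}

func main() {
	old := "centrallog1001.eqiad.wmnet"
	cur := "centrallog1002.eqiad.wmnet"

	// Situation in this task: cert names only the new host, the old host is checked.
	fmt.Println("only 1002 in cert, checking 1001:", checkHost([]string{cur}, old))

	// Option 1: the cert carries both names.
	fmt.Println("both names in cert, checking 1001:", checkHost([]string{old, cur}, old))

	// Option 2: monitoring only checks the host the cert was issued for.
	fmt.Println("only 1002 in cert, checking 1002:", checkHost([]string{cur}, cur))
}
```

Only the first call should return an error; the other two correspond to the two fixes suggested above.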

Event Timeline

I think you are correct in the sense that this should be fixed once centrallog1001 is decom, cc @andrea.denisse

Hi Daniel, Filippo is correct. Progress of the decommission is tracked on T328803.
The failover of centrallog1001 -> centrallog1002 is planned for next week. Meanwhile I'll look for a way to silence the alert.
Thanks.

@andrea.denisse Don't worry about silences. I was just looking specifically at a dashboard to see only failed attempts from blackbox::http because I was debugging my own alerts, and this stood out. I was not notified by anything. As long as it's known and work in progress, all is good, and you might as well merge it into your general decom ticket if you like.

Dzahn triaged this task as Low priority. Feb 23 2023, 8:57 PM