I presume this is because webperf2001 is currently the primary?
Description
Event Timeline
Confirmed:
dpifke@webperf1001:~$ curl -s http://localhost:9230/metrics | grep latest
# HELP webperf_latest_handled_time_seconds UNIX timestamp of most recent message
# TYPE webperf_latest_handled_time_seconds gauge
webperf_latest_handled_time_seconds{schema="SaveTiming"} 1598968983.279599
webperf_latest_handled_time_seconds{schema="QuickSurveysResponses"} 1598968992.488641
webperf_latest_handled_time_seconds{schema="FirstInputTiming"} 1598968994.442974
webperf_latest_handled_time_seconds{schema="QuickSurveyInitiation"} 1598968995.142616
webperf_latest_handled_time_seconds{schema="NavigationTiming"} 1598968995.179199
webperf_latest_handled_time_seconds{schema="PaintTiming"} 1598968995.19738
dpifke@webperf1001:~$ date -d '@1598968995.179199'
Tue Sep 1 14:03:15 UTC 2020
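For anyone who wants to repeat this check without eyeballing timestamps, here is a rough sketch that parses the same metric out of the exporter's text output and reports which schemas look stale. The function names and the 300-second threshold are hypothetical, chosen only for illustration; they are not taken from the actual alert definition.

```python
import re
import time

# Metric name as it appears in the exporter output above.
METRIC = "webperf_latest_handled_time_seconds"

# Matches lines such as:
#   webperf_latest_handled_time_seconds{schema="SaveTiming"} 1598968983.279599
LINE_RE = re.compile(
    r"^" + re.escape(METRIC) + r'\{schema="([^"]+)"\}\s+([0-9.eE+-]+)$'
)


def latest_handled(metrics_text):
    """Return {schema: unix_timestamp} parsed from the text-format metrics."""
    result = {}
    for line in metrics_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            result[match.group(1)] = float(match.group(2))
    return result


def stale_schemas(metrics_text, now=None, max_age=300.0):
    """List schemas whose latest message is older than max_age seconds.

    max_age=300.0 is an assumed threshold, not the real alert's value.
    """
    if now is None:
        now = time.time()
    timestamps = latest_handled(metrics_text)
    return sorted(s for s, ts in timestamps.items() if now - ts > max_age)
```

Feeding it the body of http://localhost:9230/metrics on a non-primary host should list every schema as stale, which matches the behaviour seen here.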
This is arguably working as intended, and the correct action here is simply to silence the alert(s) as part of the switchover. I need to think about whether there's a clean way to do so automatically.
It's not clear to me why this didn't trigger in codfw when the alert was first created.