api_appserver "Average latency exceeded" alert fired late, when latency was already declining
Open, Needs Triage, Public

Description

Towards the end of the incident on 2023-04-17, the following alert fired even though latency was already decreasing again:

<jinxer-wm> (MediaWikiLatencyExceeded) firing: Average latency high: eqiad api_appserver GET/200 - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook#Average_latency_exceeded - https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=eqiad%20prometheus/ops&var-cluster=api_appserver&var-method=GET - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiLatencyExceeded

Why did the alert fire this late? Was this a defect in alerting, a component that needs tuning, or something else?