This alert paged this morning at 6:58 UTC.
FIRING: ToolforgeWebHighErrorRate: High 5xx rate on Toolforge web services #page - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeWebHighErrorRate - https://grafana.wmcloud.org/d/toolforge-k8s-haproxy/infra-k8s-haproxy?var-frontend=k8s-ingress-https - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeWebHighErrorRate
[07:05:42] * dhinus paged ToolforgeWebHighErrorRate: High 5xx rate on Toolforge
[07:07:45] <dhinus> the metric peaked at 45% and is going down, now it's around 30%
[07:14:25] <dhinus> the grafana haproxy dashboard shows a spike in sessions, but it's well below the 2k limit
[07:29:55] <dhinus> interesting: the frontend shows a decrease in 2xx and a corresponding increase in 5xx, whereas the backend shows only a _decrease_ in 2xx responses, but not a corresponding increase in 5xx.
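
The frontend/backend discrepancy in the last message can be put into numbers. The following is only a rough sketch, assuming per-interval response counts have been read off the haproxy dashboard by hand. The function names and the sample counts are hypothetical and are not taken from this incident. If the frontend reports 5xx responses that the backends never returned, that would be consistent with errors produced before a backend reply exists, but the sketch does not establish a cause.

```python
def error_ratio(count_5xx: int, total: int) -> float:
    """Fraction of responses that were 5xx; 0.0 when there was no traffic."""
    return count_5xx / total if total else 0.0


def proxy_side_5xx(frontend_5xx: int, backend_5xx: int) -> int:
    """5xx seen at the frontend that no backend response accounts for."""
    return max(frontend_5xx - backend_5xx, 0)


# Hypothetical sample counts for one interval, NOT real incident data.
frontend = {"2xx": 550, "5xx": 450}
backend = {"2xx": 550, "5xx": 0}

ratio = error_ratio(frontend["5xx"], sum(frontend.values()))
unexplained = proxy_side_5xx(frontend["5xx"], backend["5xx"])
print(f"frontend 5xx ratio: {ratio:.0%}, not attributable to backends: {unexplained}")
```

A large "not attributable to backends" count would suggest looking at the proxy layer and at backend reachability rather than at individual tools.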





