
Track global and per-tool concurrent requests and 503 rate limiting responses from Toolforge CDN edge
Open, High · Public · Feature

Description

@taavi do we have any tracking that will let us see how often the concurrency limit is being tripped, broken out by tool? I think the tool-dashboard for geohack gets data from the Kubernetes cluster, which would not know when the layer above its ingress has returned a 503 response to the client.

Not directly, unfortunately. One could compare the general frontend and backend error rate metrics to see the number of requests rejected on the HAProxy layer, but that is not split per tool. I did check whether T343885: [promethus,haproxy] Move to haproxy internal metrics from haproxy_exporter would help with it, but unfortunately that doesn't seem to be the case.
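The frontend/backend comparison mentioned above could be sketched roughly as follows. This is only an illustration: the function name and the input counts are hypothetical, not actual Toolforge metric names, and it assumes that any 5xx response seen on the frontend but not on a backend was generated by HAProxy itself. That can include errors other than concurrency-limit rejections, so the result is at best an upper bound, and it is still a global number rather than a per-tool one.

```python
def estimate_edge_rejections(frontend_5xx: float, backend_5xx: float) -> float:
    """Estimate how many requests were rejected at the HAProxy layer.

    Both arguments are 5xx response counts over the same time window
    (hypothetical inputs; the real metric names are not given in this task).
    Responses counted on the frontend but not on any backend are assumed
    to have been generated by HAProxy itself.
    """
    # Clamp at zero: scrape timing can make the backend count briefly
    # exceed the frontend count.
    return max(frontend_5xx - backend_5xx, 0.0)


# Example with made-up numbers for a single time window.
rejected = estimate_edge_rejections(frontend_5xx=120.0, backend_5xx=20.0)
print(f"Estimated requests rejected at the edge: {rejected:.0f}")
```

Because the inputs are aggregated across all tools, this does not answer the per-tool question; it only shows the kind of global estimate that is possible today.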

Rate limiting at the HAProxy edge is a benefit for Toolforge generally, but without metrics on what "normal" traffic looks like and how much traffic we are turning away with concurrency limits, it is difficult to reason about changes to the limits and about new traffic overload controls.

Event Timeline

Restricted Application added a subscriber: Aklapper.
taavi triaged this task as High priority. Nov 12 2025, 3:03 PM
bd808 renamed this task from Track global and per-tool concurrent requests and 503 rate limitiing responses from Toolforge CDN edge to Track global and per-tool concurrent requests and 503 rate limiting responses from Toolforge CDN edge. Nov 20 2025, 11:07 PM