Steps to replicate the issue (include links if applicable):
What happens?:
| Iniquity | |
| Sep 12 2025, 7:10 PM |
I cannot reproduce. The tool's home page successfully loads and so does the example linked there.
It seems HAProxy has not logged anything since the 8th of September:
root@tools-k8s-haproxy-5:~# journalctl -f -u haproxy.service
Sep 08 08:31:42 tools-k8s-haproxy-5 haproxy[1720669]: [WARNING] (1720669) : Server k8s-api/tools-k8s-control-8.tools.eqiad1.wikimedia.cloud is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 2ms. 2 active and 0 backup servers left. 93 sessions active, 0 requeued, 0 remaining in queue.
Sep 08 08:32:23 tools-k8s-haproxy-5 haproxy[1720669]: [WARNING] (1720669) : Server k8s-api/tools-k8s-control-8.tools.eqiad1.wikimedia.cloud is UP, reason: Layer7 check passed, code: 200, check duration: 7ms. 3 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
Sep 08 08:33:04 tools-k8s-haproxy-5 haproxy[1720669]: [WARNING] (1720669) : Server k8s-api/tools-k8s-control-8.tools.eqiad1.wikimedia.cloud is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 1ms. 2 active and 0 backup servers left. 1 sessions active, 0 requeued, 0 remaining in queue.
Sep 08 08:33:37 tools-k8s-haproxy-5 haproxy[1720669]: [WARNING] (1720669) : Server k8s-api/tools-k8s-control-8.tools.eqiad1.wikimedia.cloud is UP, reason: Layer7 check passed, code: 200, check duration: 20ms. 3 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
Sep 08 08:34:37 tools-k8s-haproxy-5 haproxy[1720669]: [WARNING] (1720669) : Server k8s-api/tools-k8s-control-8.tools.eqiad1.wikimedia.cloud is UP. 3 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
Sep 08 08:39:43 tools-k8s-haproxy-5 haproxy[1720669]: [WARNING] (1720669) : Server k8s-api/tools-k8s-control-9.tools.eqiad1.wikimedia.cloud is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 1ms. 2 active and 0 backup servers left. 139 sessions active, 0 requeued, 0 remaining in queue.
Sep 08 08:40:29 tools-k8s-haproxy-5 haproxy[1720669]: [WARNING] (1720669) : Server k8s-api/tools-k8s-control-9.tools.eqiad1.wikimedia.cloud is UP, reason: Layer7 check passed, code: 200, check duration: 13ms. 3 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
Sep 08 08:41:19 tools-k8s-haproxy-5 haproxy[1720669]: [WARNING] (1720669) : Server k8s-api/tools-k8s-control-9.tools.eqiad1.wikimedia.cloud is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 1ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Sep 08 08:41:50 tools-k8s-haproxy-5 haproxy[1720669]: [WARNING] (1720669) : Server k8s-api/tools-k8s-control-9.tools.eqiad1.wikimedia.cloud is UP, reason: Layer7 check passed, code: 200, check duration: 14ms. 3 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
Sep 08 08:42:50 tools-k8s-haproxy-5 haproxy[1720669]: [WARNING] (1720669) : Server k8s-api/tools-k8s-control-9.tools.eqiad1.wikimedia.cloud is UP. 3 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
Just restarted it, and it seems to be coming up; I will let it start up for a bit.
The haproxy service logs on tools-k8s-haproxy-6 also stopped on Sep 8th; I just restarted haproxy there as well.
It seems like we are hitting the HAProxy session limit for ingresses:
Change #1187892 had a related patch set uploaded (by Majavah; author: Majavah):
[operations/puppet@production] P:toolforge::proxy: Limit in-flight connections per tool
I think it might be geohack getting most of the connections:
root@tools-proxy-9:~# tail -n 10000 /var/log/nginx/access.log | awk '{print $1}' | sort | uniq -c | sort -h | tail
130 hub.toolforge.org
140 freebase.toolforge.org
198 tabernacle.toolforge.org
253 glamtools.toolforge.org
336 sigma.toolforge.org
341 reasonator.toolforge.org
359 wikimap.toolforge.org
426 guc.toolforge.org
604 panoviewer.toolforge.org
4603 geohack.toolforge.org
Change #1187892 merged by Majavah:
[operations/puppet@production] P:toolforge::proxy: Limit in-flight connections per tool
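For anyone repeating the per-host tally from the nginx access log shown earlier in this thread, here is a minimal Python sketch of the same counting step. It assumes, as the `awk '{print $1}'` command implies, that the first whitespace-separated field of each log line is the tool hostname. The function name `top_hosts` and the sample lines are hypothetical and are not taken from the real log.

```python
from collections import Counter


def top_hosts(lines, n=10):
    """Count the first whitespace-separated field of each non-empty line
    and return the n most frequent values with their counts."""
    counts = Counter(line.split()[0] for line in lines if line.strip())
    return counts.most_common(n)


# Hypothetical sample lines; only the first field matters for the tally.
sample = [
    "geohack.toolforge.org GET /geohack.php",
    "guc.toolforge.org GET /",
    "geohack.toolforge.org GET /geohack.php",
]

for host, hits in top_hosts(sample):
    print(hits, host)
```

Unlike the shell pipeline, this lists the busiest host first rather than last.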
@taavi's patch is working as expected: geohack is getting throttled, and the rest of the tools have started working again.
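As a rough illustration of the idea behind a per-tool in-flight limit, the sketch below caps concurrent requests per key so that one busy tool cannot use up the shared session pool. This is not the Puppet or proxy configuration from the patch, which is not shown in this thread. The class `InFlightLimiter`, its methods, and the limit of 2 are all hypothetical, and the code is single-threaded for simplicity.

```python
from collections import defaultdict


class InFlightLimiter:
    """Caps the number of concurrent requests per key (here, per tool name).

    Hypothetical illustration only; not thread-safe."""

    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight
        self.in_flight = defaultdict(int)

    def try_acquire(self, tool):
        # Reject the request if this tool is already at its cap.
        if self.in_flight[tool] >= self.max_in_flight:
            return False
        self.in_flight[tool] += 1
        return True

    def release(self, tool):
        # Called when a request finishes, freeing one slot for the tool.
        if self.in_flight[tool] > 0:
            self.in_flight[tool] -= 1


limiter = InFlightLimiter(max_in_flight=2)
results = [limiter.try_acquire("geohack") for _ in range(3)]  # third is throttled
other = limiter.try_acquire("guc")  # a different tool is unaffected
limiter.release("geohack")
after_release = limiter.try_acquire("geohack")  # a slot is free again
```

The point of keying the counter by tool is that throttling one tool leaves the others with their full allowance.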
Change #1187894 had a related patch set uploaded (by Majavah; author: Majavah):
[operations/puppet@production] P:toolforge::proxy: Allow throttled tools to load error page assets
Change #1187894 merged by Majavah:
[operations/puppet@production] P:toolforge::proxy: Allow throttled tools to load error page assets