
eqiad squid performance issue
Closed, Resolved, Public

Description

This issue was first suspected in T245121; T245176 then added more visibility.

I created a quick dashboard: https://grafana.wikimedia.org/d/i5YA-BXWz/squid?orgId=1
However, the Prometheus exporter in eqiad often takes a long time to reply (e.g. 90s).

ayounsi@prometheus1003:~$ time curl install1003.wikimedia.org:9301/metrics -s | grep _up
# HELP squid_up Was the last query of squid successful?
# TYPE squid_up gauge
squid_up{host="localhost"} 1

real	1m34.694s
user	0m0.012s
sys	0m0.008s

The same issue doesn't happen in codfw.
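
The manual curl timing above can also be repeated programmatically, which makes it easier to compare exporters. The following is a minimal sketch, not the tooling actually used on this task: the function names (fetch_metrics, parse_gauge, time_scrape) are hypothetical, and the parser assumes sample lines carry no trailing timestamp.

```python
import time
import urllib.request


def fetch_metrics(url, timeout=120.0):
    """Download an exporter's /metrics page and return it as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def parse_gauge(metrics_text, name):
    """Return the first sample value for metric `name`, or None if absent.

    Assumption: sample lines have the form `name{labels} value` with no
    trailing timestamp, as in the squid_up output shown above.
    """
    for line in metrics_text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip HELP/TYPE comments and blank lines
        metric, _, value = line.rpartition(" ")
        if metric.split("{", 1)[0] == name:
            return float(value)
    return None


def time_scrape(url, fetch=fetch_metrics):
    """Return (elapsed_seconds, squid_up) for a single scrape of `url`.

    `fetch` is injectable so the timing logic can be exercised offline.
    """
    start = time.monotonic()
    text = fetch(url)
    elapsed = time.monotonic() - start
    return elapsed, parse_gauge(text, "squid_up")
```

Calling time_scrape("http://install1003.wikimedia.org:9301/metrics") from a host that can reach the exporter should return roughly the same duration as the `real` figure reported by `time curl`, along with the squid_up value.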

The request rate is quite low (~8 rps), so I suspect there is a misconfiguration somewhere.

Event Timeline

ayounsi triaged this task as High priority. Mar 16 2020, 2:49 PM
ayounsi created this task.

Change 580296 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] squid3: bump max open file descriptors

https://gerrit.wikimedia.org/r/580296

I've bumped the limits for squid on install1003 and things look good now. The permanent fix is in https://gerrit.wikimedia.org/r/580296
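
To see how close a process is to its open file descriptor limit on Linux, one option is to read /proc/<pid>/limits and count the entries in /proc/<pid>/fd. This is a hedged sketch only: the task does not say how the limit was inspected or what values were involved, and the function names here (parse_max_open_files, fd_usage) are hypothetical.

```python
import os


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    A limit shown as 'unlimited' is returned as None.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # Row layout: Max open files <soft> <hard> files
            soft, hard = fields[3], fields[4]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' row found")


def fd_usage(pid):
    """Return (open_fds, soft_limit) for a running process (Linux only).

    Reading another user's /proc/<pid>/fd usually requires root.
    """
    with open(f"/proc/{pid}/limits") as f:
        soft, _hard = parse_max_open_files(f.read())
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    return open_fds, soft
```

If open_fds sits at or near the soft limit, the process cannot open new connections until descriptors are released, which would be consistent with the slow replies described in this task.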

When building a docker container on contint1001.wikimedia.org with docker-pkg, pip gets a proxy timeout error when using http://webproxy.eqiad.wmnet:8080.

I have manually switched to the codfw one (http://webproxy.codfw.wmnet:8080) and it worked fine.

So I guess install1003.wikimedia.org has an issue of some sort?

I have triggered a build for that container and this time it all worked fine. So it seems Squid on install1003 now behaves properly :) Thank you!

Change 580296 merged by Filippo Giunchedi:
[operations/puppet@production] squid3: bump max open file descriptors

https://gerrit.wikimedia.org/r/580296

fgiunchedi claimed this task.

Fix is deployed, looking good!