cp1066 unexplained 503 spikes
Closed, Duplicate · Public

Description

We had a series of notable 503 spikes today: https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=2&fullscreen&orgId=1&var-site=All&var-cache_type=text&var-status_type=5&from=1504821102469&to=1504827498790 . In the 5xx logs, the failing requests all had x_cache lines implicating cp1066 as the backend-most cache.

I depooled the node from all services at 23:42.

I haven't found any solid lead yet on exactly what is going wrong there. It could be a host problem, or it could be a URL-specific problem that chashed to cp1066 (in which case this will probably recur shortly and implicate a different node).
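
To illustrate the second possibility, here is a minimal sketch of consistent hashing over a cache pool. It is not Varnish's actual chash implementation: the hash function, the virtual-node count, the URL, and every node name other than cp1066 are assumptions made for the example. It only shows the general behaviour: depooling a node moves the URLs it owned to another node while leaving every other URL's mapping alone.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Illustrative hash only; the real director's hash function may differ.
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


def build_ring(nodes, vnodes=100):
    # Place each node on the ring at `vnodes` pseudo-random points.
    return sorted((_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes))


def pick_node(ring, url):
    # A URL belongs to the first ring point after its hash, wrapping around.
    points = [point for point, _ in ring]
    idx = bisect.bisect_right(points, _hash(url)) % len(ring)
    return ring[idx][1]


# Hypothetical pool: only cp1066 comes from this task.
pool = ["cp1065", "cp1066", "cp1067", "cp1068"]
url = "/wiki/Some_problem_page"  # hypothetical URL

before = pick_node(build_ring(pool), url)
after = pick_node(build_ring([n for n in pool if n != before]), url)
print(before, "->", after)
```

If the 503s follow the URL rather than the host, the same x_cache pattern should reappear with whichever node takes over that URL once cp1066 is depooled.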

Event Timeline

Restricted Application added a subscriber: Aklapper.

Going to repool this today, on the assumption that it was genuinely part of T175803.