Some requests to the upload caches (specifically maps) are failing. The failures appear to be request-related, and CPU usage is maxed out: https://grafana.wikimedia.org/d/000000607/cluster-overview?orgId=1&var-datasource=eqiad%20prometheus%2Fops&var-cluster=maps&var-instance=All&from=1568307201093&to=1568364497936
Related Objects
- Mentioned In
- T232819: kartotherian leaving stray files in /tmp
Event Timeline
Mentioned in SAL (#wikimedia-operations) [2019-09-13T09:24:16Z] <gehel> deny access to /geoline on maps1004 - T232817
Mentioned in SAL (#wikimedia-operations) [2019-09-13T09:27:40Z] <gehel> restart kartotherian on maps1004 - T232817
Mentioned in SAL (#wikimedia-operations) [2019-09-13T09:38:32Z] <gehel> drop /geoshape and restart kartotherian on maps1004 - T232817
Mentioned in SAL (#wikimedia-operations) [2019-09-13T09:46:01Z] <gehel> re-enabling /geoline on maps1004 - T232817
Change 536641 had a related patch set uploaded (by MSantos; owner: MSantos):
[mediawiki/services/kartotherian@master] Fix kartotherian server error handling
Change 536641 merged by jenkins-bot:
[mediawiki/services/kartotherian@master] Fix kartotherian server error handling
Mentioned in SAL (#wikimedia-operations) [2019-09-13T23:06:21Z] <gehel> re-enable puppet on maps - T232817
FYI the incident documentation is https://wikitech.wikimedia.org/wiki/Incident_documentation/20190913-maps
Change 545723 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] Maps: remove varnish URI sanitization for maps (now done in Kartotherian)
Change 545723 merged by Ema:
[operations/puppet@production] Maps: remove varnish URI sanitization for maps (now done in Kartotherian)