
Wikimedia maps instability (maps.wikimedia.org)
Closed, Resolved · Public

Description

Some requests to the upload caches (specifically maps) are failing. The failures seem to be request-related, and CPU usage is maxed out: https://grafana.wikimedia.org/d/000000607/cluster-overview?orgId=1&var-datasource=eqiad%20prometheus%2Fops&var-cluster=maps&var-instance=All&from=1568307201093&to=1568364497936

Event Timeline

jcrespo created this task. Sep 13 2019, 9:20 AM
Restricted Application added a subscriber: Aklapper. Sep 13 2019, 9:20 AM

Mentioned in SAL (#wikimedia-operations) [2019-09-13T09:24:16Z] <gehel> deny access to /geoline on maps1004 - T232817

Mentioned in SAL (#wikimedia-operations) [2019-09-13T09:27:40Z] <gehel> restart kartotherian on maps1004 - T232817

jcrespo renamed this task from Wikimedia maps unstability (maps.wikimedia.org) to Wikimedia maps instability (maps.wikimedia.org). Sep 13 2019, 9:28 AM

Mentioned in SAL (#wikimedia-operations) [2019-09-13T09:38:32Z] <gehel> drop /geoshape and restart kartotherian on maps1004 - T232817

Mentioned in SAL (#wikimedia-operations) [2019-09-13T09:46:01Z] <gehel> re-enabling /geoline on maps1004 - T232817

jcrespo removed a subscriber: jcrespo. Sep 13 2019, 10:06 AM

Change 536641 had a related patch set uploaded (by MSantos; owner: MSantos):
[mediawiki/services/kartotherian@master] Fix kartotherian server error handling

https://gerrit.wikimedia.org/r/536641

Change 536641 merged by jenkins-bot:
[mediawiki/services/kartotherian@master] Fix kartotherian server error handling

https://gerrit.wikimedia.org/r/536641

Mentioned in SAL (#wikimedia-operations) [2019-09-13T23:06:21Z] <gehel> re-enable puppet on maps - T232817

jbond added a subscriber: jbond. Sep 16 2019, 10:32 AM
Jhernandez triaged this task as High priority.
Jhernandez added a subscriber: Jhernandez.

Hey @MSantos, this seems fixed AFAIK, resolve the task if it is. Thanks!