Mobileapps flapping since 2019-11-26 0:00 UTC
Closed, Resolved, Public

Description

Mobileapps endpoint checks have been timing out frequently since 2019-11-26 0:00 UTC. Grafana doesn't show anything out of the ordinary, and unlike in past cases of this behavior (e.g., T229286), there is no sign of worker deaths in logstash. This needs further investigation and remediation.

Event Timeline

Looks like the timeouts are occurring on requests to https://api-rw.discovery.wmnet/w/api.php — note the https://.

Change 553359 had a related patch set uploaded (by Mholloway; owner: Michael Holloway):
[mediawiki/services/mobileapps/deploy@master] Use http, not https, in mwapi_uri

https://gerrit.wikimedia.org/r/553359

LGoto triaged this task as Medium priority. Nov 27 2019, 4:32 PM

Change 553359 merged by jenkins-bot:
[mediawiki/services/mobileapps/deploy@master] Use http, not https, in mwapi_uri

https://gerrit.wikimedia.org/r/553359

Mholloway renamed this task from Mobileapps flapping on scb2005 since 2019-11-26 0:00 UTC to Mobileapps flapping since 2019-11-26 0:00 UTC. Dec 3 2019, 6:23 PM
Mholloway updated the task description.

Change 554557 had a related patch set uploaded (by Mholloway; owner: Michael Holloway):
[mediawiki/services/mobileapps@master] Add logging for MW API request timeouts

https://gerrit.wikimedia.org/r/554557

Change 554557 merged by jenkins-bot:
[mediawiki/services/mobileapps@master] Add logging for MW API request timeouts

https://gerrit.wikimedia.org/r/554557
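
For readers unfamiliar with this kind of change: the patch itself is not quoted in this thread, so the following is only a rough sketch of what logging request timeouts can look like, not the merged code. It is written in Python for illustration. The logger name `mwapi`, the function `call_with_timeout_logging`, and its parameters are all hypothetical, and the wrapped callable is assumed to enforce its own timeout and raise `TimeoutError`.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Hypothetical logger name; the real service's logger is not shown in this task.
log = logging.getLogger("mwapi")


def call_with_timeout_logging(request: Callable[[], T], uri: str, timeout_s: float) -> T:
    """Run `request` and log the URI and elapsed time if it times out.

    `request` is assumed to enforce `timeout_s` itself and to raise
    TimeoutError when the limit is exceeded; this wrapper only records
    the failure and re-raises it so callers still see the error.
    """
    start = time.monotonic()
    try:
        return request()
    except TimeoutError:
        elapsed = time.monotonic() - start
        log.warning(
            "MW API request timed out: uri=%s timeout=%.1fs elapsed=%.3fs",
            uri, timeout_s, elapsed,
        )
        raise
```

Logging the full URI, scheme included, is what would make a detail like https:// versus http:// visible in the logs when a timeout occurs.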

We have been seeing instances of this issue (soft mobileapps endpoint timeout alerts) specifically on codfw over the last several weeks. I can file a new task if this is not the right place to follow up on this.

@jcrespo If it's still the case that you're seeing soft mobileapps endpoint timeout alerts on codfw, IMO a new task would be best.

Per T238832#5736879, this round of instability seems to have been resolved when Parsoid/JS linting was turned off.