
mwapi calls rarely return results
Closed, Resolved · Public · 8 Estimated Story Points · Bug Report

Description

Steps to Reproduce:

  1. Head to https://query.wikidata.org/ and open the "Filter labels using EntitySearch from mwapi service to provide Full Text Search" example.
  2. Use the "Code" button to generate a URL for this query, then append &format=json so the result can be viewed in a web browser. You should arrive at this URL.
  3. Press Ctrl+F5 repeatedly to force-refresh the page (usually at least 10 times) and take note of the results.
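
The manual refresh loop in the steps above could also be scripted. The following is a minimal sketch, not the reporter's method: it assumes the URL from step 2 returns the standard SPARQL JSON results format (rows under `results.bindings`), and the helper names `is_empty_result`, `http_fetch`, and `count_empty` as well as the User-Agent string are made up for illustration. The `Cache-Control: no-cache` header only approximates what a forced browser refresh does.

```python
import json
import urllib.request
from typing import Callable


def is_empty_result(body: str) -> bool:
    """Return True when a SPARQL JSON response carries no result rows."""
    data = json.loads(body)
    return len(data.get("results", {}).get("bindings", [])) == 0


def http_fetch(url: str) -> str:
    """Fetch the URL, asking caches not to answer from a stored copy."""
    request = urllib.request.Request(
        url,
        headers={
            "Cache-Control": "no-cache",
            # Hypothetical identifier; replace with something describing your tool.
            "User-Agent": "mwapi-repro-sketch/0.1",
        },
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read().decode("utf-8")


def count_empty(
    url: str,
    attempts: int = 10,
    fetch: Callable[[str], str] = http_fetch,
) -> int:
    """Request the URL `attempts` times and count how many responses were empty."""
    return sum(1 for _ in range(attempts) if is_empty_result(fetch(url)))
```

Calling `count_empty(url)` with the URL generated in step 2 would report how many of ten requests came back empty; on an affected server that number would be expected to be high, and 0 once the issue is fixed. The `fetch` parameter exists so the counting logic can be exercised without network access.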

Actual Results:

Most of the time, the results will be empty. Sometimes, however, the correct results are returned.

Expected Results:

The results are always correctly populated.

Event Timeline

Restricted Application added a project: Wikidata. · Sep 27 2020, 5:20 PM
Restricted Application added a subscriber: Aklapper.
CBogen set the point value for this task to 8. · Sep 28 2020, 5:26 PM
abian awarded a token. · Sep 30 2020, 8:05 PM
abian added a subscriber: abian.
Kdutia added a subscriber: Kdutia. · Oct 1 2020, 8:47 AM

Change 631492 had a related patch set uploaded (by DCausse; owner: DCausse):
[wikidata/query/rdf@master] Add more debug logging in MWApiServiceCall

https://gerrit.wikimedia.org/r/631492

Change 631492 merged by jenkins-bot:
[wikidata/query/rdf@master] Add more debug logging in MWApiServiceCall

https://gerrit.wikimedia.org/r/631492

Zache added a subscriber: Zache. · Oct 1 2020, 5:56 PM

The root cause of the problem is still unclear.
I added some more debug logs to continue investigating.
What I know so far is that only codfw was affected and that restarting blazegraph on an affected node fixed the issue. Some state is probably being leaked, but it is not yet clear where: it could be in blazegraph itself or in the jetty http client. The additional logging should hopefully help rule out one option or the other.

Mentioned in SAL (#wikimedia-operations) [2020-10-21T14:34:30Z] <dcausse> restarting blazegraph on codfw servers (T263952)

Resuming the investigation: the additional logs seem to suggest that the jetty http client (or the way we use it) is to blame.

Change 636032 had a related patch set uploaded (by DCausse; owner: DCausse):
[wikidata/query/rdf@master] Better resource handling for mwapi calls

https://gerrit.wikimedia.org/r/636032

Change 636032 merged by jenkins-bot:
[wikidata/query/rdf@master] Better resource handling for mwapi calls

https://gerrit.wikimedia.org/r/636032

Gehel closed this task as Resolved. · Mon, Nov 9, 12:57 PM
Gehel added a subscriber: Gehel.

This should be resolved, but the issue is hard to reproduce and therefore hard to test. Feel free to reopen if the error occurs again.