
wdqs updater failed on error response from Blazegraph
Closed, Resolved · Public

Description

wdqs200[56] failed to update after receiving an invalid JSON response from Blazegraph (https://logstash.wikimedia.org/goto/6bf10387447fed74b0b86093196d21c8) and did not recover. This is related to T192768 but is a distinct issue.

Event Timeline

Gehel created this task. Apr 23 2018, 4:09 PM
Restricted Application added projects: Wikidata, Discovery. Apr 23 2018, 4:09 PM
Restricted Application added a subscriber: Aklapper.
Smalyshev renamed this task from wdqs updater failed on invalid json to wdqs updater failed on error response from Blazegraph. Apr 23 2018, 7:05 PM
Smalyshev updated the task description.

The solution here is not clear. If Blazegraph is having problems, the Updater can't do much about it. Retrying probably won't help much: if Blazegraph is stuck on an OOM problem, it will stay stuck at least in the short term (within our retry horizon). I could see whether we can make bad JSON follow the same path as a connection error, etc., but ultimately, if Blazegraph does not recover very quickly, there's nothing the Updater can do...
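To illustrate what routing bad JSON through the same path as a connection error could look like, here is a minimal sketch. It is written in Python for brevity and is not the Updater's actual code: `TransientBackendError`, `parse_response`, `fetch_with_retry`, and the retry parameters are all hypothetical names chosen for this example.

```python
import json
import time


class TransientBackendError(Exception):
    """Hypothetical error type: backend unreachable or response unusable."""


def parse_response(body):
    # Map a malformed JSON body onto the same error type that the retry
    # loop already handles for connection failures.
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        raise TransientBackendError(f"invalid JSON from backend: {exc}") from exc


def fetch_with_retry(fetch, max_attempts=3, delay_seconds=1.0, sleep=time.sleep):
    # `fetch` is any zero-argument callable returning the raw response body.
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return parse_response(fetch())
        except (ConnectionError, TransientBackendError) as exc:
            last_error = exc
            if attempt < max_attempts:
                sleep(delay_seconds)
    # Retry horizon exhausted: surface the failure to the caller instead of
    # continuing with a broken response.
    raise TransientBackendError(
        f"giving up after {max_attempts} attempts"
    ) from last_error
```

As noted above, this only helps if the backend recovers within the retry horizon; otherwise the final exception still has to be handled by whatever runs the process.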

Gehel closed this task as Resolved. Apr 24 2018, 8:28 AM
Gehel claimed this task.

It looks like fixing T192768 would also solve this case. If the updater exits cleanly, it should be restarted by systemd and eventually recover (if Blazegraph recovers).

Closing this for now.

Vvjjkkii renamed this task from wdqs updater failed on error response from Blazegraph to dfeaaaaaaa. Jul 1 2018, 1:14 AM
Vvjjkkii reopened this task as Open.
Vvjjkkii removed Gehel as the assignee of this task.
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
AntiCompositeNumber renamed this task from dfeaaaaaaa to wdqs updater failed on error response from Blazegraph. Jul 1 2018, 4:10 AM
AntiCompositeNumber closed this task as Resolved.
AntiCompositeNumber assigned this task to Gehel.
AntiCompositeNumber updated the task description.
AntiCompositeNumber added a subscriber: Aklapper.
CommunityTechBot raised the priority of this task from High to Needs Triage. Jul 5 2018, 6:31 PM
CommunityTechBot updated the task description.