LiftWing fiwiki-damaging model returning 500
Open, Needs Triage | Public | BUG REPORT

Description

500
{"error":"An error happened while fetching feature values from the MediaWiki API, please contact the ML-Team if the issue persists."}

I get this error ~1-5 times per day (on some days it doesn't occur at all). The first occurrence was possibly on 2024-01-05, after the migration from ORES to LiftWing. In 2024 this error was rare.
The current frequency may have started on 2025-04-07; I'm not 100% certain it's the same error, because the logs from that time don't have the same level of detail.

User agent:

'User-Agent': 'stabilizerbot (https://wikitech.wikimedia.org/wiki/User:4shadoww)'

Event Timeline

I've searched our Logstash for 500 errors from fiwiki-damaging over the last month. There were indeed 13 days on which these errors occurred, with between 4 and 72 occurrences on each of those days. All of them were caused by LiftWing failing to fetch data from the MW API due to a 503 Service Unavailable error:

ERROR:root:An error has occurred while fetching feature values from the MW API: 503, message='Service Unavailable', url=URL('http://fi.wikipedia.org:80/w/api.php?action=query&prop=revisions&revids=23623609&rvslots=main&rvprop=timestamp%7Cuser%7Csize%7Cids%7Cuserid%7Ccontent%7Ccomment%7Ccontentmodel&format=json')

Looking at our fiwiki-damaging code, we currently do not retry MW API queries on failure, so I'd recommend introducing a retry mechanism there, which should already help with this issue. I'll create a Phabricator task to track adding the retry mechanism to our models. FYI @isarantopoulos
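For illustration, here is a rough sketch of what such a retry mechanism could look like. It is not the LiftWing code: `TransientAPIError`, `fetch_with_retry`, the set of retryable status codes, and the attempt/delay defaults are all hypothetical names and values chosen for this example. It retries a failing async call with exponential backoff and jitter, and re-raises once the attempts are exhausted or the status is not considered transient.

```python
import asyncio
import random


class TransientAPIError(Exception):
    """Hypothetical stand-in for an HTTP client error that carries a status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


# Assumption: only these gateway/availability errors are worth retrying.
RETRYABLE_STATUSES = {502, 503, 504}


async def fetch_with_retry(fetch, max_attempts=3, base_delay=0.5, sleep=asyncio.sleep):
    """Await `fetch()` until it succeeds, retrying transient failures.

    `fetch` is a zero-argument coroutine function (e.g. a wrapped MW API query).
    `sleep` is injectable so the backoff can be skipped in tests.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await fetch()
        except TransientAPIError as err:
            # Give up on non-transient errors or when attempts are exhausted.
            if err.status not in RETRYABLE_STATUSES or attempt == max_attempts:
                raise
            # Exponential backoff (0.5s, 1s, 2s, ...) plus random jitter.
            delay = base_delay * 2 ** (attempt - 1)
            await sleep(delay + random.uniform(0, delay / 2))
```

With the defaults above, a request that hits a 503 would be attempted up to three times before the error is surfaced to the caller. The same pattern could also be applied on the client side (for example in the bot) around the LiftWing call while the server-side fix is pending.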