ORES extension score only main namespace edits for Wikidata
Closed, Resolved | Public | PRODUCTION ERROR

Description

Model contains an error: ValueError: Failed to process datasource.wikibase.revision.parent.item_doc: Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
  File "/srv/deployment/ores/venv/lib/python3.4/site-packages/revscoring/dependencies/functions.py", line 244, in _solve
    value = dependent(*args)
  File "/srv/deployment/ores/venv/lib/python3.4/site-packages/revscoring/dependencies/dependent.py", line 52, in __call__
    return self.process(*args, **kwargs)
  File "/srv/deployment/ores/venv/lib/python3.4/site-packages/revscoring/features/wikibase/datasources/revision_oriented.py", line 108, in _process_item_doc
    return json.loads(text)
  File "/usr/lib/python3.4/json/__init__.py", line 318, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.4/json/decoder.py", line 343, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.4/json/decoder.py", line 361, in raw_decode
    raise ValueError(errmsg("Expecting value", s, err.value)) from None
ValueError: Expecting value: line 1 column 1 (char 0)

Event Timeline

Uh, what log did this come from? I don't think this is related to the MediaWiki extension, but ORES itself...

Looking more closely at the error, it's clear what's wrong. It occurs when the ORES extension tries to score edits made outside the main namespace on Wikidata (the ORES service doesn't support that, by design). The fastest solution is to make the extension score only main-namespace edits for Wikidata.

Ladsgroup renamed this task from "Model contains an error: ValueError: Failed to process datasource.wikibase.revision.parent.item_doc: Expecting value" to "ORES extension score only main namespace edits for Wikidata". Jul 13 2016, 9:21 AM
Ladsgroup triaged this task as Low priority.
Ladsgroup edited projects, added MediaWiki-extensions-ORES, Wikidata; removed ORES.

Change 298948 had a related patch set uploaded (by Ladsgroup):
Let ORES extension score for some namespaces instead of all

https://gerrit.wikimedia.org/r/298948

It'd be good to also score the edits on properties. Is the property namespace included still?

The patch here only enables this mechanism; we can add as many namespaces as we want in mediawiki-config patches. Once this one is merged, I will make the config patch and include both the item and property namespaces :)
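
A minimal sketch of the idea, in Python for illustration only. This is not the code from the patch: the names `SCORED_NAMESPACES` and `should_score` are hypothetical, and treating a missing entry as "score everything" is an assumption. Each wiki lists the namespaces it wants scored, and edits elsewhere are skipped before any request is sent to ORES.

```python
# Hypothetical per-wiki setting: namespace IDs whose edits should be scored.
# Assumption: a missing or empty entry means "score every namespace".
SCORED_NAMESPACES = {
    "wikidatawiki": {0, 120},  # 0 = main (items), 120 = Property
}


def should_score(wiki, namespace):
    """Return True if an edit in `namespace` on `wiki` should be sent for scoring."""
    allowed = SCORED_NAMESPACES.get(wiki)
    if not allowed:
        return True  # no restriction configured for this wiki
    return namespace in allowed


print(should_score("wikidatawiki", 0))  # item edit
print(should_score("wikidatawiki", 1))  # talk page edit
```

Keeping the list in configuration rather than in code is what lets the property namespace be added later without touching the extension again.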

It occurs when the ORES extension tries to score edits not made in the main name space in Wikidata (ORES service doesn't support that and it's by design).

Tangentially, can you explain why that's the case?

Mostly because the model is meant to deal with the Wikibase data model, not with the textual, wiki-like part. So when it tries to understand an edit that was not made in the main namespace (or ns:120, properties), it errors while trying to parse the JSON, since these pages are not JSON-parseable.
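
The failure mode can be reproduced in isolation. The sketch below is not revscoring's code; `parse_item_doc` and the sample strings are made up for illustration. It only shows that `json.loads` raises a `ValueError` on wikitext, which is the error at the bottom of the traceback in the description.

```python
import json


def parse_item_doc(text):
    """Parse revision text as JSON; return None if the text is not JSON."""
    try:
        return json.loads(text)
    except ValueError:
        # On Python 3.5+ this is json.JSONDecodeError, a ValueError subclass.
        return None


item_text = '{"type": "item", "id": "Q42"}'       # item page content is JSON
talk_text = "== Discussion ==\nSome wikitext."    # talk page content is wikitext

print(parse_item_doc(item_text))
print(parse_item_doc(talk_text))
```

The fix in this task takes the other route: rather than catching the error, the extension stops requesting scores for pages that are not Wikibase entities.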

Change 298948 merged by jenkins-bot:
Let ORES extension score for some namespaces instead of all

https://gerrit.wikimedia.org/r/298948

Change 300083 had a related patch set uploaded (by Ladsgroup):
Let ORES extension score for some namespaces instead of all

https://gerrit.wikimedia.org/r/300083

Change 300086 had a related patch set uploaded (by Ladsgroup):
ORES score edits in main and Property namespaces in wikidatawiki

https://gerrit.wikimedia.org/r/300086

Change 300083 merged by jenkins-bot:
Let ORES extension score for some namespaces instead of all

https://gerrit.wikimedia.org/r/300083

Change 300086 merged by jenkins-bot:
ORES score edits in main and Property namespaces in wikidatawiki

https://gerrit.wikimedia.org/r/300086

It was deployed on June 20th, 23:40 UTC. Since then we have had 21 failed jobs. My guess is that these are old jobs that failed earlier and are being retried. For comparison, yesterday we had 71 failed jobs in the same time span. I will keep monitoring to see whether the number of failed jobs diminishes.

Link to logstash if you want to check: Looong link

In the past 6 hours we had only 33 cases, compared to 227 cases yesterday in the same time span.

Jobs should stop retrying after ~2-3 hours.

https://grafana-admin.wikimedia.org/dashboard/db/ores-extension shows that this patch reduced the failure rate of ORES extension jobs from 3.7% to 1.2%. The other cause of failure is timeouts; I need to investigate that.
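
For a sense of scale, the figures quoted in this thread can be turned into relative reductions. This is only arithmetic on the numbers above; the helper name `relative_reduction` is made up for illustration.

```python
def relative_reduction(before, after):
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100


# Failed jobs in the same 6-hour window: 227 yesterday, 33 today.
print(round(relative_reduction(227, 33), 1))

# Failure rate of ORES extension jobs: 3.7% before the patch, 1.2% after.
print(round(relative_reduction(3.7, 1.2), 1))
```

That is roughly 85% fewer failed jobs in the 6-hour window, and about a two-thirds drop in the overall failure rate.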

mmodell changed the subtype of this task from "Task" to "Production Error". Aug 28 2019, 11:11 PM