Add ORES article quality predictions to the WDQS
Closed, Duplicate · Public

Assigned To

None

Authored By

	Halfak
	Jul 7 2020, 4:29 PM

Description

From @Spinster

We have started (experimentally) tracking content progress of the Dutch (multilingual) WikiProject [https://nl.wikipedia.org/wiki/Wikipedia:Wikiproject/Wiki_goes_Caribbean Wiki goes Caribbean] via Wikidata. Can ORES article quality for supported Wikipedia languages be added to this process of assessment, and if so, what would be the best way to get there?
Topics related to the WikiProject are (manually) tracked via a Wikidata P5008 statement (query: https://w.wiki/WJW )
Do note: this set of topics is dynamic - Wikidata items can be added or removed as the project progresses
We've started (experimentally) tracking coverage of these topics on various relevant Wikipedias and on Commons, see [https://docs.google.com/spreadsheets/d/1c_RYfqwPGuRiO2MJ38iO5_Ibg6c2MzfasbjnrfBdY6A/edit#gid=0 this spreadsheet] which is used for measuring coverage progress over time on nlwiki, enwiki, papwiki, eswiki and Commons.
Question: can average ORES article quality for these topics on (at least) English Wikipedia, and later also Dutch Wikipedia, be included in measurements as well?
Would this produce 'meaningful' numbers/scores that are 'legible'/'interpretable' by laypeople (with some explanation if needed) and indeed indicate general quality development over time?
If so, are there already (non-coder friendly) tools with which relevant ORES article quality scores can be retrieved for a given set of Wikidata items / a Wikidata query / a set of Wikipedia articles?
If not, does it make sense to e.g. submit a feature request for tools like PetScan to provide ORES article quality scores as output?
All other tips and input very welcome.
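
One possible way to turn per-article predictions into a single "average quality" number is sketched below. It assumes the ORES v3 scores response shape (wiki, then "scores", then revision ID, then model, then "score" with a "probability" mapping) and the English Wikipedia class names; the 0 to 5 weighting is an illustrative choice rather than an official scale, and the sample data is invented for the example.

```python
from statistics import mean

# Assumed enwiki articlequality classes, ordered lowest to highest.
# The numeric weights are an illustrative choice, not an official scale.
CLASS_WEIGHTS = {"Stub": 0, "Start": 1, "C": 2, "B": 3, "GA": 4, "FA": 5}


def weighted_quality(probabilities):
    """Collapse a class-probability mapping into one number between 0 and 5."""
    return sum(CLASS_WEIGHTS[label] * p for label, p in probabilities.items())


def average_quality(response, wiki="enwiki", model="articlequality"):
    """Average the weighted quality over all successfully scored revisions."""
    values = []
    for by_model in response[wiki]["scores"].values():
        score = by_model.get(model, {}).get("score")
        if score is None:  # skip revisions that returned an error instead of a score
            continue
        values.append(weighted_quality(score["probability"]))
    return mean(values) if values else None


# Invented sample in the assumed response shape (revision IDs are placeholders).
sample = {
    "enwiki": {
        "scores": {
            "1": {"articlequality": {"score": {
                "prediction": "Stub",
                "probability": {"Stub": 1.0, "Start": 0.0, "C": 0.0,
                                "B": 0.0, "GA": 0.0, "FA": 0.0}}}},
            "2": {"articlequality": {"score": {
                "prediction": "FA",
                "probability": {"Stub": 0.0, "Start": 0.0, "C": 0.0,
                                "B": 0.0, "GA": 0.0, "FA": 1.0}}}},
            "3": {"articlequality": {"error": {"message": "revision not found"}}},
        }
    }
}

print(average_quality(sample))  # 2.5
```

A single number like this is easy to plot over time, but it hides the class distribution; reporting the count of articles per predicted class alongside it may be easier for laypeople to interpret.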

Related Objects

Mentioned Here: P5008 (An Untitled Masterwork)

Event Timeline

Halfak created this task. Jul 7 2020, 4:29 PM

Restricted Application added projects: Wikidata, artificial-intelligence. Jul 7 2020, 4:29 PM

Restricted Application added a subscriber: Aklapper.

We already store article quality predictions in the ores_classification table on the wikis where we have support.

We store some predictions related to topic in Elasticsearch (see the "articletopic:foo" search keyword). I'm not sure how much the Elasticsearch and WDQS infrastructure overlap, but that might be relevant.
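
For illustration, the articletopic keyword can be used through the standard MediaWiki search API. The sketch below only builds the request URL; the helper name, the default host, and the example topic "biography" are assumptions chosen for the example.

```python
from urllib.parse import urlencode


def articletopic_search_url(topic, wiki_host="en.wikipedia.org", limit=10):
    """Build a MediaWiki search API URL that filters by the articletopic keyword."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f"articletopic:{topic}",
        "srlimit": limit,
        "format": "json",
    }
    return f"https://{wiki_host}/w/api.php?{urlencode(params)}"


# "biography" is used here only as an example topic value.
print(articletopic_search_url("biography"))
```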

Chtnnh subscribed. Jul 7 2020, 4:34 PM

Spinster updated the task description. Jul 7 2020, 4:34 PM

Spinster removed a subscriber: • SandraF_WMF.

Spinster subscribed.

Spinster added a subscriber: Ciell. Jul 7 2020, 4:48 PM

Gehel moved this task from Incoming to Feature Requests on the Wikidata-Query-Service board. Jul 13 2020, 12:34 PM

Since we try to split the graph as much as we can, I think the proper approach would be to store this data in a dedicated graph, exposed through its own SPARQL endpoint and connected to WDQS through SPARQL federation.
This is not something the Search Team is likely to have time to work on in the near term, so if someone has the bandwidth to set up such an endpoint, we'd be happy to add it to the federation endpoints whitelist.
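
To make the federation idea concrete, here is a rough sketch of what a WDQS query might look like if such an endpoint existed. No such endpoint exists in this task: the endpoint URL, the quality predicate, and the project item ID below are all hypothetical placeholders. The `wd:`, `wdt:` and `schema:` prefixes are the ones WDQS predefines.

```python
def federated_quality_query(project_qid, endpoint, quality_predicate,
                            wiki_url="https://en.wikipedia.org/"):
    """Build a SPARQL query joining P5008-tagged items to an external quality graph.

    `endpoint` and `quality_predicate` are hypothetical; they stand in for
    whatever a dedicated quality-score service would actually expose.
    """
    return f"""
SELECT ?item ?article ?quality WHERE {{
  ?item wdt:P5008 wd:{project_qid} .
  ?article schema:about ?item ;
           schema:isPartOf <{wiki_url}> .
  SERVICE <{endpoint}> {{
    ?article <{quality_predicate}> ?quality .
  }}
}}
""".strip()


# All three arguments are placeholders, not real identifiers.
query = federated_quality_query(
    project_qid="Q0",
    endpoint="https://example.org/article-quality/sparql",
    quality_predicate="https://example.org/ns/articleQuality",
)
print(query)
```

The query would only run on WDQS once the endpoint is on the federation whitelist mentioned above.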

Halfak moved this task from Unsorted to Backlog/Lift Wing on the Machine-Learning-Team board. Jul 13 2020, 4:46 PM

Ciell added a subscriber: MichellevL_WMNL. Oct 6 2020, 10:44 AM

Salgo60 subscribed. Dec 14 2020, 4:54 AM

Maintenance_bot moved this task from Backlog/Lift Wing to Backlog/Revscoring on the Machine-Learning-Team board. Jan 19 2021, 11:36 PM

Gehel closed this task as a duplicate of T266828: Expose ORES item quality to WDQS. Nov 18 2021, 3:50 PM

Wakelamp subscribed. Sep 24 2022, 6:07 AM
