
Wikidata label quality model
Open, Low, Public

Description

It would be great to have a model that predicts the quality of an edit to a label/description on Wikidata.

Use-case:

  1. A user is using the mobile app to edit a label on Wikidata.
  2. The user sees a message: "This looks like a potential error or vandalism. Can you try again?"
  3. Upon second submission, the user sees: "Due to the nature of this edit, it will be reviewed before going live."

Event Timeline

This would require substantial new development. Not clear that this is a pressing issue, so it will be hard to prioritize research and modeling time. Not to say it isn't important, but an argument is necessary.

In the end, it might be better to just make our editquality model for Wikidata better. We can probably do some effective text processing on labels, descriptions, and other places where strings appear.

We'll want to check the performance of the current model at differentiating good/bad label edits by newcomers/anons (readers). I'm guessing that it doesn't work well yet, because we don't do any text processing of labels, so there's little interesting signal to pick up.
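To illustrate the kind of text processing meant above, here is a hypothetical sketch of cheap string features one could extract from a label edit (old vs. new value). The function and feature names are illustrative assumptions, not part of the existing editquality API:

```python
import re

def label_edit_features(old_label, new_label):
    """Return a dict of cheap string features for a label change.

    Hypothetical feature set; names and choices are illustrative only.
    """
    new_len = max(len(new_label), 1)  # avoid division by zero
    return {
        "old_len": len(old_label),
        "new_len": len(new_label),
        "len_delta": len(new_label) - len(old_label),
        # Shouting and digit-heavy labels are often low quality.
        "upper_ratio": sum(c.isupper() for c in new_label) / new_len,
        "digit_ratio": sum(c.isdigit() for c in new_label) / new_len,
        # Runs of 4+ identical characters ("loooool") suggest vandalism.
        "has_repeated_chars": bool(re.search(r"(.)\1{3,}", new_label)),
    }

features = label_edit_features("Douglas Adams", "DOUGLAS ADAMSSSSS")
```

Features like these could be fed to the existing editquality classifier alongside its current signals.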

Halfak triaged this task as Low priority. Dec 1 2016, 3:30 PM
  1. It is not an AI approach, but it is quite effective: similarity of sitelinks to the label is an easy way to catch bad labels. Example quarry: https://quarry.wmflabs.org/query/15753 . Turned into a feature (a similarity score between sitelink and label), it could be an important signal for an AI approach.
  2. Similarity may also be a useful feature when comparing labels of similar entities (people who share the same first or last name in one language are usually expected to share names in other languages, except for nicknames).
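The sitelink/label similarity idea in point 1 above can be sketched with the standard library's difflib; any threshold for "suspicious" would be an assumption that needs tuning against real edits:

```python
from difflib import SequenceMatcher

def sitelink_label_similarity(sitelink_title, label):
    """Return a similarity ratio in [0, 1] between a sitelink title and a label.

    Low values may flag a bad label. This is a sketch, not a tuned feature.
    """
    # Sitelink (page) titles use underscores where labels use spaces.
    normalized = sitelink_title.replace("_", " ")
    return SequenceMatcher(None, normalized.lower(), label.lower()).ratio()

good = sitelink_label_similarity("Douglas_Adams", "Douglas Adams")
bad = sitelink_label_similarity("Douglas_Adams", "poop")
```

Here `good` comes out at 1.0 while `bad` is far lower, matching the quarry's intuition that labels diverging strongly from their sitelinks deserve review.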