
Evaluate and Quantify the state of multilingual labels on Wikidata
Closed, Resolved | Public | 12 Estimated Story Points

Description

Summary

In preparation for multilingual "depicts" statements on media as part of Structured Data on Commons, we need to assess and quantify problems with inconsistent or incomplete labels on Wikidata Q items. This is especially problematic for animals and plants because of inconsistencies between taxon names and common names.

The problem in more detail

See this Google Presentation deck for a more detailed list of examples: https://docs.google.com/presentation/d/1dANu94y9AA16t6fN_i_hagqlsdI92qMmezhIb14F-bo/

What we need

a.) Data analysis (reports and/or research) that quantifies the scale of the problem: how many concepts are affected, whether it is primarily a problem with taxa or also with other types of data, etc.

This is a hard problem, and perhaps somewhat abstract, so we'll defer to the data experts on what is feasible.

b.) Assistance with developing potential solutions to the data problem.
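
As a rough illustration of the kind of quantification asked for in (a), here is a minimal sketch in Python. It assumes items are available in the standard Wikidata entity JSON layout (`labels` keyed by language code, `claims` keyed by property ID) and uses P225 ("taxon name") to spot labels that merely repeat the scientific name. The function names, the placeholder item IDs, and the sample records are hypothetical and hand-made for illustration; they are not real query results and this is not necessarily how the eventual analysis was done.

```python
from collections import Counter

TAXON_NAME = "P225"  # Wikidata property "taxon name"


def taxon_name(entity):
    """Return the item's taxon name (P225) if it has one, else None."""
    for claim in entity.get("claims", {}).get(TAXON_NAME, []):
        value = claim.get("mainsnak", {}).get("datavalue", {}).get("value")
        if isinstance(value, str):
            return value
    return None


def label_report(entities, languages):
    """Per language, count missing labels and labels identical to the taxon name."""
    missing = Counter()
    same_as_taxon = Counter()
    for entity in entities:
        labels = entity.get("labels", {})
        name = taxon_name(entity)
        for lang in languages:
            label = labels.get(lang, {}).get("value")
            if label is None:
                missing[lang] += 1
            elif name is not None and label.casefold() == name.casefold():
                same_as_taxon[lang] += 1
    return {
        lang: {"missing": missing[lang], "same_as_taxon": same_as_taxon[lang]}
        for lang in languages
    }


def _taxon_claim(name):
    """Build a minimal P225 claim in the entity JSON shape (illustration only)."""
    return {TAXON_NAME: [{"mainsnak": {"datavalue": {"value": name}}}]}


# Hand-made sample records with placeholder IDs (assumptions, not real data).
sample = [
    {"id": "Q_example_1",
     "labels": {"en": {"value": "house cat"}, "fr": {"value": "Felis catus"}},
     "claims": _taxon_claim("Felis catus")},
    {"id": "Q_example_2",
     "labels": {"en": {"value": "Quercus robur"}},
     "claims": _taxon_claim("Quercus robur")},
    {"id": "Q_example_3",
     "labels": {"en": {"value": "bridge"}, "de": {"value": "Brücke"}},
     "claims": {}},
]

report = label_report(sample, ["en", "fr", "de"])
for lang, counts in report.items():
    print(lang, counts)
```

Separating "missing" from "same as taxon name" keeps the two problems described in the summary (incomplete vs. inconsistent labels) apart, so each can be sized on its own.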

Timeline

Our need here is important but not time-sensitive at the moment, so having at least a first round of analysis by mid-August would suffice.

Event Timeline

mpopov updated the task description.
mpopov moved this task from Triage to Backlog on the Product-Analytics board.
mpopov set the point value for this task to 12.

The current plan is for me to work on this after I finish T172581 in July, so I will most likely be able to begin in August.

Vvjjkkii renamed this task from Evaluate and Quantify the state of multilingual labels on Wikidata to pfaaaaaaaa. Jul 1 2018, 1:02 AM
Vvjjkkii removed mpopov as the assignee of this task.
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii removed the point value for this task.
Vvjjkkii added a subscriber: mpopov.
JJMC89 renamed this task from pfaaaaaaaa to Evaluate and Quantify the state of multilingual labels on Wikidata. Jul 1 2018, 4:34 AM
JJMC89 assigned this task to mpopov.
JJMC89 lowered the priority of this task from High to Medium.
JJMC89 raised the priority of this task from Medium to Needs Triage.
JJMC89 updated the task description.
JJMC89 set the point value for this task to 12.
JJMC89 removed a subscriber: mpopov.

Finished what I could; the findings are available at https://people.wikimedia.org/~bearloga/reports/wikidata-incompleteness.html

@kzimmerman we need to meet with @Ramsey-WMF to discuss the report and next steps

Hey @Ramsey-WMF, I think our work is done (we met with you and Amanda to discuss back in December or so), but this ticket hadn't been closed. Closing as resolved.