
Finding biases and content gaps in Wikidata
Open, High, Public

Description

Wikidata is not (and never will be) complete. However, we should be aware of gaps and biases in our data so that we can make informed decisions about where to put more effort.

The task is to find ways to identify and surface biases and gaps in the data in Wikidata.

Some interesting questions:

  • What are underrepresented topics? Languages?
  • Are there biases in the kinds of statements we make about topics?

Event Timeline

samuwmde created this task. Feb 19 2016, 4:44 PM
Lydia_Pintscher removed johl as the assignee of this task. Feb 19 2016, 5:08 PM
Lydia_Pintscher added a project: Wikidata.
Lydia_Pintscher removed a subscriber: hoo.
Ricordisamoa added a subscriber: Ricordisamoa.
Lydia_Pintscher triaged this task as High priority.
johl removed a subscriber: johl. Jun 14 2016, 1:26 PM
abian added a subscriber: abian. Oct 5 2016, 8:30 PM
Jane023 added a subscriber: Jane023. Nov 5 2016, 3:13 PM

Shouldn't this be split into biases and content gaps? The first is hard to measure (it involves the community of editors personally, e.g. who they are as groups by gender, nationality, age, education, etc.) and the second is a bit easier (coverage of geo-locations, topics per expert ontology, neutral POV for breaking news, etc.).

@Jane023: Sorry, maybe the wording is a bit confusing. It is about biases in the content which lead to gaps. Maybe "bias" is not the best choice of word.

Yes, I would drop the word "bias" then. It is about identifying and measuring gaps, no? There should be a link somewhere to the bias side of things (not sure where that is - diversity?)

Lydia_Pintscher renamed this task from Finding biases and content gaps to Finding biases and content gaps in Wikidata. Apr 7 2018, 11:30 AM
Lazhar added a subscriber: Lazhar. May 14 2018, 6:24 PM

Are we talking about the completeness of the entities within an arbitrary group or type of items, or potentially missing entities altogether?

abian added a comment. May 14 2018, 7:11 PM

Note that biases, or gaps, can only be quantified when compared to some given models that we consider perfect. As a first step, we should define what 'perfect' means to us and make sure this condition is achievable. This definition would be subjective and possibly controversial, so I think it would need broad discussion.
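
To make the comparison against a reference model concrete, here is a minimal sketch. It assumes the model is given as an expected share per category and that item counts per category are already available. The function name, the categories, and all numbers are hypothetical and for illustration only; choosing the reference shares is exactly the subjective step that needs discussion.

```python
def representation_gaps(observed_counts, reference_shares):
    """Return (observed share - reference share) for each category.

    Negative values mean the category is underrepresented relative
    to the chosen reference model; positive values mean the opposite.
    """
    total = sum(observed_counts.values())
    if total == 0:
        raise ValueError("observed_counts must contain at least one item")
    gaps = {}
    for category, expected_share in reference_shares.items():
        observed_share = observed_counts.get(category, 0) / total
        gaps[category] = observed_share - expected_share
    return gaps


# Hypothetical item counts per category (not real Wikidata figures).
observed = {"A": 70, "B": 25, "C": 5}
# Hypothetical reference model: the share we would consider "perfect".
reference = {"A": 0.5, "B": 0.3, "C": 0.2}

gaps = representation_gaps(observed, reference)
# The category with the most negative gap is the most underrepresented.
most_underrepresented = min(gaps, key=gaps.get)
```

With a different reference model the same counts give different gaps, which is why the definition of 'perfect' has to be agreed on first.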

Bmueller updated the task description. May 18 2018, 5:08 AM
Bmueller added a subscriber: Addshore.