Finding biases and content gaps in Wikidata
Open, High
Public

Assigned To

None

Authored By

	• samuwmde
	Feb 19 2016, 4:44 PM

Description

Wikidata is not (and never will be) complete. However, we should be aware of gaps and biases in our data so that we can make informed decisions about where to put more effort.

The task is to find ways to detect and surface biases and gaps in Wikidata's data.

Some interesting questions:

What are underrepresented topics? Languages?
Are there biases in the kinds of statements we make about topics?

Related Objects

Status	Assigned	Task
Resolved	• johl	T127047 Collection of topics for HPI hackathon
Open	None	T90870 selfcontained projects around Wikidata (tracking)
Open	None	T127475 Finding biases and content gaps in Wikidata

Event Timeline

• samuwmde created this task. Feb 19 2016, 4:44 PM

Lydia_Pintscher removed • johl as the assignee of this task. Feb 19 2016, 5:08 PM

Lydia_Pintscher added a project: Wikidata.

Lydia_Pintscher removed a subscriber: hoo.

Lydia_Pintscher added a parent task: T90870: selfcontained projects around Wikidata (tracking). Feb 19 2016, 5:12 PM

Lydia_Pintscher updated the task description. Feb 19 2016, 5:25 PM

Ricordisamoa awarded a token. Feb 19 2016, 8:50 PM

Ricordisamoa subscribed.

• samuwmde removed a project: WMDE-Tech-Communication-Q1-2016. Mar 2 2016, 3:11 PM

Sumit added subscribers: Sumit, Halfak. Apr 2 2016, 8:48 PM

Lydia_Pintscher triaged this task as High priority. Apr 3 2016, 12:31 PM

Lydia_Pintscher moved this task from incoming to ready to go on the Wikidata board.

• johl unsubscribed. Jun 14 2016, 1:26 PM

abian subscribed. Oct 5 2016, 8:30 PM

Jane023 subscribed. Nov 5 2016, 3:13 PM

Shouldn't this be split into biases and content gaps? The first is hard to measure (it involves the community of editors personally, e.g. who they are as groups: gender, nationality, age, education, etc.), and the second is a bit easier (coverage of geo-locations, topics per expert ontology, neutral POV for breaking news, etc.).

@Jane023: Sorry, maybe the wording is a bit confusing. It is about biases in the content which lead to gaps. Maybe "bias" is not the best choice of word.

Yes, I would drop the word "bias" then. It is about identifying and measuring gaps, no? There should be a link somewhere to the bias side of things (not sure where that is - diversity?)

Esc3300 mentioned this in T150116: Wikidata statistics: create a tool or WQS function to evaluate completeness of items of a given type or group of items. Nov 6 2016, 7:03 AM

Lydia_Pintscher added a project: Wikimedia-Hackathon-2017. May 12 2017, 4:05 PM

Lydia_Pintscher renamed this task from Finding biases and content gaps to Finding biases and content gaps in Wikidata. Apr 7 2018, 11:30 AM

Lydia_Pintscher added a project: Wikimedia-Hackathon-2018.

Lydia_Pintscher moved this task from Backlog to Project on the Wikimedia-Hackathon-2018 board. Apr 7 2018, 11:42 AM

Are we talking about the completeness of the entities within an arbitrary group or type of items, or potentially missing entities altogether?

Both.

Note that biases, or gaps, can only be quantified when compared to some given models that we consider perfect. As a first step, we should define what 'perfect' means to us and make sure this condition is achievable. This definition would be subjective and possibly controversial, so I think it would need broad discussion.
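
To make the "compare against a model" idea concrete, here is a minimal sketch in Python. It assumes a reference model is simply a list of properties we expect every item of a given type to carry, and it measures the share of items that actually have each one. The function name `property_coverage`, the sample item keys, and the choice of expected properties are all hypothetical and only for illustration; deciding what belongs in a real reference model is exactly the open question raised above.

```python
from collections import Counter


def property_coverage(items, expected_properties):
    """Return, for each expected property, the share of items that carry it.

    items: dict mapping an item key to the set of property IDs present on it.
    expected_properties: the (assumed) reference model for this group of items.
    """
    if not items:
        return {prop: 0.0 for prop in expected_properties}
    counts = Counter()
    for present in items.values():
        for prop in expected_properties:
            if prop in present:
                counts[prop] += 1
    return {prop: counts[prop] / len(items) for prop in expected_properties}


# Made-up sample data: four items and the properties found on each.
sample_items = {
    "item_a": {"P31", "P569", "P21"},
    "item_b": {"P31", "P569"},
    "item_c": {"P31"},
    "item_d": {"P31", "P21"},
}

# Assumed reference model: instance of (P31), date of birth (P569),
# sex or gender (P21).
reference_model = ["P31", "P569", "P21"]

coverage = property_coverage(sample_items, reference_model)
for prop, share in coverage.items():
    print(f"{prop}: {share:.0%} of items")
```

A low share only indicates a gap relative to the chosen model. It says nothing about entities that are missing from Wikidata altogether, which would need an external reference list to compare against.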

Bmueller updated the task description. May 18 2018, 5:08 AM

Bmueller added a subscriber: Addshore.

• MichaelSchoenitzer_WMDE subscribed. May 18 2018, 12:47 PM

Lydia_Pintscher updated the task description. Sep 16 2018, 12:11 PM

Darenwelsh subscribed. Aug 15 2019, 4:01 PM

Addshore unsubscribed. Jun 27 2023, 12:39 PM