WD Languages Landscape: statistics + dashboards
Closed, Resolved (Public)

Assigned To

Authored By

	GoranSMilovanovic
	May 13 2019, 3:50 PM

Description

Collect fundamental statistics from the external sources for the Wikidata Languages Landscape.
Develop reports, visualizations, and dashboards for the languages project.

Related Objects

		Status	Subtype	Assigned	Task
		Resolved		GoranSMilovanovic	T221965 Wikidata Languages Landscape
		Resolved		GoranSMilovanovic	T223119 WD Languages Landscape: statistics + dashboards

Event Timeline

GoranSMilovanovic created this task. May 13 2019, 3:50 PM

GoranSMilovanovic moved this task from Technical Wishlist to Incoming on the User-GoranSMilovanovic board.

@Lydia_Pintscher @RazShuty

Something to begin with:

Each node is a language (Wikimedia language codes are used).
Each language points towards the three languages most similar to it,
in terms of the overlap of their labels across >57M Wikidata items.
Explanation: for each language we find which WD items have a label in it;
the similarity between two languages is then derived from the Jaccard distance between their two binary vectors of length approx. 57M each (a smaller distance means more similar languages).
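
A minimal sketch of the computation described above, not the actual pipeline code. It assumes each language's binary vector can be represented as the set of item IDs that have a label in that language, which yields the same Jaccard distance as the vector form. The names `jaccard_distance`, `top_k_similar`, and `labels_by_lang`, and the toy item IDs, are hypothetical and only illustrate the idea.

```python
def jaccard_distance(a: set, b: set) -> float:
    """Jaccard distance between two sets: 1 - |A & B| / |A | B|."""
    union = len(a | b)
    if union == 0:
        # Assumed convention: two empty label sets are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / union


def top_k_similar(labels_by_lang: dict, k: int = 3) -> dict:
    """For each language, list the k languages with the smallest Jaccard distance."""
    result = {}
    for lang, items in labels_by_lang.items():
        distances = [
            (jaccard_distance(items, other_items), other)
            for other, other_items in labels_by_lang.items()
            if other != lang
        ]
        # Sort by distance; ties are broken alphabetically by language code.
        distances.sort()
        result[lang] = [other for _, other in distances[:k]]
    return result


# Toy data (hypothetical): language code -> IDs of items labelled in that language.
labels_by_lang = {
    "en": {"Q1", "Q2", "Q3", "Q4"},
    "de": {"Q1", "Q2", "Q3"},
    "fr": {"Q1", "Q2"},
    "sr": {"Q4"},
    "ja": {"Q5"},
}

neighbours = top_k_similar(labels_by_lang, k=3)
print(neighbours["en"])
```

Each entry of `neighbours` corresponds to the outgoing edges of one node in the graph. At the scale of >57M items, a sparse-matrix formulation would likely be needed instead of Python sets; this sketch only shows the logic.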

Mapping WDCM item re-use statistics onto languages now.

Nice! :)

GoranSMilovanovic renamed this task from WD Languages Landscape: fundamental statistics to WD Languages Landscape: statistics + dashboards. Oct 13 2019, 9:47 PM

GoranSMilovanovic updated the task description.

GoranSMilovanovic added a subscriber: WMDE-leszek.

@Lydia_Pintscher
You can take a look at our WikidataCon2019 shared doc and see if you can make use of anything from the Wikidata Languages Landscape: Statistics and Visualizations section.

Dashboard online.

	F30078182: WD_Languages.png
	Aug 23 2019, 7:43 PM
