
Determine whether top-ranked Wikidata statements are used consistently, and how many there are
Open, Low, Public

Description

As a user, if there is catastrophic data loss in Wikidata, I want to keep the top-ranked statements from each statement group in order to maximize my ability to continue querying a reduced graph.

copied from Lydia's comment below:

Some background on ranks is here: https://www.wikidata.org/wiki/Help:Ranking

What we probably want here for the analysis is what we refer to as "best ranked statements". Best ranked statements are the statements with the highest rank in a statement group, excluding deprecated ones. So if a statement group has a preferred statement and a normal statement, the best ranked statement is the one with the preferred rank. This "best ranked" concept is important for querying because it is what the truthy queries use, and it significantly reduces the size of the graph by removing, for example, statements that are historic.
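The selection rule described above could be sketched roughly as follows. This is only an illustration, not the code Wikidata or the query service actually runs: the function name `best_ranked`, the statement ids, and the simplified statement dicts are assumptions made for the example. Only the `rank` values (`preferred`, `normal`, `deprecated`) follow the ranks documented on the Help:Ranking page.

```python
def best_ranked(statement_group):
    """Return the best ranked statements of one statement group.

    Assumes each statement is a dict with a "rank" key whose value is
    "preferred", "normal" or "deprecated" (a simplified, hypothetical shape).
    """
    preferred = [s for s in statement_group if s["rank"] == "preferred"]
    if preferred:
        # Preferred statements exist, so they outrank the normal ones.
        return preferred
    # Otherwise fall back to normal rank; deprecated statements never qualify.
    return [s for s in statement_group if s["rank"] == "normal"]


# Hypothetical statement group: one statement of each rank.
group = [
    {"id": "s1", "rank": "preferred"},
    {"id": "s2", "rank": "normal"},
    {"id": "s3", "rank": "deprecated"},
]

kept = best_ranked(group)
print([s["id"] for s in kept])
```

Under this sketch, a group that contains only deprecated statements yields no best ranked statement at all, so it would drop out of the reduced graph entirely. Counting `len(best_ranked(g))` over all statement groups would give the size of the reduced statement set.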

Event Timeline

Some background on ranks is here: https://www.wikidata.org/wiki/Help:Ranking

What we probably want here for the analysis is what we refer to as "best ranked statements". Best ranked statements are the statements with the highest rank in a statement group, excluding deprecated ones. So if a statement group has a preferred statement and a normal statement, the best ranked statement is the one with the preferred rank. This "best ranked" concept is important for querying because it is what the truthy queries use, and it significantly reduces the size of the graph by removing, for example, statements that are historic.