Wikidata has the potential to be at the heart of many new features and products, notably new search and reader experiences. However, we don't have a clear picture of the content currently in Wikidata; this makes it difficult to design new tools and experiences, because we don't know what we can rely on.
For example, if an article doesn't exist in a given language, but we have information about that topic in Wikidata, some automatically formatted information could be presented to the reader. That information could also be used to seed the article and encourage the reader to create it. However, such a feature relies on the assumption that the relevant data is available in Wikidata. Without knowing the breadth and coverage of the content in Wikidata, it is difficult to build experiences on it.
Other possible uses for Wikidata content include:
- multilingual search / "quick fact" type experiences
- powering locally maintained infoboxes
- interactive timelines, maps, charts, etc.
- navigation within fact hierarchies (countries, politicians, books, albums, episode lists, movies, actors, etc.)
We need to perform a more systematic analysis of the current content in Wikidata and its growth patterns, in order to determine which purposes it is likely to serve in the near term. This includes identifying content biases and clear gaps in content.
This should ideally be combined with an impact analysis for each of the uses outlined above. We should be able to quantify most of these things by looking at percentage coverage of properties, languages, full tuples, etc.
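As a rough illustration of what "percentage coverage" could mean in practice, the following minimal sketch computes, for a set of items, the share that have each property and the share that have a label in each language. The item shape (a dict with `labels` and `claims` keys), the `coverage` helper, and the sample data are assumptions made for illustration only; they are a simplified stand-in for real Wikidata entities, not actual data or an agreed methodology.

```python
from collections import Counter


def coverage(items, key):
    """Return {name: percent of items that have `name` under `key`}.

    `key` is "claims" (property coverage) or "labels" (language coverage).
    """
    if not items:
        return {}
    counts = Counter()
    for item in items:
        # Count each property/language at most once per item.
        counts.update(set(item.get(key, {})))
    total = len(items)
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical, hand-written sample; ids and values are placeholders.
sample_items = [
    {"id": "item1", "labels": {"en": "A", "fr": "A"},
     "claims": {"P31": ["x"], "P569": ["y"]}},
    {"id": "item2", "labels": {"en": "B"}, "claims": {"P31": ["x"]}},
    {"id": "item3", "labels": {"en": "C", "de": "C"}, "claims": {}},
    {"id": "item4", "labels": {}, "claims": {"P31": ["x"], "P569": ["y"]}},
]

property_coverage = coverage(sample_items, "claims")
language_coverage = coverage(sample_items, "labels")

for name, pct in sorted(property_coverage.items()):
    print(f"property {name}: {pct:.1f}% of items")
for name, pct in sorted(language_coverage.items()):
    print(f"language {name}: {pct:.1f}% of items")
```

The same counting approach could in principle be restricted to a class of items (for example, only items about people) to estimate whether a particular use, such as an infobox or timeline, has enough data to rely on.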
In any such analysis, it's important to remember that specific Wikidata platform capabilities (e.g. unit support) may act as catalysts for larger adoption/use.
We should also compare the results with existing datasets such as DBpedia.
We should take a first rough cut at this in March and aim to provide a public report in April.