Wikidata is transcluded in other Wikimedia projects in a wide variety of ways. On Wikipedia, this ranges from populating infobox values such as an individual's date of birth, date of death, or nationality (highly visible to readers and important to get correct; see BLP), to providing descriptions that show up only on mobile, to auto-generating metadata tables (e.g., a Library of Congress identifier) that are displayed at the end of the article on desktop, to simply indicating that Wikidata contains coordinates for a given article even if that article is not using them.
The goal of this task is to categorize the different ways in which Wikidata is transcluded within Wikipedia and to begin identifying how to measure the prevalence of each type of transclusion, adding nuance to the counts that can be derived from the wbc_entity_usage MediaWiki table. If this initial work goes well, I intend to expand it to Commons and the other projects as well.
- Develop taxonomy of Wikidata transclusion in Wikipedia: https://meta.wikimedia.org/wiki/Research:External_Reuse_of_Wikimedia_Content/Wikidata_Transclusion/Examples
- For each type of transclusion, develop a method for measuring the prevalence of that transclusion
- Compare metrics based on wbc_entity_usage table to metrics that split out the different types of transclusion: https://meta.wikimedia.org/wiki/Research:External_Reuse_of_Wikimedia_Content/Wikidata_Transclusion/Examples/Article_Sample
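
As a rough illustration of the comparison in the last step, the sketch below contrasts a single "pages with any Wikidata usage" count with counts split by usage aspect. It assumes rows of (eu_page_id, eu_entity_id, eu_aspect) have already been pulled from wbc_entity_usage, and that aspect codes follow the single-letter scheme I understand the Wikibase client to use (S, L, D, T, C, O, X, optionally followed by a modifier such as `.en` or `.P569`). The function names and sample rows are hypothetical, and aspects are only a coarse proxy for the transclusion types in the taxonomy.

```python
from collections import defaultdict

# Assumed mapping of wbc_entity_usage aspect codes to readable names.
ASPECT_NAMES = {
    "S": "sitelinks",
    "L": "label",
    "D": "description",
    "T": "title",
    "C": "statements",
    "O": "other",
    "X": "all",
}


def aspect_type(eu_aspect):
    """Strip any modifier (e.g. 'C.P569' -> 'C') and map the code to a name."""
    code = eu_aspect.split(".", 1)[0]
    return ASPECT_NAMES.get(code, "unknown")


def pages_with_any_usage(rows):
    """Undifferentiated metric: distinct pages with at least one usage row."""
    return len({page_id for page_id, _entity_id, _aspect in rows})


def pages_per_aspect(rows):
    """Split metric: distinct pages per aspect type."""
    pages = defaultdict(set)
    for page_id, _entity_id, aspect in rows:
        pages[aspect_type(aspect)].add(page_id)
    return {name: len(ids) for name, ids in pages.items()}


# Hypothetical sample rows: (eu_page_id, eu_entity_id, eu_aspect).
sample_rows = [
    (1, "Q42", "S"),
    (1, "Q42", "C.P569"),
    (1, "Q42", "D.en"),
    (2, "Q64", "S"),
    (3, "Q90", "S"),
    (3, "Q90", "L.en"),
]

total = pages_with_any_usage(sample_rows)
by_aspect = pages_per_aspect(sample_rows)
print(total, by_aspect)
```

In this toy sample every page counts toward the undifferentiated total, but only one page uses statements, which is the kind of gap the split metrics are meant to surface.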
T247099 (SQL definition for wikidata metrics for tuning session)
T246709 (What proportion of a Wikipedia article's edit history might reasonably be changes via Wikidata transclusion?)
Wikidata metrics from FY20 Q2 Tuning Session