Taxonomy of re-use and current knowledge of the effect on traffic to Wikimedia
Closed, Resolved (Public)


There are many different ways in which Wikimedia content is re-used outside of Wikimedia sites. This task will seek to identify these different ways, categorize them, and identify what we currently know about how each type affects traffic to Wikimedia.

Event Timeline

A high-level summary is below. The doc has more examples and collected research, which I am working on moving to Meta so that it will be accessible. Categories in the taxonomy:

  • Mirrors / Portals / Offline Access
    • Wikimedia content that can be viewed in full outside of Wikimedia. This ranges from laudable projects that aim to make Wikipedia accessible in areas without good internet access, to alternative interfaces for Wikipedia content that are arguably an improvement (though usually with added advertisements), to malicious bulk copying of content without links or attribution (piracy).
  • Positive Intertwining (Linked Open Data)
  • Direct Search
    • This covers instances in which outside services provide a direct search into Wikimedia. This differs from general web search (e.g., Google Search) because the service indexes only Wikipedia and often provides unclear referral information.
  • Snippets
    • These are examples where snippets of Wikimedia content are algorithmically evaluated against other sources and then surfaced on platforms outside of Wikimedia projects, with attribution and links back to Wikimedia where required. These uses are generally made in good faith, but their long-term impact on Wikimedia is unclear and the details vary greatly.
  • Automatic Fact-checking
    • These are instances where platforms automatically insert links back to Wikipedia on their sites to provide context about sources (e.g., BBC, RT) or about problematic content such as conspiracy theories. This is similar to snippets, but the context is very specific and Wikipedia is generally the only source considered.
  • Human-generated References / Reuse
    • These are organic links to Wikipedia, generated by users on external platforms, that can help surface Wikimedia content to readers on the web.