Characterize the accessibility of sources cited in Wikipedia
Closed, Resolved · Public

Description

Using data from Unpaywall and the citation with identifiers dataset (T185065):

  • Analyse accessibility of sources (primarily scholarly articles) across languages and topics
  • Quantify the proportion of sources that have open access copies available, and identify those sources
  • (stretch) release a dataset to facilitate volunteer contributions towards complementing DOI citations with accessible versions of the same papers
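As a rough illustration of the second point, the proportion could be computed per language along the following lines. This is only a minimal sketch under stated assumptions: the `(language, doi)` pairs standing in for the citation dataset, the `unpaywall` dictionary mapping a DOI to an open access flag (modelled on Unpaywall's `is_oa` field), the function name and the sample DOIs are all hypothetical, not the layout of the actual datasets.

```python
from collections import defaultdict


def oa_share_by_language(citations, unpaywall):
    """Return {language: share of cited DOIs that have an open access copy}.

    citations: iterable of (language, doi) pairs (hypothetical layout).
    unpaywall: dict mapping a lower-cased DOI to True/False (hypothetical
               stand-in for an Unpaywall snapshot's open access flag).
    DOIs missing from `unpaywall` are skipped rather than counted as closed.
    """
    totals = defaultdict(int)
    open_counts = defaultdict(int)
    for language, doi in citations:
        key = doi.strip().lower()  # DOIs are case-insensitive
        if key not in unpaywall:
            continue
        totals[language] += 1
        if unpaywall[key]:
            open_counts[language] += 1
    return {lang: open_counts[lang] / totals[lang] for lang in totals}


# Made-up sample data, for illustration only.
citations = [
    ("en", "10.1000/a"),
    ("en", "10.1000/B"),
    ("de", "10.1000/a"),
    ("de", "10.1000/c"),  # not present in the snapshot below, so skipped
]
unpaywall = {"10.1000/a": True, "10.1000/b": False}

shares = oa_share_by_language(citations, unpaywall)
print(shares)
```

Note that this counts every citation, so a paper cited many times in one language weighs more than a paper cited once; deduplicating DOIs per language first would give a per-paper share instead.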

Event Timeline

DarTar renamed this task from Understanding Citation Accessibility to Characterize the accessibility of sources cited in Wikipedia. Apr 18 2018, 10:20 PM
DarTar reassigned this task from DarTar to Miriam.
DarTar lowered the priority of this task from High to Medium.
DarTar edited projects, added Research; removed Epic, Research-Programs.
DarTar updated the task description.
DarTar removed a subscriber: Miriam.

Accessibility has multiple dimensions.

Those that are most relevant in this context are probably

  • language (P407)
  • online vs. offline (perhaps via P953 in conjunction with things like P1065 and P2960)
  • paywalls (kinda covered by P953 as well)
  • licensing (P275)

and perhaps file formats and other kinds of metadata.

Hi Daniel

Agreed. This first stab at the question will only work with what's available from the Unpaywall dataset, combined with topic modeling.