Specify new task for linking articles as a structured task (Q1)
Closed, Resolved, Public

Description

We want to develop another structured task around adding links to articles. Instead of adding outgoing links to a selected article (as we did in the first add-a-link structured task), one idea is to add incoming links to a selected article. This is a more challenging task, since these links have to be added to other pages. At the same time, the value of those links might be higher:

  • The {{orphan}} template tracks articles that have no incoming links
  • A recent paper by Langrock & González-Bailón showed that campaigns are very successful at improving the content of selected articles, but less successful at adding incoming links that would mitigate known structural biases and increase the articles' visibility
  • Brainstorm different options for linking tasks
  • Specify the task around linking orphan articles

Event Timeline

Update week 2021-08-23:

  • Formulated the motivation for the task and summarized the relevant literature in this Google Doc
  • Scoped the task more clearly and sketched several approaches with varying degrees of difficulty. The most straightforward:
    • Identify articles that require inlinks. First approach: use orphaned articles, i.e. articles without incoming links
    • Identify candidate articles from which to link to the orphan article. First approach: use existing links from other language versions
    • Identify text regions in the candidate article where the link could be added. First approach: identify the corresponding section in the language version where the link exists (potentially using the section-translation tool)
  • Defined next steps for an exploratory analysis to investigate whether the suggested approach could work
    • Generate set of orphan articles in a given language in need of links
    • Generate dataset with the link network of all Wikipedia language versions as a multilayer network
    • Generate evaluation dataset of new links that were added in a given month
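
The first two steps of the most straightforward approach above can be sketched in a few lines of Python. This is a minimal sketch on toy data, not the project's actual pipeline. It assumes that articles are keyed by a language-independent identifier (for example a Wikidata item ID) so that links can be compared across language versions; the names `articles`, `links`, `find_orphans`, and `candidate_sources` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data. Each language is one layer of the multilayer network:
# a set of articles (language-independent IDs) and directed (source, target) links.
articles = {
    "en": {"Q1", "Q2", "Q3"},
    "de": {"Q1", "Q2", "Q3"},
}
links = {
    "en": {("Q1", "Q2")},
    "de": {("Q1", "Q2"), ("Q2", "Q3"), ("Q1", "Q3")},
}


def find_orphans(lang):
    """Return the articles in `lang` that have no incoming links."""
    # Self-links are ignored: they do not make an article reachable from elsewhere.
    linked = {target for source, target in links[lang] if source != target}
    return articles[lang] - linked


def candidate_sources(lang, orphan):
    """Map each candidate source article to the languages where it links to `orphan`.

    A candidate must already exist in `lang`, otherwise no link can be added there.
    """
    candidates = defaultdict(set)
    for other_lang, edges in links.items():
        if other_lang == lang:
            continue
        for source, target in edges:
            if target == orphan and source != orphan and source in articles[lang]:
                candidates[source].add(other_lang)
    return dict(candidates)


for orphan in sorted(find_orphans("en")):
    print(orphan, candidate_sources("en", orphan))
```

In this toy example `find_orphans("en")` yields Q1 and Q3, and `candidate_sources("en", "Q3")` proposes Q1 and Q2 as source articles because both link to Q3 in the `de` layer. Q1 has no candidates, since no other language links to it either.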

Update week 2021-09-06:

Update week 2021-10-04:

  • started work on exploratory analysis of orphaned articles across languages
  • generated a dataset of all links of all articles in all Wikipedias (57M articles and 3.2B links)
    • this will lay the foundation to understand how common the problem of orphan articles is (how many articles have no incoming links across languages; of those, how many have no incoming links in all languages)
    • we can assess to what degree we can "translate" links that already exist in another language version into recommendations for incoming links to an orphan article in a specific language
  • discussed with Akhil the potential of a collaboration around this project
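
The two orphan counts mentioned in this update can be illustrated on toy data as follows. This is only a sketch under stated assumptions: articles are identified by a language-independent ID, "no incoming links in all languages" is read as "orphan in every language version in which the article exists", and the function name `orphan_stats` and the example data are hypothetical.

```python
def orphan_stats(articles, links):
    """Return (orphans per language, articles that are orphans in every language they exist in)."""
    orphans = {}
    for lang, items in articles.items():
        # Targets of at least one link from a different article in this language.
        linked = {t for s, t in links.get(lang, set()) if s != t}
        orphans[lang] = items - linked

    # Articles that are orphans in at least one language version.
    orphan_somewhere = set().union(*orphans.values())

    # Of those, the ones with no incoming links in any version where they exist.
    orphan_everywhere = {
        item
        for item in orphan_somewhere
        if all(item in orphans[lang] for lang in articles if item in articles[lang])
    }
    return orphans, orphan_everywhere


# Hypothetical toy data: three language versions, three articles.
toy_articles = {"en": {"A", "B", "C"}, "de": {"A", "B", "C"}, "fr": {"A", "C"}}
toy_links = {"en": {("A", "B")}, "de": {("A", "B"), ("B", "C")}, "fr": set()}

per_lang, everywhere = orphan_stats(toy_articles, toy_links)
for lang in sorted(per_lang):
    print(lang, sorted(per_lang[lang]))
print("orphan in all versions:", sorted(everywhere))
```

Here C is an orphan in `en` and `fr` but has an incoming link in `de`, so it is a case where an existing link could be "translated". A is an orphan in every version, so no link can be borrowed from another language.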