
Start formal collaboration on understanding the use of maintenance templates
Open, Needs Triage, Public

Description

We are starting a new research project on understanding the use of maintenance templates as part of a formal collaboration (see project brief, internal only).

We are planning to work on the following during Q2:

  • create meta page
  • set up formal collaboration
  • onboard collaborators
  • collect relevant templates (see data and code)
  • collect dataset of usage of templates
  • (stretch) exploratory analysis

Event Timeline

weekly update:

  • I put together a notebook to collect relevant cleanup templates across Wikipedia (see data and code)
    • this starts from the cleanup templates in English Wikipedia listed at Wikipedia:Template_index/Cleanup, plus the templates contained in Category:Cleanup_templates. This yields ~500 distinct templates.
    • we then get the corresponding templates in other Wikipedia language versions using the langlinks API. This yields ~8K templates across all Wikipedias.
    • we also add Wikidata QIDs (to match templates across languages) and all redirect titles (in order to capture usage of aliases in wikitext).
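
For illustration, the lookup described above could be sketched as follows. This is not the notebook's code (which is only linked, not shown here); it is a minimal sketch assuming the MediaWiki Action API with `formatversion=2`, and the function names `build_query_params` and `parse_pages` are hypothetical. It only builds the request parameters and parses a response; sending the request and following API continuation are left out.

```python
def build_query_params(titles):
    """Parameters for one Action API query on English Wikipedia.

    Asks for language links, redirects pointing at each template,
    and the Wikidata item id (page prop 'wikibase_item').
    """
    return {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "langlinks|redirects|pageprops",
        "titles": "|".join(titles),
        "lllimit": "max",
        "rdlimit": "max",
        "ppprop": "wikibase_item",
    }


def parse_pages(response, source_lang="en"):
    """Flatten a formatversion=2 response into one row per (language, template)."""
    rows = []
    for page in response.get("query", {}).get("pages", []):
        if page.get("missing"):
            continue  # title does not exist on the source wiki
        qid = page.get("pageprops", {}).get("wikibase_item")
        rows.append({
            "lang": source_lang,
            "title": page["title"],
            "qid": qid,
            "redirects": [r["title"] for r in page.get("redirects", [])],
        })
        # Templates in other languages share the source template's QID.
        # Their redirects are not part of this response (assumption: they
        # would need a separate query against each language's wiki).
        for link in page.get("langlinks", []):
            rows.append({
                "lang": link["lang"],
                "title": link["title"],
                "qid": qid,
                "redirects": [],
            })
    return rows
```

The parameters would be sent to `https://en.wikipedia.org/w/api.php` in batches of titles, and each parsed response appended to the template table.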

weekly update:

  • shared resources with collaborators
  • discussing first steps

weekly update:

  • no update this week
  • will have coordination meeting with collaborators next week

weekly update:

  • scoped the project with collaborators; they will start drafting a meta page.

weekly update:

  • reached out to Legal for MOU/NDA
  • started technical onboarding (e.g. creating accounts on Phabricator, Wikitech, etc.)

weekly update:

  • collaborators can now access stat-machines
  • only blocker is Kerberos access, which is needed to use Hive tables in Spark (T410389: Request kerberos identity for AnkitaM, resolved)
  • next step is to start collecting the dataset of templates being added/removed

weekly update:

  • starting data collection of revisions where maintenance templates are added or removed
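
One way to detect such revisions is to compare the set of tracked templates in a revision's wikitext with the set in its parent revision. The sketch below is an illustration under stated assumptions, not the project's pipeline: it uses a simple regular expression instead of a full wikitext parser, it records only presence (not how many times a template is used), and the names `templates_in`, `template_changes`, `tracked`, and `aliases` are hypothetical.

```python
import re

# Matches the name part of a template call such as {{Name}} or {{Name|...}}.
TEMPLATE_RE = re.compile(r"\{\{\s*([^{}|#<>\[\]]+?)\s*(?:\||\}\})")


def normalize(name):
    """Normalize a template name: underscores to spaces, drop an explicit
    'Template:' prefix, uppercase the first letter."""
    name = name.replace("_", " ").strip()
    if name.lower().startswith("template:"):
        name = name[len("template:"):].strip()
    return name[:1].upper() + name[1:]


def templates_in(wikitext, aliases):
    """Set of canonical template names used in the wikitext.

    `aliases` maps redirect titles to their canonical template name.
    """
    found = set()
    for match in TEMPLATE_RE.finditer(wikitext):
        name = normalize(match.group(1))
        found.add(aliases.get(name, name))
    return found


def template_changes(parent_text, child_text, tracked, aliases):
    """Tracked templates added or removed between a parent revision and its child."""
    before = templates_in(parent_text, aliases) & tracked
    after = templates_in(child_text, aliases) & tracked
    return {"added": sorted(after - before), "removed": sorted(before - after)}
```

Resolving aliases before comparing means that swapping a redirect for its canonical template name is not counted as a removal followed by an addition.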