Measure sub-reference additions by individual edit
Closed, Declined, Public

Description

One of our project metrics is:

The number of human-performed edits adding at least one sub-reference reaches 2000 on dewiki. (Baseline: 0)

We'll get an estimate of this number by simply counting the sub-reference tags, in T409944: [priority] Scraper: count number of sub-references (on dewiki). But we also care about the number of edits and how these articles evolved, so we'll do some additional one-off analysis, as follows:

There was some related work in T400013: VisualEditor deletes list-defined references if there's a reference containing an ISBN and magic linking is enabled and T404421: [Bug] List-defined refs only used inside of a template are removed by visual editor, which introduced a repository, https://gitlab.com/wmde/technical-wishes/ref-damage, that analyzes revisions looking for a particular type of edit. We will take the same basic approach, but with a different way of targeting articles and revisions, and with much simpler detection logic.
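As a rough illustration of what the simpler detection logic could look like, here is a minimal sketch that walks a page's revisions from oldest to newest and flags each human edit that increases the number of sub-reference tags. Everything in it is an assumption rather than the actual implementation: the `Revision` type, the `is_bot` flag, and the function names are hypothetical, and the regex assumes that a sub-reference is written as a `<ref>` tag carrying a `details` attribute.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List

# Assumption: a sub-reference is a <ref> tag with a `details` attribute.
# `<ref\b` will not match `<references ...>`, and `[^>]*` keeps the match
# inside a single tag.
SUBREF_TAG = re.compile(r"<ref\b[^>]*\bdetails\s*=", re.IGNORECASE)


@dataclass
class Revision:
    """Hypothetical minimal view of one revision of a page."""
    rev_id: int
    wikitext: str
    is_bot: bool = False  # assumption: bot edits are already flagged upstream


def count_subrefs(wikitext: str) -> int:
    """Count sub-reference tags in one revision's wikitext."""
    return len(SUBREF_TAG.findall(wikitext))


def edits_adding_subrefs(revisions: Iterable[Revision]) -> List[int]:
    """Return rev_ids of human edits that increased the sub-reference count.

    `revisions` must be ordered oldest first. The count before the first
    revision is taken to be zero.
    """
    added = []
    previous = 0
    for rev in revisions:
        current = count_subrefs(rev.wikitext)
        if current > previous and not rev.is_bot:
            added.append(rev.rev_id)
        previous = current
    return added
```

Comparing only the tag count keeps the logic simple, at the cost of missing an edit that removes one sub-reference and adds another in the same revision.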

Limitations

  • Won't find pages which once had sub-references but currently do not. Potentially this can be worked around or, better yet, the categorymembers API could provide a "historical member" option. The needed historical data seems to be available in the database.
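To make the limitation concrete: if pages are targeted through the categorymembers API, only current members of the category are returned. The sketch below shows one way such a listing could be paged through, assuming the target pages are collected in a tracking category. The function names and the injected `fetch` callable (anything that sends the parameters to a wiki's `api.php` and returns the decoded JSON) are hypothetical, and the category title is left to the caller because the task does not name it.

```python
from typing import Callable, Dict, Iterator, Optional


def category_member_params(category: str,
                           cont: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Build Action API parameters for one page of `list=categorymembers`."""
    params = {
        "action": "query",
        "format": "json",
        "list": "categorymembers",
        "cmtitle": category,
        "cmlimit": "max",
    }
    # Continuation values from the previous response are merged back in.
    params.update(cont or {})
    return params


def iter_category_members(category: str,
                          fetch: Callable[[Dict[str, str]], dict]) -> Iterator[str]:
    """Yield titles of pages that are currently in `category`.

    Pages that were removed from the category in the past never appear
    here, which is the limitation described above.
    """
    cont = None
    while True:
        response = fetch(category_member_params(category, cont))
        for member in response["query"]["categorymembers"]:
            yield member["title"]
        cont = response.get("continue")
        if not cont:
            return
```

Passing `fetch` in from outside keeps the paging logic independent of any particular HTTP client and easy to exercise with canned responses.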

Implementation

New repository: https://gitlab.com/wmde/technical-wishes/scrape-revision-history

Event Timeline

We don't need it for this year, since the number of pages is much higher than our target edit count.