
Enhance mediawiki-history page reconstruction with best historical information possible
Open, Low, Public


This task is about providing data on delete and restore patterns that is as correct and precise as possible.

A first patch revising restore semantics has been made (see T179690). In that patch, historical data is set to null before any restore in a page's history.

The idea here is to enhance what exists:

  • Add historical threads for pages with restores (revisions in such a page might have been merged in from different pages; list those, with correct times of overlap)
  • Provide the correct historical title when available (single historical thread), or a list of potential historical titles
  • Join archived revisions to pages by title
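
To make the last two points more concrete, here is a minimal Python sketch of a title-based join. It is only an illustration under stated assumptions, not the mediawiki-history implementation: the record types `TitleInterval` and `ArchivedRevision`, their fields, and the functions `candidate_pages` and `join_by_title` are all hypothetical, and timestamps are assumed to be plain integers. An archived revision resolves to a page only when exactly one page held that title at the revision's timestamp; otherwise the list of candidates is kept.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class TitleInterval:
    """Hypothetical record: page `page_id` held `title` during [start, end)."""
    page_id: int
    title: str
    start: int            # inclusive, e.g. Unix seconds
    end: Optional[int]    # exclusive; None means the title is still current


@dataclass(frozen=True)
class ArchivedRevision:
    """Hypothetical record for a revision read from the archive."""
    rev_id: int
    title: str
    timestamp: int


def candidate_pages(rev: ArchivedRevision,
                    intervals: List[TitleInterval]) -> List[int]:
    """Return sorted ids of pages whose title matched rev.title at rev.timestamp."""
    return sorted({
        iv.page_id
        for iv in intervals
        if iv.title == rev.title
        and iv.start <= rev.timestamp
        and (iv.end is None or rev.timestamp < iv.end)
    })


def join_by_title(
    revs: List[ArchivedRevision],
    intervals: List[TitleInterval],
) -> Dict[int, Tuple[Optional[int], List[int]]]:
    """Map rev_id -> (resolved page_id or None, list of candidate page ids)."""
    result = {}
    for rev in revs:
        cands = candidate_pages(rev, intervals)
        # Resolve only when the historical thread is unambiguous.
        resolved = cands[0] if len(cands) == 1 else None
        result[rev.rev_id] = (resolved, cands)
    return result
```

Keeping the candidate list alongside the resolved id means ambiguous cases are reported rather than silently dropped or guessed.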

Some examples of complex flows of events are described here and here

Event Timeline

JAllemandou renamed this task from Fix mediawiki-history page reconstruction bug (restores final) to Enhance mediawiki-history page reconstruction with best historical information possible. Nov 23 2017, 1:07 PM

For complex flows of events that we can't reconcile automatically with a reasonably high degree of certainty, we had an idea:

Describe the problem in a way that's easy to illustrate to the community of researchers, and let them piece together the puzzle. Capture the output of their work in a form that we can incorporate into the reconstruction algorithm, and use it to fix the ambiguous histories.

And for the rest of the task, see if there's any way we can be more certain about some of the less complex flows.

Milimetric moved this task from Backlog (Later) to Data Quality on the Analytics board.