rev_parent_id not used correctly with diff=prev or rvdiffto=prev for some revisions (Conversion script?)
Closed, Duplicate (Public)

Description

I have a script that is requesting diffs from the API. I've been running into some revisions where the diff that is returned is wrong and I can't figure out why. Let's examine an example revision:

https://en.wikipedia.org/wiki/?oldid=3110 (note that there is no link to a "prev" revision)

If I find this revision in the edit history of the page and click on the "prev" link to the left of it, the following link is loaded with the appropriate diff:

https://en.wikipedia.org/w/index.php?title=Hebrew_grammar&diff=3110&oldid=257167

But if I manually specify diff=prev via the UI or API, I don't get any diff at all.

https://en.wikipedia.org/w/index.php?title=Hebrew_grammar&diff=prev&oldid=3110
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&revids=3110&rvdiffto=prev
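For anyone who wants to reproduce this from a script, here is a minimal sketch of how the API request above could be built. The helper name `build_diff_query` is made up for illustration, and `format=json` is my addition (it is not in the URL above); the other parameters are taken straight from that URL.

```python
from urllib.parse import urlencode

# Endpoint as used in the report (https here; the report's API link used http).
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_diff_query(rev_id: int, diff_to: str = "prev") -> str:
    """Return the URL asking the API to diff rev_id against diff_to.

    Hypothetical helper: it only builds the URL and performs no network I/O.
    """
    params = {
        "action": "query",
        "prop": "revisions",
        "revids": rev_id,
        "rvdiffto": diff_to,
        "format": "json",  # assumption: added so a script gets machine-readable output
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_diff_query(3110))
```

Fetching that URL for revision 3110 is what comes back without a diff in my case.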

I originally suspected that rev_parent_id was missing from the revision table, but it seems like it is populated as expected. See http://quarry.wmflabs.org/query/3379 which reports a rev_parent_id of 257167. But note that the previous revision (257167) was performed by "Conversion script"; that might be relevant.

Event Timeline

Halfak raised the priority of this task from to Needs Triage.
Halfak updated the task description.
Halfak subscribed.

rev_parent_id isn't used at all for diffs. Actually, I'm not sure what it is used for, or what its exact semantics are when auto-merged edit conflicts are involved.

The source of this issue is that the two code paths pick the previous revision differently. The history page builds its diff links from the list of revisions it has already loaded, and that list is ordered by timestamp. The diffing code, on the other hand, resolves the "prev" pseudo-revision by ordering on revision ID. Most of the time both orderings give the same result, but for revisions that were imported later they don't.
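To make the mismatch concrete, here is a rough sketch of the two lookups. This is not MediaWiki's actual code: the function names, the timestamps, and revision 300000 are invented for illustration, and only the IDs 3110 and 257167 come from the report.

```python
from typing import List, NamedTuple, Optional


class Revision(NamedTuple):
    rev_id: int
    timestamp: str  # ISO 8601, so string order matches chronological order


# Hypothetical history for one page. The timestamps are made up; the point is
# that the imported revision 257167 is older than 3110 but has a higher ID.
REVISIONS = [
    Revision(257167, "2001-01-01T00:00:00Z"),
    Revision(3110, "2002-01-01T00:00:00Z"),
    Revision(300000, "2003-01-01T00:00:00Z"),
]


def prev_by_timestamp(revs: List[Revision], rev_id: int) -> Optional[int]:
    """History-page style: the entry just before rev_id in timestamp order."""
    ids = [r.rev_id for r in sorted(revs, key=lambda r: r.timestamp)]
    i = ids.index(rev_id)
    return ids[i - 1] if i > 0 else None


def prev_by_rev_id(revs: List[Revision], rev_id: int) -> Optional[int]:
    """diff=prev style: the largest revision ID below rev_id on the page."""
    lower = [r.rev_id for r in revs if r.rev_id < rev_id]
    return max(lower) if lower else None


if __name__ == "__main__":
    print(prev_by_timestamp(REVISIONS, 3110))  # 257167
    print(prev_by_rev_id(REVISIONS, 3110))     # None
```

With this data, the timestamp lookup finds 257167 as the predecessor of 3110, while the ID lookup finds nothing, which would be consistent with getting no diff at all.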