
Citation ordering seems non-deterministic
Closed, Resolved, Public

Description

It looks like the order in which citations are numbered is not deterministic and does not follow the order in which they appear in the page wikitext. So the first citation on the page might be numbered [39] instead of [1]. This is the case today @ http://parsoid.wmflabs.org/enwiki/Barack_Obama?oldid=614710126 -- it appears that citations are numbered in the order in which Parsoid's Cite implementation receives them, which in turn depends on how subpipelines fire, and is therefore non-deterministic.

Our citation implementation needs fixing to match wikitext order -- perhaps by sorting on the top-level dsr for top-level citations and on the template dsr for citations that are generated by templates.
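
To make the sorting idea concrete, here is a minimal sketch in Python (not Parsoid's actual code). The `Citation` class, its field names, and `number_citations` are hypothetical and exist only for illustration. It assumes each citation carries either its own top-level source offset or the offset of the template that generated it, and that citations from the same template keep their arrival order relative to each other.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Citation:
    # Hypothetical record; field names are assumptions, not Parsoid's data model.
    name: str
    top_level_dsr: Optional[int] = None  # source offset of a top-level citation
    template_dsr: Optional[int] = None   # source offset of the generating template


def source_position(c: Citation) -> int:
    """Return the wikitext offset used to order this citation."""
    # Template-generated citations are positioned by their template's offset.
    if c.template_dsr is not None:
        return c.template_dsr
    if c.top_level_dsr is not None:
        return c.top_level_dsr
    raise ValueError(f"citation {c.name!r} has no source position")


def number_citations(received: List[Citation]) -> Dict[str, int]:
    """Number citations by source position instead of arrival order."""
    # sorted() is stable, so citations sharing a template offset
    # keep the relative order in which they were received.
    ordered = sorted(received, key=source_position)
    return {c.name: i for i, c in enumerate(ordered, start=1)}


# Arrival order is scrambled, as it would be when subpipelines fire unpredictably.
received = [
    Citation("late", top_level_dsr=300),
    Citation("first", top_level_dsr=10),
    Citation("from_template", template_dsr=120),
]
numbers = number_citations(received)
```

With this input, the citation at offset 10 gets number 1 regardless of when it arrived.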

This should also fix the irritating rt-test variations we occasionally see (and which had baffled us till now) where semantic diffs are triggered because of cite numbering changes.


Version: unspecified
Severity: normal

Details

Reference
bz67237

Event Timeline

bzimport raised the priority of this task to Needs Triage (Nov 22 2014, 3:32 AM).
bzimport added a project: Parsoid.
bzimport set Reference to bz67237.

Moving the numbering to the DOM could help as well.

Everything is being done on the DOM right now, but there is no distinction between the DOM of a part of the page processed in a new pipeline and the top-level content DOM. The final generateRefs DOM pass has to be restructured to do different things on the top-level doc and on pieces of the doc being processed in other pipelines.

I am going to restructure this now, since I already added this distinction (top-level vs. not) as part of a recent token-stream-patcher commit. So far, it looks like Cite and Linter also might do different things on the DOM based on whether the DOM is the top-level DOM or not.
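
One way the top-level vs. nested distinction could play out for numbering is sketched below in Python. This is an illustration under assumptions, not the actual generateRefs pass: the `Node` class, the `is_top_level` flag, and the function body are all hypothetical. The idea is that a pass running on a nested-pipeline fragment leaves references unnumbered, and only the pass on the top-level DOM walks the assembled tree in document order and assigns numbers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical stand-in for a DOM node.
    name: str
    children: List["Node"] = field(default_factory=list)
    ref_number: Optional[int] = None


def generate_refs(root: Node, is_top_level: bool) -> Node:
    """Assign citation numbers only when running on the top-level DOM."""
    if not is_top_level:
        # Nested pipeline: numbering here would depend on firing order,
        # so leave the references untouched for the top-level pass.
        return root
    counter = 0
    stack = [root]
    while stack:
        node = stack.pop()
        if node.name == "ref":
            counter += 1
            node.ref_number = counter
        # Push children reversed so they are visited in document order.
        stack.extend(reversed(node.children))
    return root


# A fragment produced by a nested pipeline (e.g. a template expansion).
template_ref = Node("ref")
fragment = generate_refs(Node("span", [template_ref]), is_top_level=False)

# The top-level document, with the fragment embedded between two references.
first_ref, last_ref = Node("ref"), Node("ref")
doc = generate_refs(Node("body", [first_ref, fragment, last_ref]), is_top_level=True)
```

Because numbers are assigned in a single walk over the final tree, the result no longer depends on when each fragment finished processing.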

Change 143362 had a related patch set uploaded by Subramanya Sastry:
(Bug 67237): WIP HACK: Fix citation numbering issue

https://gerrit.wikimedia.org/r/143362

Change 143362 merged by jenkins-bot:
(Bug 67237): Fix citation numbering issue

https://gerrit.wikimedia.org/r/143362