In trunk, the query that dumpFrom() in Export.php runs (used for generating the stub history files) is:
SELECT * FROM page INNER JOIN revision ON ((page_id=rev_page)) WHERE page_id >= 1157 AND page_id < 1158 ORDER BY page_id ASC;
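The ORDER BY covers only page_id, so the revisions within a page are not explicitly ordered and the database may return them in any order. As a result, the revision order changes from one dump to another.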
Example:
In the el.wiktionary dumps, for page υγεία (page id 1157), revid 1432 (timestamp 2005-02-27T15:34:30Z) appears either first among the revisions listed in the stubs-meta-history file, because it has the earliest timestamp, or 4th, which is its position when the revisions are sorted by revid.
The smallest revid for that page is actually 1153, with timestamp 2005-02-27T15:34:45Z.
In fact, the order seems to vary arbitrarily depending on when the query is run:
elwiktionary-20100401-stub-meta-history.xml.gz -- revid 1432 is first
elwiktionary-20100505-stub-meta-history.xml.gz -- revid 1432 is 4th, 1153 is first
elwiktionary-20110123-stub-meta-history.xml.gz -- revid 1432 is first
We need to go through the code and make sure every such query has an explicit order for revisions.
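As a rough illustration of what an explicit order would look like, here is a minimal sketch that runs the query above against an in-memory SQLite database using Python's sqlite3 module. The cut-down table definitions, the rev_order() helper and the choice of secondary sort keys are assumptions made for the example, not the actual MediaWiki schema or the actual fix. Only the two revisions quoted in this report are inserted, with timestamps in the format shown in the dump.

```python
import sqlite3

# Minimal stand-ins for the page and revision tables; only the columns the
# query touches are modelled (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT);
    CREATE TABLE revision (
        rev_id INTEGER PRIMARY KEY,
        rev_page INTEGER,
        rev_timestamp TEXT
    );
""")
conn.execute("INSERT INTO page VALUES (1157, 'υγεία')")
# The two revisions quoted in this report.
conn.executemany(
    "INSERT INTO revision VALUES (?, ?, ?)",
    [
        (1432, 1157, "2005-02-27T15:34:30Z"),
        (1153, 1157, "2005-02-27T15:34:45Z"),
    ],
)

# Same join and page range as the query in this report.
BASE = (
    "SELECT rev_id FROM page INNER JOIN revision ON (page_id = rev_page) "
    "WHERE page_id >= 1157 AND page_id < 1158 "
)

def rev_order(order_by):
    """Hypothetical helper: rev_ids in the order given by an ORDER BY clause."""
    return [row[0] for row in conn.execute(BASE + "ORDER BY " + order_by)]

# Two candidate explicit orderings; either one is stable across runs.
by_rev_id = rev_order("page_id ASC, rev_id ASC")
by_timestamp = rev_order("page_id ASC, rev_timestamp ASC, rev_id ASC")
print(by_rev_id)
print(by_timestamp)
```

With these two rows, ordering by rev_id puts 1153 first, while ordering by rev_timestamp puts 1432 first, which matches the two orders seen in the dumps. Which key the dumps should use is a separate decision; the point is only that the secondary key has to be named in the query.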
Also, we need to find out why the bigger revid has the earlier timestamp, since in theory revids get assigned in increasing order as they are used.
Version: unspecified
Severity: normal