Unsorted Revisions in XML Dumps
Closed, Duplicate (Public)

Description

Hi, in the XML dumps you can easily spot revisions that are not sorted by timestamp or revision ID. Is this intended behavior?

From https://dumps.wikimedia.org/itwiki/20180601/itwiki-20180601-pages-meta-history3.xml-p1599926p1689740.7z the very first page starts with:

<page>
  <title>Mayapan</title>
  <ns>0</ns>
  <id>1599928</id>
  <revision>
    <id>77445723</id>
    <parentid>77194693</parentid>
    <timestamp>2015-12-29T04:10:34Z</timestamp>
    ...
  </revision>
  <revision>
    <id>84181325</id>
    <parentid>77850253</parentid>
    <timestamp>2016-11-07T15:59:47Z</timestamp>
    ...
  </revision>
  <revision>
    <id>14317000</id>
    <parentid>14054133</parentid>
    <timestamp>2008-02-23T12:26:50Z</timestamp>
    ...
  </revision>
  <revision>
    <id>14054038</id>
    <timestamp>2008-02-11T19:23:21Z</timestamp>
    ...
  </revision>
  <revision>
    <id>14054101</id>
    <parentid>14054038</parentid>
    <timestamp>2008-02-11T19:26:55Z</timestamp>
    ...
  </revision>
  <revision>
    <id>14054107</id>
    <parentid>14054101</parentid>
    <timestamp>2008-02-11T19:27:14Z</timestamp>
  ...

Keep up the great work 💪
Enrico Bonetti Vieno
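
For anyone consuming these dumps in the meantime, here is a minimal sketch of how the out-of-order revisions could be detected and re-sorted on the reader's side. It is only an illustration, not how the dump tooling works: `SAMPLE_PAGE`, `revisions` and `out_of_order` are hypothetical names, and the sample keeps just the ids and timestamps quoted above, with the elided fields dropped and the `<page>` element closed by hand. Real dump files also declare an XML namespace on the root element, so the tag names would need adjusting there.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: only the revision ids and timestamps from the excerpt,
# without the elided fields and without the XML namespace of a real dump.
SAMPLE_PAGE = """
<page>
  <title>Mayapan</title>
  <ns>0</ns>
  <id>1599928</id>
  <revision><id>77445723</id><timestamp>2015-12-29T04:10:34Z</timestamp></revision>
  <revision><id>84181325</id><timestamp>2016-11-07T15:59:47Z</timestamp></revision>
  <revision><id>14317000</id><timestamp>2008-02-23T12:26:50Z</timestamp></revision>
  <revision><id>14054038</id><timestamp>2008-02-11T19:23:21Z</timestamp></revision>
  <revision><id>14054101</id><timestamp>2008-02-11T19:26:55Z</timestamp></revision>
  <revision><id>14054107</id><timestamp>2008-02-11T19:27:14Z</timestamp></revision>
</page>
"""


def revisions(page):
    """Return (timestamp, revision id) pairs in document order."""
    return [
        (rev.findtext("timestamp"), int(rev.findtext("id")))
        for rev in page.iter("revision")
    ]


def out_of_order(revs):
    """Return the positions whose revision sorts before its predecessor."""
    return [i for i in range(1, len(revs)) if revs[i] < revs[i - 1]]


page = ET.fromstring(SAMPLE_PAGE.strip())
revs = revisions(page)

# Positions (0-based) where the order goes backwards.
print(out_of_order(revs))

# The timestamps are all UTC in the same ISO 8601 layout, so a plain
# string sort puts them in chronological order.
for timestamp, rev_id in sorted(revs):
    print(timestamp, rev_id)
```

Sorting on the (timestamp, id) pair uses the revision ID only as a tie-breaker for identical timestamps. For a full history file it would probably be better to stream the pages rather than load everything at once.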

Event Timeline

Yes; this is slowly being addressed in T29112.

Now I see, thanks. I had even searched for it.

Have a nice day, Enrico

JJMC89 renamed this task from xtaaaaaaaa to Unsorted Revisions in XML Dumps. Jul 1 2018, 1:27 AM
JJMC89 raised the priority of this task from High to Needs Triage.
JJMC89 updated the task description.
JJMC89 added a subscriber: Aklapper.