
Unsorted Revisions in XML Dumps
Closed, Duplicate · Public

Description

Hi, in the XML dumps you can easily spot revisions that are out of order within a page. Is this intended behavior?

From https://dumps.wikimedia.org/itwiki/20180601/itwiki-20180601-pages-meta-history3.xml-p1599926p1689740.7z the very first page starts with:

<page>
  <title>Mayapan</title>
  <ns>0</ns>
  <id>1599928</id>
  <revision>
    <id>77445723</id>
    <parentid>77194693</parentid>
    <timestamp>2015-12-29T04:10:34Z</timestamp>
    ...
  </revision>
  <revision>
    <id>84181325</id>
    <parentid>77850253</parentid>
    <timestamp>2016-11-07T15:59:47Z</timestamp>
    ...
  </revision>
  <revision>
    <id>14317000</id>
    <parentid>14054133</parentid>
    <timestamp>2008-02-23T12:26:50Z</timestamp>
    ...
  </revision>
  <revision>
    <id>14054038</id>
    <timestamp>2008-02-11T19:23:21Z</timestamp>
    ...
  </revision>
  <revision>
    <id>14054101</id>
    <parentid>14054038</parentid>
    <timestamp>2008-02-11T19:26:55Z</timestamp>
    ...
  </revision>
  <revision>
    <id>14054107</id>
    <parentid>14054101</parentid>
    <timestamp>2008-02-11T19:27:14Z</timestamp>
  ...
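
A check along these lines could help confirm how widespread the problem is. It is only a minimal sketch, not part of the dump tooling: it assumes the archive has already been decompressed to plain XML, that ascending revision `<id>` is the order of interest (timestamps could be compared instead), and the helper names `local_name` and `iter_unsorted_pages` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_unsorted_pages(source):
    """Yield (title, revision_ids) for every <page> whose revisions are
    not listed in ascending revision-id order.

    `source` is a filename or a binary file object holding dump XML.
    """
    rev_ids = []
    for _, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "revision":
            # Only look at direct children, so <contributor><id> is ignored.
            for child in elem:
                if local_name(child.tag) == "id":
                    rev_ids.append(int(child.text))
                    break
            elem.clear()  # drop revision text to keep memory use low
        elif name == "page":
            title = next(
                (c.text for c in elem if local_name(c.tag) == "title"), None
            )
            if rev_ids != sorted(rev_ids):
                yield title, rev_ids
            rev_ids = []
            elem.clear()
```

Calling `list(iter_unsorted_pages("dump.xml"))` on a decompressed file (the filename is a placeholder) would list each affected page title together with its revision ids in the order they appear.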

Keep up the great work 💪
Enrico Bonetti Vieno

Event Timeline

Yes; this is slowly being addressed in T29112.

Now I see, thanks ... I even searched for it.

Have a nice day, Enrico

Vvjjkkii renamed this task from Unsorted Revisions in XML Dumps to xtaaaaaaaa. (Jul 1 2018, 1:03 AM)
Vvjjkkii reopened this task as Open.
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
JJMC89 renamed this task from xtaaaaaaaa to Unsorted Revisions in XML Dumps. (Jul 1 2018, 1:27 AM)
JJMC89 raised the priority of this task from High to Needs Triage.
JJMC89 updated the task description.
JJMC89 added a subscriber: Aklapper.
ArielGlenn moved this task from Backlog to Done on the Dumps-Generation board. (Jul 2 2018, 12:24 PM)