Dumping of older revisions contained in xml files should not need db query
Open, Medium, Public

Description

Generating a dump of old English Wikipedia revisions takes longer than generating a dump of the same number of newer ones. The suspected cause is that the old revisions have a length mismatch (they weren't normalized when they went in), so their text gets pulled directly from the database instead of being reused from the existing XML files. Double-check that this is the reason and figure out a workaround.
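
To make the suspected behaviour concrete, here is a minimal Python sketch of a length-gated fallback. It is not the actual dump code: the function name `choose_revision_text`, its parameters, and the use of UTF-8 byte length as the comparison are all assumptions made for illustration.

```python
from typing import Callable, Optional, Tuple


def choose_revision_text(
    prefetched: Optional[str],
    expected_len: int,
    fetch_from_db: Callable[[], str],
) -> Tuple[str, str]:
    """Return (text, source), where source is "prefetch" or "database".

    Hypothetical helper: reuse the text found in an earlier XML dump only
    when its byte length matches the length recorded for the revision;
    otherwise fall back to the (slower) database fetch.
    """
    if prefetched is not None:
        # Assumption: the recorded length is a byte count of UTF-8 text.
        if len(prefetched.encode("utf-8")) == expected_len:
            return prefetched, "prefetch"
    # Prefetch miss or length mismatch: pay for a database round trip.
    return fetch_from_db(), "database"


if __name__ == "__main__":
    def slow_db_fetch() -> str:
        return "hello\n"

    # Lengths agree, so the XML copy is reused.
    print(choose_revision_text("hello", 5, slow_db_fetch))
    # Recorded length differs from the XML copy, so the database is queried.
    print(choose_revision_text("hello", 6, slow_db_fetch))
```

If the real check works along these lines, every old revision whose recorded length disagrees with its text in the XML file costs a database query, which would explain the slowdown. A workaround would then need either to normalize the text before comparing or to relax the check where that is safe.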

Event Timeline

ArielGlenn claimed this task.
ArielGlenn raised the priority of this task to Medium.
ArielGlenn updated the task description.
ArielGlenn added a project: Dumps-Generation.
ArielGlenn subscribed.

This will be addressed in part by a change in https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/556346/11/maintenance/includes/TextPassDumper.php at line 927 (applies to everything that's not wikitext).

This task has been assigned to the same task owner for more than two years. Resetting task assignee due to inactivity, to decrease task cookie-licking and to get a slightly more realistic overview of plans. Please feel free to assign this task to yourself again if you still realistically work or plan to work on this task - it would be welcome!

For tips on how to manage individual work in Phabricator (noisy notifications, lists of tasks, etc.), see https://phabricator.wikimedia.org/T228575#6237124 for available options.
(For the record, two emails were sent to assignee addresses before resetting assignees. See T228575 for more info and for potential feedback. Thanks!)