Importing revisions with the FileImporter results in incorrect revision size diff values
Closed, Resolved · Public · 5 Story Points

Description

Importing revisions with the FileImporter results in incorrect revision size diff values

This can be seen at https://test.wikimedia.beta.wmflabs.org/w/index.php?title=File:2013_Porsche_911_Carrera_4S_(991)_(9626546987).jpg&action=history

Note: Are we importing the revisions in the wrong order? Is there something we should be calling during the import that we are not? Look at the TODOs!
https://github.com/wikimedia/mediawiki-extensions-FileImporter/blob/master/src/Services/Importer.php#L204

Event Timeline

Restricted Application added a project: TCB-Team. · Feb 6 2018, 3:50 PM
Restricted Application added a subscriber: Aklapper.
Tobi_WMDE_SW set the point value for this task to 5. · Feb 13 2018, 4:28 PM
Lea_WMDE triaged this task as Normal priority. · Mar 6 2018, 4:11 PM
Lea_WMDE updated the task description.

It appears that when we import the revisions we don't link them with their parent revisions. As a result, the 'rev_parent_id' entry in the database is 0 for all imported revisions (but not for the revisions we create additionally). When the history class calculates the difference in bytes, it therefore treats each imported revision as if it had no parent, as it would the initial revision of the page, so the byte difference is the same as the revision's actual size.

https://www.mediawiki.org/wiki/Manual:Revision_table#rev_parent_id
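
To illustrate the effect described above, here is a minimal Python sketch of how a history view might derive the byte difference from 'rev_parent_id'. It is a simplified model, not MediaWiki's actual code: the `Revision` class and the `size_diffs` function are hypothetical names, and the rule that a missing parent counts as 0 bytes is an assumption based on the behaviour observed in this task.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    rev_id: int
    rev_len: int        # revision size in bytes
    rev_parent_id: int  # 0 means "no parent recorded"


def size_diffs(revisions):
    """Map rev_id -> byte difference against the parent revision.

    Assumption: a revision with no known parent is compared against 0 bytes.
    """
    len_by_id = {r.rev_id: r.rev_len for r in revisions}
    return {
        r.rev_id: r.rev_len - len_by_id.get(r.rev_parent_id, 0)
        for r in revisions
    }


# Imported revisions as described in this task: every parent id is 0.
broken = [Revision(1, 100, 0), Revision(2, 150, 0), Revision(3, 120, 0)]
# The same history with each revision linked to its predecessor.
linked = [Revision(1, 100, 0), Revision(2, 150, 1), Revision(3, 120, 2)]

print(size_diffs(broken))  # {1: 100, 2: 150, 3: 120}
print(size_diffs(linked))  # {1: 100, 2: 50, 3: -30}
```

With every parent id set to 0, each diff equals the full revision size, which matches what the history page linked in the description shows.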

Change 418944 had a related patch set uploaded (by Andrew-WMDE; owner: Andrew-WMDE):
[mediawiki/extensions/FileImporter@master] Import text revisions in the order of oldest to newest

https://gerrit.wikimedia.org/r/418944

Import the revisions from oldest to newest so that, during the import, the proper rev_parent_id can be determined for each revision in:
https://github.com/wikimedia/mediawiki/blob/efcd9f92e82a823b92e64486419cdfed6d34e40c/includes/import/ImportableOldRevisionImporter.php#L97
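
Why the order matters can be sketched in Python as follows. This is a simplified model rather than the importer's real code: `import_revisions` is a hypothetical function, and it assumes the parent is chosen as the most recent revision already stored for the page with an earlier or equal timestamp, which may differ in detail from what the linked importer does.

```python
def import_revisions(revisions):
    """Insert (timestamp, size) pairs one by one and return the stored rows.

    Assumption: the parent of a new row is the latest row that is already
    stored and has a timestamp <= the new row's timestamp; 0 if none exists.
    """
    stored = []
    for timestamp, size in revisions:
        earlier = [row for row in stored if row["timestamp"] <= timestamp]
        parent = max(
            earlier,
            key=lambda row: (row["timestamp"], row["rev_id"]),
            default=None,
        )
        stored.append({
            "rev_id": len(stored) + 1,
            "timestamp": timestamp,
            "size": size,
            "rev_parent_id": parent["rev_id"] if parent else 0,
        })
    return stored


# Made-up example history, listed oldest first.
history = [(20180101, 100), (20180102, 150), (20180103, 120)]

newest_first = import_revisions(reversed(history))
oldest_first = import_revisions(history)

print([row["rev_parent_id"] for row in newest_first])  # [0, 0, 0]
print([row["rev_parent_id"] for row in oldest_first])  # [0, 1, 2]
```

Under this model, importing newest first never finds an earlier revision to link to, so every parent id stays 0. Importing oldest first links each revision to its predecessor.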

WMDE-Fisch added a subscriber: WMDE-Fisch.

Nice, that patch fixes the problem it seems :-)

Change 418944 merged by jenkins-bot:
[mediawiki/extensions/FileImporter@master] Import text revisions in the order of oldest to newest

https://gerrit.wikimedia.org/r/418944

Tobi_WMDE_SW closed this task as Resolved. · Mar 20 2018, 4:11 PM