RevisionArchiveRecord incorrectly changes null ar_len to 0
Closed, Resolved, Public

Description

When a RevisionArchiveRecord is created from an archive table row, it incorrectly interprets a null value for ar_len as 0 rather than as "I need to calculate the size".

When the RevisionArchiveRecord is used for undeletion, this results in the rev_len of the restored row incorrectly being 0.
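The difference between "unknown" and "empty" described here can be sketched roughly as follows. This is an illustrative Python sketch, not MediaWiki's PHP code: the function names and the dict-shaped row are hypothetical, and measuring the size as the UTF-8 byte length of the text is an assumption. It only shows how coercing a null ar_len to an integer yields 0, and how keeping the null lets the size be computed on demand instead.

```python
from typing import Callable, Optional


def size_from_row_buggy(row: dict) -> int:
    """Mimics the faulty behaviour: a null length is coerced to 0."""
    # `None or 0` evaluates to 0, so "unknown" silently becomes "empty".
    return int(row.get("ar_len") or 0)


def size_from_row_fixed(row: dict) -> Optional[int]:
    """Keeps null as None so callers can tell 'unknown' from 'empty'."""
    value = row.get("ar_len")
    return None if value is None else int(value)


def resolve_size(row: dict, load_text: Callable[[], str]) -> int:
    """Returns the stored length, computing it from the text when unknown."""
    size = size_from_row_fixed(row)
    if size is None:
        # Assumption for this sketch: size is the UTF-8 byte length of the text.
        size = len(load_text().encode("utf-8"))
    return size


legacy_row = {"ar_len": None}  # hypothetical row archived without a length
print(size_from_row_buggy(legacy_row))                  # 0
print(resolve_size(legacy_row, lambda: "old substub"))  # 11
```

With the buggy variant, the 0 would be written to rev_len on undeletion; with the fixed variant, a genuinely stored 0 is still respected and only a null triggers the calculation.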

  • Fix RevisionArchiveRecord
  • Update maintenance/populateRevisionLength.php for MCR and to account for this bug (and possibly merge it with populateRevisionSha1.php). Instead of only updating rows with null length, it should also update rows with 0 length and sha1 != 'phoiac9h4m842xq45sp7s6u21eteeq1'.
  • Run the updated populateRevisionLength.php.
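
The row-selection rule proposed in the second step could be expressed as the predicate below. This is a minimal Python sketch under stated assumptions, not the actual maintenance script (which is PHP and works through database queries): the function name and its arguments are hypothetical, and the hash constant is copied from the description above, where it is presumably the sha1 stored for empty content.

```python
from typing import Optional

# Value quoted in the task description; assumed here to be the sha1
# recorded for a revision whose content is empty.
EMPTY_CONTENT_SHA1 = "phoiac9h4m842xq45sp7s6u21eteeq1"


def needs_length_fix(length: Optional[int], sha1: Optional[str]) -> bool:
    """Decides whether a row's length should be recalculated."""
    if length is None:
        # Original behaviour of the script: null lengths are always populated.
        return True
    # New case: a 0 length is suspicious unless the hash says the content
    # really is empty. In this sketch a missing sha1 counts as "not known
    # to be empty", so such rows are recalculated too.
    return length == 0 and sha1 != EMPTY_CONTENT_SHA1


print(needs_length_fix(None, "abc"))             # True: length never populated
print(needs_length_fix(0, "abc"))                # True: 0 length, non-empty hash
print(needs_length_fix(0, EMPTY_CONTENT_SHA1))   # False: genuinely empty
print(needs_length_fix(42, "abc"))               # False: length already set
```

Checking the hash avoids recomputing the length for every revision that really is empty, which would otherwise be selected on each run.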

Original report:

In the history of Wikipedia:Historical archive/Template:Substub, the revisions prior to 25 September 2005 as well as the moves by Graham87 are incorrectly shown as being empty. We need to fix the sizes (the rev_len fields) for those revisions. Also, to prevent this from ever happening again, we need to fix the ar_len fields for revisions deleted prior to MediaWiki 1.5, which now have ar_text_id and ar_rev_id fields per T36925 and T182678.

Event Timeline

IIRC it used to not even try to display the length in such cases, but it now displays it as empty. I can't think of any other examples of cases like this at the moment though.

Regarding the recent moves having 0 rev_len, they were most likely simply copied from rev_id 835081459 or 835038184 (both from 2005) that were originally affected by the bug. Most likely the bug is that a null ar_len or rev_len is somehow incorrectly getting converted to 0 on deletion and/or undeletion.

Anomie renamed this task from "Some revisions incorrectly shown as "empty"" to "RevisionArchiveRecord incorrectly changes null ar_len to 0". Apr 17 2018, 2:22 PM
Anomie updated the task description.
Anomie edited projects, added Multi-Content-Revisions; removed MediaWiki-Page-diffs.
Anomie added subscribers: Addshore, daniel.

I'll fix RevisionArchiveRecord.

Change 427171 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@master] Fix handling of ar_length and ar_sha1 in RevisionArchiveRecord.

https://gerrit.wikimedia.org/r/427171

daniel triaged this task as High priority. Apr 17 2018, 4:54 PM

Bumping to "high" for the RevisionArchiveRecord revision fix, to avoid incorrect revisions being created on undeletion.
Updating and running populateRevisionLength probably doesn't have high prio.

Change 427171 merged by jenkins-bot:
[mediawiki/core@master] Fix handling of ar_length and ar_sha1 in RevisionArchiveRecord.

https://gerrit.wikimedia.org/r/427171

We still need to update the populateRevisionLength.php script to fix the rev_len or ar_len field for revisions where the field is zero but is "supposed" to be a nonzero value.

Change 428291 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@master] Make populateRevisionLength fix rows with ar_len = 0.

https://gerrit.wikimedia.org/r/428291

Change 428391 had a related patch set uploaded (by Krinkle; owner: Daniel Kinzler):
[mediawiki/core@REL1_31] Fix handling of ar_length and ar_sha1 in RevisionArchiveRecord.

https://gerrit.wikimedia.org/r/428391

daniel lowered the priority of this task from High to Medium. Apr 23 2018, 6:37 PM

RevisionArchiveRecord bug is fixed, prio of the rest is not high.

Change 428291 merged by jenkins-bot:
[mediawiki/core@master] Make populateRevisionLength fix rows with ar_len = 0.

https://gerrit.wikimedia.org/r/428291

Change 428391 merged by jenkins-bot:
[mediawiki/core@REL1_31] Fix handling of ar_length and ar_sha1 in RevisionArchiveRecord.

https://gerrit.wikimedia.org/r/428391

Now we need to actually run the updated script on Wikimedia wikis. We should also cherry-pick the patch for populateRevisionLength.php to REL1_31, as we did for the other patch.

Mentioned in SAL (#wikimedia-operations) [2018-04-25T15:22:21Z] <anomie> Running populateRevisionLength.php on group 0 for T192189

Mentioned in SAL (#wikimedia-operations) [2018-04-26T14:26:07Z] <anomie> Running populateRevisionLength.php on group 1 for T192189

Mentioned in SAL (#wikimedia-operations) [2018-04-27T14:23:41Z] <anomie> Running populateRevisionLength.php on group 2 for T192189

Anomie assigned this task to daniel.
Anomie updated the task description.

Script has been run on all wikis, so I'm resolving this. Assigning to @daniel since he wrote the patches here.

Change 430080 had a related patch set uploaded (by Legoktm; owner: Daniel Kinzler):
[mediawiki/core@REL1_31] Make populateRevisionLength fix rows with ar_len = 0.

https://gerrit.wikimedia.org/r/430080

Change 430080 merged by jenkins-bot:
[mediawiki/core@REL1_31] Make populateRevisionLength fix rows with ar_len = 0.

https://gerrit.wikimedia.org/r/430080