
Errors in the revision table
Closed, ResolvedPublic

Description

query results

There are 654 revisions in the en.wp database where rev_page = 0. The most recent is revision 399262554, which was made at 20101128034211. I've attached a full listing of affected revisions for en.wp. I suspect this is a larger-scale problem affecting multiple wikis.
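For anyone wanting to run a similar check, here is a minimal sketch of the query. It uses Python's sqlite3 with a cut-down, in-memory stand-in for the revision table. Only rev_id, rev_page and rev_timestamp are modelled, the second sample row is invented for illustration, and the production database is MySQL rather than SQLite, so treat this as an approximation of the idea rather than the query actually used for the report.

```python
import sqlite3

# Simplified stand-in for the MediaWiki revision table (assumption: only
# the three columns needed for this check are modelled).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE revision ("
    " rev_id INTEGER PRIMARY KEY,"
    " rev_page INTEGER NOT NULL,"
    " rev_timestamp TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO revision VALUES (?, ?, ?)",
    [
        (399262554, 0, "20101128034211"),      # orphan named in the report
        (399262000, 12345, "20101128034000"),  # hypothetical healthy row
    ],
)


def orphaned_revisions(db):
    """Return (rev_id, rev_timestamp) for rows with rev_page = 0, newest first."""
    return db.execute(
        "SELECT rev_id, rev_timestamp FROM revision"
        " WHERE rev_page = 0 ORDER BY rev_id DESC"
    ).fetchall()


orphans = orphaned_revisions(conn)
print(len(orphans), orphans[0])
```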


Version: unspecified
Severity: major

Attached:

Details

Reference
bz26223


Event Timeline

bzimport raised the priority of this task to Low. (Nov 21 2014, 11:14 PM)
bzimport set Reference to bz26223.
bzimport added a subscriber: Unknown Object (MLST).

http://toolserver.org/~betacommand/reports/dberrors/
is a full report, excluding s7. A total of 101 projects are affected by this issue.

I had a look at these. I checked en wp carefully: all of the incidents that aren't pretty old, i.e. from rev 242099935 (some time in 2008) onward, which is most of them, are moves. The same turns out to be true for most of the revs that I spot-checked on the other projects.

So what happens with these moves? Two revisions for the same move are recorded in the log and the history; however, three make it into the revision table.

Here's a sample from en wp:

rev_id | rev_page | rev_text_id | rev_comment
398410443 | 11005908 | 399836293 | moved [[Wikipedia:Tutorial (Editing)/sandbox]] to [[Wikipedia:Tutorial/Editing/sandbox]]...
398410417 | 11005908 | 399836293 | moved [[Wikipedia:Tutorial (Editing)/sandbox]] to [[Wikipedia:Tutorial/Editing)/sandbox]]...

Those show up in the history, and they are the "good" ones, as they have a page id attached. The "bad" one is:

398410444 | 0 | 399843549 | moved [[Wikipedia:Tutorial (Editing)/sandbox]] to [[Wikipedia:Tutorial/Editing/sandbox]]...

I looked at a number of these and they all display the same characteristics:

The third rev is the bad one; it has the same timestamp as the previous one, and its text content is the redirect left behind by the move.

I.e., the revision length of the bad one above is 48 and its text content is #REDIRECT [[Wikipedia:Tutorial/Editing/sandbox]],
whereas the rev length of the other two revisions is 2806 and they contain the actual page content.
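The three traits above (rev_page of 0, same timestamp as the preceding revision, redirect text) can be turned into a rough filter. This is only a sketch: the Rev class and is_leaked_redirect() are hypothetical names, the timestamp value is a placeholder because it isn't quoted here, and the "good" revision's text is abbreviated.

```python
from dataclasses import dataclass


@dataclass
class Rev:
    rev_id: int
    rev_page: int
    timestamp: str  # placeholder values; the real ones are not quoted here
    text: str


def is_leaked_redirect(rev, previous):
    """True when rev shows all three traits of a 'bad' revision."""
    return (
        rev.rev_page == 0
        and rev.timestamp == previous.timestamp
        and rev.text.upper().startswith("#REDIRECT")
    )


good = Rev(398410443, 11005908, "20101122000000", "actual page content ...")
bad = Rev(
    398410444,
    0,
    "20101122000000",
    "#REDIRECT [[Wikipedia:Tutorial/Editing/sandbox]]",
)

print(is_leaked_redirect(bad, good))  # True
print(len(bad.text))                  # 48, matching the rev length above
```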

This move issue is an outstanding issue; that is, it is not due to the master/slave issue we had recently or any of that. That's clear from the timestamps, which in the above example well predate that outage. In case someone might think that the revision once had the page id and that the corruption occurred later, I checked the history dumps from July and Sept of last year for a couple of these revisions with earlier timestamps, and in each case the two good ones appeared in the file but not the bad one. That makes me pretty sure this is a failure at the time of the move, and probably still a bug in the code running now.

I hope that's enough information for someone who knows the innards of the move/delete stuff to hazard a guess at the problem.

There's no check in Title::moveToInternal() that Article::insertOn() really succeeded, so if it failed, the new revision would be created linking to a page with $newid = false, which would be converted to 0. Article::insertOn() fails if there's already a page with that title. But we just renamed the page, so there should be no page with that title, and there's no trace of anyone recreating it behind us.
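To make the missing check concrete, here is a small Python model of it. It is not MediaWiki code: insert_on() merely imitates the described behaviour of Article::insertOn() (return the new id, or False when the title already exists), the two add_redirect_* helpers are hypothetical, and the guarded variant only illustrates the kind of check that is absent, not any particular fix.

```python
pages = {}      # title -> page_id
revisions = []  # (rev_page, comment)


def insert_on(title):
    """Imitates Article::insertOn(): new page id, or False if the title exists."""
    if title in pages:
        return False
    pages[title] = len(pages) + 1
    return pages[title]


def add_redirect_unchecked(title, comment):
    """Mirrors the bug: the return value is used without being checked."""
    new_id = insert_on(title)
    revisions.append((int(new_id), comment))  # False silently becomes 0


def add_redirect_checked(title, comment):
    """Guarded variant: refuse to write a revision for a page that wasn't created."""
    new_id = insert_on(title)
    if new_id is False:
        raise RuntimeError(f"could not create page {title!r}")
    revisions.append((new_id, comment))


add_redirect_unchecked("Old title", "first move")
add_redirect_unchecked("Old title", "second move")  # title now exists
print(revisions)  # [(1, 'first move'), (0, 'second move')]
```

With add_redirect_checked() the second call would raise instead of writing a revision with rev_page 0.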

It is interesting that the page was first misrenamed, but I don't see any trace of what it did after moving to [[Wikipedia:Tutorial/Editing)/sandbox]]

This move was:
[[Wikipedia:Tutorial (Editing)/sandbox]]->[[Wikipedia:Tutorial/Editing)/sandbox]]
[[Wikipedia:Tutorial (Editing)/sandbox]]->[[Wikipedia:Tutorial/Editing/sandbox]]

How is this possible? Consider this: Fuhghettaboutit clicked to move the page, but noticed the typo immediately, stopped the load, fixed the ')' and resubmitted. As Special:Movepage doesn't create a transaction, at that point *both requests were running at the same time* on the master. The second request fetched the old Article values, so it moved the real article, not the redirect (maybe also because Title::moveto() does not call getArticleID() with GAID_FOR_UPDATE). But by the time it came to create the redirect to the new entry, the first request had already created one. The INSERT IGNORE fails, but the revision is nonetheless inserted, leaking that entry.
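The interleaving described above can be replayed deterministically. The sketch below is a model under stated assumptions, not MediaWiki's code: the schema is reduced to two tiny tables, SQLite's INSERT OR IGNORE stands in for MySQL's INSERT IGNORE, move_request() is a hypothetical three-step version of a page move, and generators are used to pause each "request" between steps so the order can be forced by hand.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT UNIQUE)")
db.execute("CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_comment TEXT)")
db.execute("INSERT INTO page (page_title) VALUES ('Old')")  # page_id 1


def move_request(old, new):
    """One move, split into steps; each yield is a point where another request may run."""
    # Step 1: read the page id without locking the row.
    page_id = db.execute(
        "SELECT page_id FROM page WHERE page_title = ?", (old,)
    ).fetchone()[0]
    yield "read"
    # Step 2: rename the page and record the move revision.
    db.execute("UPDATE page SET page_title = ? WHERE page_id = ?", (new, page_id))
    db.execute(
        "INSERT INTO revision (rev_page, rev_comment) VALUES (?, ?)",
        (page_id, f"moved {old} to {new}"),
    )
    yield "moved"
    # Step 3: create the redirect page; the result is not checked.
    cur = db.execute("INSERT OR IGNORE INTO page (page_title) VALUES (?)", (old,))
    redirect_id = cur.lastrowid if cur.rowcount == 1 else 0  # failure becomes 0
    db.execute(
        "INSERT INTO revision (rev_page, rev_comment) VALUES (?, ?)",
        (redirect_id, f"redirect {old} to {new}"),
    )
    yield "redirect"


first = move_request("Old", "New_typo")  # the submission with the typo
second = move_request("Old", "New")      # the corrected resubmission

# Both requests read the old values before either one writes.
next(first)
next(second)
next(first)   # first request renames the page
next(first)   # first request creates the redirect at 'Old'
next(second)  # second request renames the same page again
next(second)  # its redirect insert is ignored, yet a revision is still written

rev_pages = [row[0] for row in db.execute("SELECT rev_page FROM revision ORDER BY rev_id")]
print(rev_pages)  # [1, 2, 1, 0]
```

The final 0 is the leaked revision: two move revisions point at the real page, and the second request's redirect revision points at nothing.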

I have been able to reproduce it locally.

touch lock
(while [ -f lock ]; do :; done; wget /index.php/Special:MovePage/Bug-26223 --post-data="action=submit&wpOldTitle=Bug-26223A&wpNewTitle=Bug-26223_$RANDOM&wpMove=yes&wpEditToken=%2B\\" )&
(while [ -f lock ]; do :; done; wget /index.php/Special:MovePage/Bug-26223 --post-data="action=submit&wpOldTitle=Bug-26223A&wpNewTitle=Bug-26223_$RANDOM&wpMove=yes&wpEditToken=%2B\\" )&
rm lock

For the record, the above bug (cause of revision leaking) was fixed in r84459.

(In reply to comment #4)
> For the record, the above bug (cause of revision leaking) was fixed in r84459.

Closing