New revisions occasionally created with wrong text (but correct rev_len)
Closed, Resolved; Public

Description

It appears that recently some edits on the English Wikipedia (possibly elsewhere too?) have resulted in revisions that are blank or contain text from other, unrelated pages. Oddly, the byte count reported in the page history (based on the rev_len field), as well as the corresponding information in the recentchanges table, match the content that _should've_ been there.

For example, the revision http://en.wikipedia.org/w/index.php?title=Talk:Pikachu&oldid=227969847 is blank, even though the page history reports its length as 22,396 bytes. See also discussion at:

http://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#Bug:_revisions.2Fpagesizes.2Fpagerendering.2Fwikisource_not_matching_up.2C_resulting_in_blanking_or_page_replacements
http://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard/Incidents#SYSTEM_BUG:_rollback_replaced_a_page_by_an_irrelevant_page_instead_of_reverting
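
The symptom described above (a recorded length that agrees with the intended text but not with the text actually returned) can be checked mechanically. The following is a minimal sketch, not MediaWiki code: the `Revision` class and `find_length_mismatches` function are hypothetical names, and the sketch assumes rev_len holds the UTF-8 byte length of the text. A matching length does not prove the text is correct; a mismatch only flags a revision as suspect.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    # Hypothetical in-memory stand-in for a revision row plus its fetched text.
    rev_id: int
    rev_len: int  # byte length recorded when the edit was saved
    text: str     # text actually returned when the revision is loaded


def find_length_mismatches(revisions):
    """Return revisions whose loaded text does not have the recorded byte length."""
    return [r for r in revisions if len(r.text.encode("utf-8")) != r.rev_len]


revisions = [
    Revision(rev_id=1, rev_len=5, text="hello"),
    # Blank text despite a large recorded length, as in the Talk:Pikachu example.
    Revision(rev_id=2, rev_len=22396, text=""),
]

suspects = find_length_mismatches(revisions)
print([r.rev_id for r in suspects])
```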

I'm marking this as critical in case this is a symptom of more serious database corruption. Feel free to downgrade if it turns out to be something more benign.


Version: unspecified
Severity: critical
URL: http://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#Bug:_revisions.2Fpagesizes.2Fpagerendering.2Fwikisource_not_matching_up.2C_resulting_in_blanking_or_page_replacements

bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz14933.
Ilmari_Karonen created this task. Via Legacy, Jul 26 2008, 10:57 AM
bzimport added a comment. Via Conduit, Jul 26 2008, 11:11 AM

herd wrote:

from http://toolserver.org/~amidaniel/chanlogs/%23mediawiki/20080726.txt ->

[09:35:22] <Sadik_Khalid> Hi, when I tried to edit this page (http://ml.wikipedia.org/wiki/%E0%B4%B2%E0%B5%82%E0%B4%AF%E0%B4%BF_%E0%B4%AA%E0%B4%BE%E0%B4%B8%E0%B5%8D%E0%B4%9A%E0%B4%B0%E0%B5%8D%E2%80%8D) I am getting Egypt page (http://ml.wikipedia.org/wiki/Egypt)
[09:37:45] <Sadik_Khalid> History page don't mach with the content of the article

Ilmari_Karonen added a comment. Via Conduit, Jul 26 2008, 11:16 AM

Changing title since this occurs outside enwiki.

aaron added a comment. Via Conduit, Jul 26 2008, 11:20 AM

Possibly related to bug 14930

aaron added a comment. Via Conduit, Jul 26 2008, 11:28 AM

Also may be related to the recent ext. storage problems on one cluster (https://wikitech.leuksman.com/view/Server_admin_log)

aaron added a comment. Via Conduit, Jul 26 2008, 11:36 AM

OK, I can't find any relevant software changes. I'm almost sure this is due to the above issue. As of now, no *new* edits should be recorded wrongly.

bzimport added a comment. Via Conduit, Jul 26 2008, 5:15 PM

jeluf wrote:

This happened due to a master switch on the external storage cluster.

Apparently, the new master didn't have an up-to-date replica of the old master; a few records were missing. Because of this, the same text IDs were used twice. The edits that were saved on the old master but not replicated to the new master are lost; there is no way to get them back.

I have to close this bug as "FIXED" because there's no "CANTFIX".
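
The failure mode described in this comment can be illustrated with a toy model. This is only a sketch under stated assumptions: `BlobStore` and the `references` dictionary are hypothetical, and the sequential ID assignment is an assumption standing in for however the real storage cluster allocates text IDs.

```python
class BlobStore:
    """Toy text store that hands out sequential IDs (an assumption, not the real schema)."""

    def __init__(self, rows=None):
        self.rows = dict(rows or {})

    def insert(self, text):
        new_id = max(self.rows, default=0) + 1
        self.rows[new_id] = text
        return new_id


old_master = BlobStore()
references = {}  # revision name -> text ID, standing in for the core DB

for name in ("rev A", "rev B", "rev C"):
    references[name] = old_master.insert(f"text of {name}")

# The replica promoted to master is missing the most recent write (ID 3).
new_master = BlobStore({k: v for k, v in old_master.rows.items() if k < 3})

# The next edit is saved on the new master and receives the already-used ID.
references["rev D"] = new_master.insert("text of rev D")

print(references["rev C"], references["rev D"])
print(new_master.rows[references["rev C"]])
```

In this model, "rev C" and "rev D" end up pointing at the same ID, so loading "rev C" from the new master returns the text of an unrelated edit, while the original text of "rev C" exists only on the old master.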

tstarling added a comment. Via Conduit, Jul 26 2008, 8:30 PM

It wasn't fixed. srv104 still had an old copy of the configuration (because it's not reachable by ssh), and so it was still writing blobs to srv101. I've taken srv104 out of LVS rotation now. Maybe we'll be able to recover the edits from srv101 at some point, but it looks like it might be hanging on I/O now.

bzimport added a comment. Via Conduit, Jul 29 2008, 12:29 AM

jeroenvrp wrote:

I can confirm this on nl.wikipedia too.

See e.g. http://nl.wikipedia.org/w/index.php?title=Yang_Yaozu&diff=13286529&oldid=13139324

In recent changes, this revision is shown as having added 15 bytes, but the page is empty:
http://nl.wikipedia.org/w/index.php?title=Yang_Yaozu&action=edit&oldid=13286529

See also http://nl.wikipedia.org/w/index.php?title=Yang_Yaozu&action=history (2.159 bytes vs. 2.144 bytes).

bzimport added a comment. Via Conduit, Jul 29 2008, 12:30 AM

jeroenvrp wrote:

OK, I didn't see that it was fixed.

bzimport added a comment. Via Conduit, Jul 30 2008, 2:39 AM

herd wrote:

Unsure if related, but these do not show the revision #798283:

And yet, these do (sort of):

Although, per VP/T, Tim said:

It looks like the anomalous blank revisions are just cache pollution, and will
fix themselves when the cache expires in a week. The revisions that show the
wrong article are due to database corruption, and will need to be fixed manually.

bzimport added a comment. Via Conduit, Jul 31 2008, 7:25 PM

daniel wrote:

This edit is attributed to my bot
http://commons.wikimedia.org/w/index.php?title=Image%3AHyena_pup.jpg&diff=13062289&oldid=12189366

But it is pretty much impossible that the bot performed it (nothing remotely similar to CopyVio tagging is in the source code).

Might be due to the same server issue, although the nature of the glitch seems different from the ones reported.

Platonides added a comment. Via Conduit, Jul 31 2008, 9:00 PM

(In reply to comment #13)

But it is pretty much impossible that the bot performed it (nothing remotely
similar to CopyVio tagging is in the source code).

Might be due to the same server issue, although the nature of the glitch seems
different from the ones reported.

Also note that the length reported in the history is larger than the edit.
I understand this happens because the write goes to the false master and
then the real one reuses the same revision id.

We could probably find, among the deleted revisions from around the same time,
another one with that same content.

Another magic blanking:
http://es.wikipedia.org/w/index.php?title=Wikipedia:Vandalismo_en_curso&diff=19107113&oldid=19107017

tstarling added a comment. Via Conduit, Aug 1 2008, 1:19 AM

Should be fixed as of July 30, 03:00 UTC. Initially, ordinary edits processed by srv101/srv104 polluted the revision cache, which has an expiry of one week. This was identified and fixed (without me ever seeing this bug report) on July 27, by removing those servers from HTTP LVS. However, they continued to run the job queue, and refreshLinks jobs would have continued to pollute the revision cache. This was fixed on July 30, by firewalling srv101/104 from all core DB servers.
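
Why a polluted cache entry keeps serving a blank revision until it expires or is removed can be sketched with a small time-to-live cache. This is an illustrative model only: `RevisionCache`, its methods, and the fake clock are hypothetical and do not reflect the actual MediaWiki revision cache implementation; only the one-week expiry comes from the comment above.

```python
import time


class RevisionCache:
    """Toy TTL cache keyed by revision ID (hypothetical, for illustration)."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # rev_id -> (text, expires_at)

    def set(self, rev_id, text):
        self.entries[rev_id] = (text, self.clock() + self.ttl)

    def get(self, rev_id):
        entry = self.entries.get(rev_id)
        if entry is None or entry[1] <= self.clock():
            self.entries.pop(rev_id, None)
            return None  # miss: a real system would now read from storage
        return entry[0]

    def purge(self, rev_id):
        self.entries.pop(rev_id, None)


DAY = 24 * 3600
now = [0.0]  # fake clock so the example does not need to sleep
cache = RevisionCache(ttl_seconds=7 * DAY, clock=lambda: now[0])

cache.set(1, "")            # a misconfigured server caches blank text
now[0] = 3 * DAY
after_three_days = cache.get(1)   # still the blank text

cache.set(2, "")
now[0] = 11 * DAY
after_expiry = cache.get(2)       # entry expired, so this is a miss

cache.set(3, "")
cache.purge(3)
after_purge = cache.get(3)        # removed early, so this is a miss
```

In this model, as long as any writer can still store bad entries, new pollution keeps appearing even after earlier entries expire, which is consistent with the need to cut the affected servers off entirely.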

tstarling added a comment. Via Conduit, Aug 1 2008, 1:57 AM

I'm running a script to fix the revision cache. This will make the old revision view and old revision edit work properly. Any broken diffs will have to be fixed manually by appending &action=purge to the diff URL.
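
For anyone fixing many diffs by hand, the URL change can be scripted. A minimal sketch using only the Python standard library; the helper name `add_purge` is hypothetical, and replacing an existing `action` parameter (rather than adding a second one) is a design choice of this sketch, not something stated above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_purge(diff_url):
    """Return diff_url with action=purge set in its query string."""
    parts = urlsplit(diff_url)
    # Drop any existing action parameter so the result has exactly one.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "action"]
    query.append(("action", "purge"))
    return urlunsplit(parts._replace(query=urlencode(query)))


url = "http://nl.wikipedia.org/w/index.php?title=Yang_Yaozu&diff=13286529&oldid=13139324"
purged = add_purge(url)
print(purged)
```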

tstarling added a comment. Via Conduit, Aug 1 2008, 2:00 AM

Note that the script only affects page blankings (which are due to cache pollution), not replacement with unrelated text, which is due to corruption of the core DB with incorrect text rows referencing blob_ids on the old cluster17 master, srv101.
