Warning: MediaWiki\Storage\SqlBlobStore::fetchBlob: Bad data in text row
Closed, Resolved | Public | PRODUCTION ERROR

Description

Seems something is corrupted on some wiki

5 MediaWiki\Storage\SqlBlobStore::fetchBlob: Bad data in text row 5191150. [Called from MediaWiki\Storage\SqlBlobStore::fetchBlob in /srv/mediawiki/php-1.32.0-wmf.18/includes/Storage/SqlBlobStore.php

Triggering URL: https://de.wikipedia.org/w/index.php?title=Ahmadiyya&limit=500&action=history

Logstash: https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-2018.08.29/mediawiki/?id=AWWFXFYy8VFZOIjHiQmC

Logstash search query: "Bad data in text row" AND type:"mediawiki"

Stacktrace:

#0 /srv/mediawiki/php-1.32.0-wmf.18/includes/debug/MWDebug.php(309): MWExceptionHandler::handleError(integer, string, string, integer, array, array)
#1 /srv/mediawiki/php-1.32.0-wmf.18/includes/debug/MWDebug.php(164): MWDebug::sendMessage(string, array, string, integer)
#2 /srv/mediawiki/php-1.32.0-wmf.18/includes/GlobalFunctions.php(1148): MWDebug::warning(string, integer, integer, string)
#3 /srv/mediawiki/php-1.32.0-wmf.18/includes/Storage/SqlBlobStore.php(356): wfLogWarning(string)
#4 /srv/mediawiki/php-1.32.0-wmf.18/includes/Storage/SqlBlobStore.php(281): MediaWiki\Storage\SqlBlobStore->fetchBlob(string, integer)
#5 /srv/mediawiki/php-1.32.0-wmf.18/includes/libs/objectcache/WANObjectCache.php(1246): Closure$MediaWiki\Storage\SqlBlobStore::getBlob(boolean, integer, array, NULL)
#6 /srv/mediawiki/php-1.32.0-wmf.18/includes/libs/objectcache/WANObjectCache.php(1119): WANObjectCache->doGetWithSetCallback(string, integer, Closure$MediaWiki\Storage\SqlBlobStore::getBlob;515, array)
#7 /srv/mediawiki/php-1.32.0-wmf.18/includes/Storage/SqlBlobStore.php(284): WANObjectCache->getWithSetCallback(string, integer, Closure$MediaWiki\Storage\SqlBlobStore::getBlob;515, array)
#8 /srv/mediawiki/php-1.32.0-wmf.18/includes/Storage/RevisionStore.php(1382): MediaWiki\Storage\SqlBlobStore->getBlob(string, integer)
#9 /srv/mediawiki/php-1.32.0-wmf.18/includes/Storage/RevisionStore.php(1319): MediaWiki\Storage\RevisionStore->loadSlotContent(MediaWiki\Storage\SlotRecord, NULL, NULL, NULL, integer)
#10 [internal function]: Closure$MediaWiki\Storage\RevisionStore::emulateMainSlot_1_29#2(MediaWiki\Storage\SlotRecord)
#11 /srv/mediawiki/php-1.32.0-wmf.18/includes/Storage/SlotRecord.php(306): call_user_func(Closure$MediaWiki\Storage\RevisionStore::emulateMainSlot_1_29#2;510, MediaWiki\Storage\SlotRecord)
#12 /srv/mediawiki/php-1.32.0-wmf.18/includes/Storage/SlotRecord.php(512): MediaWiki\Storage\SlotRecord->getContent()
#13 /srv/mediawiki/php-1.32.0-wmf.18/includes/Storage/RevisionSlots.php(149): MediaWiki\Storage\SlotRecord->getSize()
#14 [internal function]: Closure$MediaWiki\Storage\RevisionSlots::computeSize(integer, MediaWiki\Storage\SlotRecord)
#15 /srv/mediawiki/php-1.32.0-wmf.18/includes/Storage/RevisionSlots.php(150): array_reduce(array, Closure$MediaWiki\Storage\RevisionSlots::computeSize;603, integer)
#16 /srv/mediawiki/php-1.32.0-wmf.18/includes/Storage/RevisionStoreRecord.php(160): MediaWiki\Storage\RevisionSlots->computeSize()
#17 /srv/mediawiki/php-1.32.0-wmf.18/includes/Revision.php(707): MediaWiki\Storage\RevisionStoreRecord->getSize()
#18 /srv/mediawiki/php-1.32.0-wmf.18/includes/actions/HistoryAction.php(726): Revision->getSize()
#19 /srv/mediawiki/php-1.32.0-wmf.18/includes/actions/HistoryAction.php(486): HistoryPager->historyLine(stdClass, stdClass, boolean, boolean, boolean)
#20 /srv/mediawiki/php-1.32.0-wmf.18/includes/pager/IndexPager.php(439): HistoryPager->formatRow(stdClass)
#21 /srv/mediawiki/php-1.32.0-wmf.18/includes/actions/HistoryAction.php(236): IndexPager->getBody()
#22 /srv/mediawiki/php-1.32.0-wmf.18/includes/actions/FormlessAction.php(43): HistoryAction->onView()
#23 /srv/mediawiki/php-1.32.0-wmf.18/includes/MediaWiki.php(501): FormlessAction->show()
#24 /srv/mediawiki/php-1.32.0-wmf.18/includes/MediaWiki.php(294): MediaWiki->performAction(Article, Title)
#25 /srv/mediawiki/php-1.32.0-wmf.18/includes/MediaWiki.php(868): MediaWiki->performRequest()
#26 /srv/mediawiki/php-1.32.0-wmf.18/includes/MediaWiki.php(525): MediaWiki->main()
#27 /srv/mediawiki/php-1.32.0-wmf.18/index.php(42): MediaWiki->run()
#28 /srv/mediawiki/w/index.php(3): include(string)
#29 {main}

Event Timeline

Uneducatedly adding Multi-Content-Revisions as the stacktrace mentions RevisionStore and emulateMainSlot. Feel very free to correct.

Krinkle added a subscriber: Krinkle.

Might be related to T203255. Tagging with CPT for further investigation.

> Might be related to T203255.

Considering ApiComparePages isn't in the stack trace, that seems very unlikely.

> Tagging with CPT for further investigation.

The Multi-Content-Revisions tagging seems more likely relevant to me, although in the end it looks like it's really an instance of T22757: Corruption of text from early 2005 due to HistoryBlobStub pointers broken by recompressTracked.php.

This specific error seems to be due to a revision from 2005: https://de.wikipedia.org/w/index.php?title=Ahmadiyya&oldid=5191150. That revision has a HistoryBlobStub pointing to revision 5810513 with a hash of 823ec0d61c9dc4dad1029cc30c884892, but 5810513's ConcatenatedGzipHistoryBlob doesn't have an item with that hash.
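
To illustrate the failure mode described above, here is a minimal sketch. It assumes a concatenated blob behaves like a compressed map from the MD5 hash of each text to the text itself, and that a stub is just a (target text id, hash) pointer. The names `ConcatenatedBlob`, `BlobStub` and `resolve_stub` are hypothetical stand-ins, and the sketch uses JSON plus zlib instead of MediaWiki's PHP serialization, so it models the idea rather than the real storage format.

```python
import hashlib
import json
import zlib
from dataclasses import dataclass
from typing import Dict, Optional


class ConcatenatedBlob:
    """Toy model: many revision texts stored together in one compressed row."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}

    def add_item(self, text: str) -> str:
        # Each item is keyed by the MD5 hex digest of its content.
        key = hashlib.md5(text.encode("utf-8")).hexdigest()
        self._items[key] = text
        return key

    def get_item(self, key: str) -> Optional[str]:
        # Returns None when no item with that hash exists in this blob.
        return self._items.get(key)

    def to_bytes(self) -> bytes:
        # Assumption: JSON + zlib stands in for the real serialization.
        return zlib.compress(json.dumps(self._items).encode("utf-8"))

    @classmethod
    def from_bytes(cls, data: bytes) -> "ConcatenatedBlob":
        blob = cls()
        blob._items = json.loads(zlib.decompress(data).decode("utf-8"))
        return blob


@dataclass
class BlobStub:
    """Pointer from one text row into an item of another row's blob."""
    target_text_id: int
    item_hash: str


def resolve_stub(stub: BlobStub, text_rows: Dict[int, bytes]) -> Optional[str]:
    """Follow the stub; None means the pointer is broken ("bad data")."""
    raw = text_rows.get(stub.target_text_id)
    if raw is None:
        return None
    return ConcatenatedBlob.from_bytes(raw).get_item(stub.item_hash)


if __name__ == "__main__":
    blob = ConcatenatedBlob()
    good_hash = blob.add_item("some old revision text")
    rows = {5810513: blob.to_bytes()}
    print(resolve_stub(BlobStub(5810513, good_hash), rows))
    # A stub whose hash is not in the target blob cannot be resolved.
    print(resolve_stub(BlobStub(5810513, "823ec0d61c9dc4dad1029cc30c884892"), rows))
```

In this model the second lookup fails in the same way as the revision above: the target row exists and decompresses fine, but it has no item under the hash the stub asks for.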

If I ever get back to T181555: Remove use of PHP serialization in revision storage, the migration script for that will wind up fixing these broken references by flagging them as errors instead.

Noticed a sudden burst of the following in the logs. Same error as this task, although with a different stack.

channel: error
cli_argv: /srv/mediawiki/php-1.32.0-wmf.23/../multiversion/MWScript.php fetchText.php --wiki hrwiki
exception_id: afd331d39d8f662991b7b7c8
host: snapshot1009

PHP Warning: Revision::getRevisionText: Bad data in text row!

#1 /srv/mediawiki/php-1.32.0-wmf.23/includes/debug/MWDebug.php(309): trigger_error(string, integer)
#2 /srv/mediawiki/php-1.32.0-wmf.23/includes/debug/MWDebug.php(164): MWDebug::sendMessage(string, array, string, integer)
#3 /srv/mediawiki/php-1.32.0-wmf.23/includes/GlobalFunctions.php(1147): MWDebug::warning(string, integer, integer, string)
#4 /srv/mediawiki/php-1.32.0-wmf.23/includes/Revision.php(1073): wfLogWarning(string)
#5 /srv/mediawiki/php-1.32.0-wmf.23/maintenance/fetchText.php(86): Revision::getRevisionText(stdClass)
#6 /srv/mediawiki/php-1.32.0-wmf.23/maintenance/fetchText.php(63): FetchText->doGetText(Wikimedia\Rdbms\DatabaseMysqli, integer)
#7 /srv/mediawiki/php-1.32.0-wmf.23/maintenance/doMaintenance.php(94): FetchText->execute()
#8 /srv/mediawiki/php-1.32.0-wmf.23/maintenance/fetchText.php(96): require_once(string)
#9 /srv/mediawiki/multiversion/MWScript.php(100): require_once(string)

@Anomie Is this the same root cause as the one we see from "History" views in prod?

@ArielGlenn Do you know if there are any side effects or impact from this error that we need to be aware of? Or does the snapshot script account for this issue and recover without issue?

Apologies if this was already discussed previously, but I'm somewhat surprised to see a maintenance script run in Eqiad. In theory snapshot scripts are read-only which should be safe in Eqiad even when Codfw is primary, but MediaWiki isn't ready for active-active, which by default I'd assume means read operations are unsafe in Eqiad, because MediaWiki has side-effects on read operations (in particular around caching and job queue).

...

> @ArielGlenn Do you know if there are any side effects or impact from this error that we need to be aware of? Or does the snapshot script account for this issue and recover without issue?

The script will move on to the next revision; we expect and account for buglets like these. Thanks for checking though!

> Apologies if this was already discussed previously, but I'm somewhat surprised to see a maintenance script run in Eqiad. In theory snapshot scripts are read-only which should be safe in Eqiad even when Codfw is primary, but MediaWiki isn't ready for active-active, which by default I'd assume means read operations are unsafe in Eqiad, because MediaWiki has side-effects on read operations (in particular around caching and job queue).

It probably has been discussed elsewhere, but I'll summarize here: there has never been any plan to have dumps generation hosts in any other data center; they are in eqiad only. It's been this way for our past datacenter switchovers too. Note that, since MediaWiki scripts look at /etc/cluster to see which db config file to use, the fact that these are run from hosts in eqiad means that they use eqiad db servers; of course all of these requests are read-only.

> @Anomie Is this the same root cause as the one we see from "History" views in prod?

Probably. There's no rev_id, text_id, or the like in that error message so it's hard to tell for sure.

Is this still happening? If not, can this ticket be closed?

Yes, that is still occurring. The warnings can be found in Logstash with the search query: "Bad data in text row" AND type:"mediawiki"

Yep, I see a batch of them. Here's one from today for hrwiki:

PHP Warning: MediaWiki\Storage\SqlBlobStore::fetchBlob: Bad data in text row 1677927. [Called from MediaWiki\Storage\SqlBlobStore::fetchBlob in /srv/mediawiki/php-1.33.0-wmf.23/includes/Storage/SqlBlobStore.php at line 353]

It's being triggered by a stubs run, which should not fetch revision text, so that concerns me quite a bit. That's new behavior since the March 20th run, and a regression of some kind, but I'll open another task for that.

This URL triggers the exception: https://hr.wikipedia.org/wiki/Franjeva%C4%8Dki_samostan_Sv._Antuna_Padovanskog_u_Koprivnici

Logstash entry: https://logstash.wikimedia.org/goto/2fc2e8da6b9cab496c8eb409b7347b21

Here's the relevant info from the page table, the revision table for the current revision, and the text table for that text id.

wikiadmin@10.64.0.205(hrwiki)> select * from page where page_id = 192386;
+---------+----------------+-----------------------------------------------------------+-------------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+
| page_id | page_namespace | page_title                                                | page_restrictions | page_is_redirect | page_is_new | page_random    | page_touched   | page_links_updated | page_latest | page_len | page_content_model | page_lang |
+---------+----------------+-----------------------------------------------------------+-------------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+
|  192386 |              0 | Franjevački_samostan_Sv._Antuna_Padovanskog_u_Koprivnici  |                   |                1 |           1 | 0.222437823375 | 20160904152621 | NULL               |     1705637 |       81 | wikitext           | NULL      |
+---------+----------------+-----------------------------------------------------------+-------------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+
1 row in set (0.00 sec)

wikiadmin@10.64.0.205(hrwiki)> select * from revision where rev_id = 1705637;
+---------+----------+-------------+-------------+----------+---------------+----------------+----------------+-------------+---------+---------------+----------+-------------------+--------------------+
| rev_id  | rev_page | rev_text_id | rev_comment | rev_user | rev_user_text | rev_timestamp  | rev_minor_edit | rev_deleted | rev_len | rev_parent_id | rev_sha1 | rev_content_model | rev_content_format |
+---------+----------+-------------+-------------+----------+---------------+----------------+----------------+-------------+---------+---------------+----------+-------------------+--------------------+
| 1705637 |   192386 |     1677927 |             |    17775 | Fhms          | 20090309211443 |              0 |           0 |      81 |             0 |          | NULL              | NULL               |
+---------+----------+-------------+-------------+----------+---------------+----------------+----------------+-------------+---------+---------------+----------+-------------------+--------------------+
1 row in set (0.00 sec)

wikiadmin@10.64.0.205(hrwiki)> select * from text where old_id = 1677927;
+---------+---------------+-----------+------------------+-------------+----------+---------------+---------------+----------------+---------------------+-------------------+
| old_id  | old_namespace | old_title | old_text         | old_comment | old_user | old_user_text | old_timestamp | old_minor_edit | old_flags           | inverse_timestamp |
+---------+---------------+-----------+------------------+-------------+----------+---------------+---------------+----------------+---------------------+-------------------+
| 1677927 |               |           | DB://cluster20/0 |             |          |               |               |                | utf-8,gzip,external |                   |
+---------+---------------+-----------+------------------+-------------+----------+---------------+---------------+----------------+---------------------+-------------------+
1 row in set (0.00 sec)

We should do something about entries in the text table like this one that obviously points to crap. But I don't know what that something is.

Another sample: https://logstash.wikimedia.org/goto/57a4c9ec510eb287f8f7d48a3db08f66

These turned up in the abstract dumps for jvwiki. There's a workaround patch for that in wmf.25, which will mask the problem, but the bad data will still be there.

wikiadmin@10.64.16.191(jvwiki)> select * from text where old_id = 209199;
+--------+---------------+-----------+------------------+-------------+----------+---------------+---------------+----------------+---------------------+-------------------+
| old_id | old_namespace | old_title | old_text         | old_comment | old_user | old_user_text | old_timestamp | old_minor_edit | old_flags           | inverse_timestamp |
+--------+---------------+-----------+------------------+-------------+----------+---------------+---------------+----------------+---------------------+-------------------+
| 209199 |               |           | DB://cluster20/0 |             |          |               |               |                | utf-8,gzip,external |                   |
+--------+---------------+-----------+------------------+-------------+----------+---------------+---------------+----------------+---------------------+-------------------+

Seeing 3659 of these (or pretty similar) since 03:30 UTC:

Triggering URL: https://zh.wikipedia.org/wiki/Delon_Thamrin

PHP Warning: MediaWiki\Storage\SqlBlobStore::fetchBlob: Bad data in text row 9375723. [Called from MediaWiki\Storage\SqlBlobStore::fetchBlob in /srv/mediawiki/php-1.34.0-wmf.17/includes/Storage/SqlBlobStore.php at line 361]
#0 /srv/mediawiki/php-1.34.0-wmf.17/includes/debug/MWDebug.php(309): MWExceptionHandler::handleError(integer, string, string, integer, array, array)
#1 /srv/mediawiki/php-1.34.0-wmf.17/includes/debug/MWDebug.php(164): MWDebug::sendMessage(string, array, string, integer)
#2 /srv/mediawiki/php-1.34.0-wmf.17/includes/GlobalFunctions.php(1078): MWDebug::warning(string, integer, integer, string)
#3 /srv/mediawiki/php-1.34.0-wmf.17/includes/Storage/SqlBlobStore.php(361): wfLogWarning(string)
#4 /srv/mediawiki/php-1.34.0-wmf.17/includes/Storage/SqlBlobStore.php(286): MediaWiki\Storage\SqlBlobStore->fetchBlob(string, integer)
#5 /srv/mediawiki/php-1.34.0-wmf.17/includes/libs/objectcache/wancache/WANObjectCache.php(1417): Closure$MediaWiki\Storage\SqlBlobStore::getBlob(boolean, integer, array, NULL)
#6 /srv/mediawiki/php-1.34.0-wmf.17/includes/libs/objectcache/wancache/WANObjectCache.php(1271): WANObjectCache->fetchOrRegenerate(string, integer, Closure$MediaWiki\Storage\SqlBlobStore::getBlob;2982, array)
#7 /srv/mediawiki/php-1.34.0-wmf.17/includes/Storage/SqlBlobStore.php(289): WANObjectCache->getWithSetCallback(string, integer, Closure$MediaWiki\Storage\SqlBlobStore::getBlob;2982, array)
#8 /srv/mediawiki/php-1.34.0-wmf.17/includes/Revision/RevisionStore.php(1444): MediaWiki\Storage\SqlBlobStore->getBlob(string, integer)
#9 /srv/mediawiki/php-1.34.0-wmf.17/includes/Revision/RevisionStore.php(1653): MediaWiki\Revision\RevisionStore->loadSlotContent(MediaWiki\Revision\SlotRecord, NULL, NULL, NULL, integer)
#10 [internal function]: Closure$MediaWiki\Revision\RevisionStore::constructSlotRecords(MediaWiki\Revision\SlotRecord)
#11 /srv/mediawiki/php-1.34.0-wmf.17/includes/Revision/SlotRecord.php(307): call_user_func(Closure$MediaWiki\Revision\RevisionStore::constructSlotRecords;2978, MediaWiki\Revision\SlotRecord)
#12 /srv/mediawiki/php-1.34.0-wmf.17/includes/Revision/RevisionRecord.php(175): MediaWiki\Revision\SlotRecord->getContent()
#13 /srv/mediawiki/php-1.34.0-wmf.17/includes/Revision/RenderedRevision.php(226): MediaWiki\Revision\RevisionRecord->getContent(string, integer, NULL)
#14 /srv/mediawiki/php-1.34.0-wmf.17/includes/Revision/RevisionRenderer.php(222): MediaWiki\Revision\RenderedRevision->getSlotParserOutput(string)
#15 /srv/mediawiki/php-1.34.0-wmf.17/includes/Revision/RevisionRenderer.php(151): MediaWiki\Revision\RevisionRenderer->combineSlotOutput(MediaWiki\Revision\RenderedRevision, array)
#16 [internal function]: Closure$MediaWiki\Revision\RevisionRenderer::getRenderedRevision#3(MediaWiki\Revision\RenderedRevision, array)
#17 /srv/mediawiki/php-1.34.0-wmf.17/includes/Revision/RenderedRevision.php(197): call_user_func(Closure$MediaWiki\Revision\RevisionRenderer::getRenderedRevision#3;3258, MediaWiki\Revision\RenderedRevision, array)
#18 /srv/mediawiki/php-1.34.0-wmf.17/includes/poolcounter/PoolWorkArticleView.php(196): MediaWiki\Revision\RenderedRevision->getRevisionParserOutput()
#19 /srv/mediawiki/php-1.34.0-wmf.17/includes/poolcounter/PoolCounterWork.php(125): PoolWorkArticleView->doWork()
#20 /srv/mediawiki/php-1.34.0-wmf.17/includes/page/Article.php(776): PoolCounterWork->execute()
#21 /srv/mediawiki/php-1.34.0-wmf.17/includes/actions/ViewAction.php(63): Article->view()
#22 /srv/mediawiki/php-1.34.0-wmf.17/includes/MediaWiki.php(507): ViewAction->show()
#23 /srv/mediawiki/php-1.34.0-wmf.17/includes/MediaWiki.php(302): MediaWiki->performAction(Article, Title)
#24 /srv/mediawiki/php-1.34.0-wmf.17/includes/MediaWiki.php(892): MediaWiki->performRequest()
#25 /srv/mediawiki/php-1.34.0-wmf.17/includes/MediaWiki.php(523): MediaWiki->main()
#26 /srv/mediawiki/php-1.34.0-wmf.17/index.php(42): MediaWiki->run()
#27 /srv/mediawiki/w/index.php(3): include(string)
#28 {main}

The entry in the text row points to a non-existent blob in a cluster.

wikiadmin@10.64.48.34(zhwiki)> select * from text where old_id = 9375723;
+---------+---------------+-----------+------------------+-------------+----------+---------------+---------------+----------------+---------------------+-------------------+
| old_id  | old_namespace | old_title | old_text         | old_comment | old_user | old_user_text | old_timestamp | old_minor_edit | old_flags           | inverse_timestamp |
+---------+---------------+-----------+------------------+-------------+----------+---------------+---------------+----------------+---------------------+-------------------+
| 9375723 |             0 |           | DB://cluster20/0 |             |        0 |               |               |              0 | utf-8,gzip,external |                   |
+---------+---------------+-----------+------------------+-------------+----------+---------------+---------------+----------------+---------------------+-------------------+
1 row in set (0.00 sec)

The id of 0 after the cluster20 address is the issue, just like other entries on this ticket.
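
A check for this pattern could look roughly like the sketch below. It assumes external addresses have the form `DB://<cluster>/<id>` with an optional third path component, and that blob ids in a cluster start at 1, so an id of 0 can never resolve. The function names are hypothetical and this is not MediaWiki code.

```python
import re
from typing import NamedTuple, Optional


class ExternalAddress(NamedTuple):
    cluster: str
    blob_id: int
    item: Optional[str]


# Assumed address shape: DB://<cluster>/<numeric id>[/<item hash>]
_ADDRESS_RE = re.compile(r"^DB://([^/]+)/(\d+)(?:/([0-9a-f]+))?$")


def parse_external_address(old_text: str) -> Optional[ExternalAddress]:
    """Split an old_text value into cluster, id and optional item; None if malformed."""
    match = _ADDRESS_RE.match(old_text)
    if match is None:
        return None
    return ExternalAddress(match.group(1), int(match.group(2)), match.group(3))


def is_suspect_row(old_text: str, old_flags: str) -> bool:
    """True for rows flagged 'external' whose address is malformed or has id 0."""
    flags = {flag.strip() for flag in old_flags.split(",") if flag.strip()}
    if "external" not in flags:
        return False  # text is stored inline, nothing to check here
    address = parse_external_address(old_text)
    return address is None or address.blob_id == 0


if __name__ == "__main__":
    # The rows quoted in this task all look like this one.
    print(is_suspect_row("DB://cluster20/0", "utf-8,gzip,external"))
    print(is_suspect_row("DB://cluster20/12345", "utf-8,gzip,external"))
```

A row that passes this check can of course still be broken; the check only catches pointers that cannot be valid under the stated assumptions.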

This happened again today: https://logstash.wikimedia.org/app/kibana#/dashboard/Fatal-Monitor?_g=h@8446ffb&_a=h@40bd263

Example:

/srv/mediawiki/multiversion/MWScript.php dumpBackup.php --wiki=nlwiktionary --full --stub --report=1000 --output=file:/mnt/dumpsdata/xmldatadumps/temp/n/nlwiktionary/nlwiktionary-20190820-stub-meta-history.xml.gz.inprog_tmp --output=file:/mnt/dumpsdata/xmldatadumps/temp/n/nlwiktionary/nlwiktionary-20190820-stub-meta-current.xml.gz.inprog_tmp --filter=latest --output=file:/mnt/dumpsdata/xmldatadumps/temp/n/nlwiktionary/nlwiktionary-20190820-stub-articles.xml.gz.inprog_tmp --filter=latest --filter=notalk --filter=namespace:!NS_USER --skip-header --start=56300 --skip-footer --end 61300

PHP Warning: MediaWiki\Storage\SqlBlobStore::fetchBlob: Bad data in text row 4041. [Called from MediaWiki\Storage\SqlBlobStore::fetchBlob in /srv/mediawiki/php-1.34.0-wmf.17/includes/Storage/SqlBlobStore.php at line 361]

 MWExceptionHandler::handleError(integer, string, string, integer, array)
#1 /srv/mediawiki/php-1.34.0-wmf.17/includes/debug/MWDebug.php(309): trigger_error(string, integer)
#2 /srv/mediawiki/php-1.34.0-wmf.17/includes/debug/MWDebug.php(164): MWDebug::sendMessage(string, array, string, integer)
#3 /srv/mediawiki/php-1.34.0-wmf.17/includes/GlobalFunctions.php(1078): MWDebug::warning(string, integer, integer, string)
#4 /srv/mediawiki/php-1.34.0-wmf.17/includes/Storage/SqlBlobStore.php(361): wfLogWarning(string)
#5 /srv/mediawiki/php-1.34.0-wmf.17/includes/Storage/SqlBlobStore.php(286): MediaWiki\Storage\SqlBlobStore->fetchBlob(string, integer)
#6 /srv/mediawiki/php-1.34.0-wmf.17/includes/libs/objectcache/wancache/WANObjectCache.php(1416): MediaWiki\Storage\SqlBlobStore->MediaWiki\Storage\{closure}(boolean, integer, array, NULL)
#7 /srv/mediawiki/php-1.34.0-wmf.17/includes/libs/objectcache/wancache/WANObjectCache.php(1271): WANObjectCache->fetchOrRegenerate(string, integer, Closure, array)
#8 /srv/mediawiki/php-1.34.0-wmf.17/includes/Storage/SqlBlobStore.php(288): WANObjectCache->getWithSetCallback(string, integer, Closure, array)
#9 /srv/mediawiki/php-1.34.0-wmf.17/includes/Revision/RevisionStore.php(1444): MediaWiki\Storage\SqlBlobStore->getBlob(string, integer)
#10 /srv/mediawiki/php-1.34.0-wmf.17/includes/Revision/RevisionStore.php(1653): MediaWiki\Revision\RevisionStore->loadSlotContent(MediaWiki\Revision\SlotRecord, NULL, NULL, NULL, integer)
#11 [internal function]: MediaWiki\Revision\RevisionStore->MediaWiki\Revision\{closure}(MediaWiki\Revision\SlotRecord)
#12 /srv/mediawiki/php-1.34.0-wmf.17/includes/Revision/SlotRecord.php(307): call_user_func(Closure, MediaWiki\Revision\SlotRecord)
#13 /srv/mediawiki/php-1.34.0-wmf.17/includes/Revision/SlotRecord.php(551): MediaWiki\Revision\SlotRecord->getContent()
#14 /srv/mediawiki/php-1.34.0-wmf.17/includes/Revision/RevisionSlots.php(200): MediaWiki\Revision\SlotRecord->getSha1()
#15 [internal function]: MediaWiki\Revision\RevisionSlots->MediaWiki\Revision\{closure}(NULL, MediaWiki\Revision\SlotRecord)
#16 /srv/mediawiki/php-1.34.0-wmf.17/includes/Revision/RevisionSlots.php(202): array_reduce(array, Closure, NULL)
#17 /srv/mediawiki/php-1.34.0-wmf.17/includes/Revision/RevisionStoreRecord.php(174): MediaWiki\Revision\RevisionSlots->computeSha1()
#18 /srv/mediawiki/php-1.34.0-wmf.17/includes/export/XmlDumpWriter.php(309): MediaWiki\Revision\RevisionStoreRecord->getSha1()
#19 /srv/mediawiki/php-1.34.0-wmf.17/includes/export/XmlDumpWriter.php(390): XmlDumpWriter->invokeLenient(MediaWiki\Revision\RevisionStoreRecord, string, array, string)
#20 /srv/mediawiki/php-1.34.0-wmf.17/includes/export/WikiExporter.php(531): XmlDumpWriter->writeRevision(stdClass, array)
#21 /srv/mediawiki/php-1.34.0-wmf.17/includes/export/WikiExporter.php(474): WikiExporter->outputPageStreamBatch(Wikimedia\Rdbms\ResultWrapper, stdClass)
#22 /srv/mediawiki/php-1.34.0-wmf.17/includes/export/WikiExporter.php(288): WikiExporter->dumpPages(string, boolean)
#23 /srv/mediawiki/php-1.34.0-wmf.17/includes/export/WikiExporter.php(173): WikiExporter->dumpFrom(string, boolean)
#24 /srv/mediawiki/php-1.34.0-wmf.17/maintenance/includes/BackupDumper.php(289): WikiExporter->pagesByRange(integer, integer, boolean)
#25 /srv/mediawiki/php-1.34.0-wmf.17/maintenance/dumpBackup.php(82): BackupDumper->dump(integer, integer)
#26 /srv/mediawiki/php-1.34.0-wmf.17/maintenance/doMaintenance.php(99): DumpBackup->execute()
#27 /srv/mediawiki/php-1.34.0-wmf.17/maintenance/dumpBackup.php(144): require_once(string)
#28 /srv/mediawiki/multiversion/MWScript.php(101): require_once(string)

And it doesn't reference any es:

root@db1078.eqiad.wmnet[nlwiktionary]> select old_id,old_text from text where old_id=4041;
+--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| old_id | old_text                                                                                                                                                                              |
+--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|   4041 | %‹±^M^B1^P^Dó¯â$^B'˜^F¨à^C"^P^AÖ^E+ÞB'¬3²�>°¾bšà^LÙÎîNï¾ÀoÛ4õî5¿uä�sî‚$P^Kä)„¹^M„Væ£a^C^R3=b}•üŒJ¢ô»|ö¶ž¡tB^QÍ̃¯hr^_rÃbÖÿ^^Âm•ÚbIPk^O_                                                                  |
+--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

I suppose it's just a broken gzip archive. I didn't manage to verify since I can't find an easy way to get the raw binary value out of the database.

If we have broken data in the database, there really isn't much MediaWiki can do except log warnings. If we don't have a way to fix the data, we could instead blank it out.

The nlwiktionary row with old_id=4041 does indeed seem to be an invalid gzip blob.
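
For anyone wanting to repeat that check, one possible approach is to export the column as hex (for example with `select hex(old_text) ...`, since `HEX()` is a standard MySQL/MariaDB function) and test it offline. The sketch below assumes that the "gzip" flag means a raw DEFLATE stream without a gzip or zlib header; if that assumption is wrong for a given row, the `wbits` value would need adjusting. The helper names are hypothetical.

```python
import binascii
import zlib
from typing import Tuple


def inflate_raw(data: bytes) -> bytes:
    """Inflate a raw DEFLATE stream and insist that it is complete."""
    # wbits=-15 selects raw DEFLATE (no zlib or gzip header/trailer).
    decompressor = zlib.decompressobj(-15)
    output = decompressor.decompress(data)
    output += decompressor.flush()
    if not decompressor.eof:
        # Truncated streams may not raise on their own, so check explicitly.
        raise zlib.error("stream ended before the final DEFLATE block")
    return output


def check_hex_blob(hex_value: str) -> Tuple[bool, str]:
    """Return (is_valid, detail) for a hex-encoded old_text value."""
    try:
        raw = binascii.unhexlify(hex_value.strip())
    except (binascii.Error, ValueError) as exc:
        return False, f"not valid hex: {exc}"
    try:
        text = inflate_raw(raw)
    except zlib.error as exc:
        return False, f"inflate failed: {exc}"
    return True, f"{len(text)} bytes after inflate"


if __name__ == "__main__":
    # Build a known-good raw DEFLATE sample, then damage it by truncation.
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    good = compressor.compress(b"example revision text") + compressor.flush()
    print(check_hex_blob(good.hex()))
    print(check_hex_blob(good[:-3].hex()))
```

Going through hex avoids the terminal mangling visible in the query output above, where control bytes and replacement characters make the value impossible to copy reliably.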

mmodell changed the subtype of this task from "Task" to "Production Error". Aug 28 2019, 11:08 PM
Krinkle added a subscriber: Umherirrender.

From another task:

This seems to be due to the issue described in this task, and makes this title inaccessible on our site:

RevisionAccessException from line 1442 of /srv/mediawiki/php-1.35.0-wmf.4/includes/Revision/RevisionStore.php: Failed to load data blob from tt:1677927: Bad data in text row 1677927.

I checked a random one from oswiki which was associated with a revision with timestamp 20090309213655. The revision for that in the previous full dump has missing content and sha1, so this is not new. I guess the same will turn out to be true for the other revisions.
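
That kind of check against an earlier dump can be sketched as follows. The sketch assumes a simplified export layout in which each `<revision>` element has direct `<id>`, `<timestamp>`, `<text>` and `<sha1>` children, and it treats a revision as "missing" when both text and sha1 are empty. The function name and the sample XML are illustrative only, not taken from a real dump.

```python
import io
import xml.etree.ElementTree as ET
from typing import BinaryIO, List, Tuple


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix such as '{http://...}revision'."""
    return tag.rsplit("}", 1)[-1]


def revisions_missing_content(xml_stream: BinaryIO) -> List[Tuple[int, str]]:
    """Return (rev_id, timestamp) for revisions with empty text and empty sha1."""
    missing = []
    # iterparse streams the file, so large dumps need not fit in memory.
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if local_name(elem.tag) != "revision":
            continue
        # Only direct children are inspected, so a nested contributor id
        # does not overwrite the revision id.
        fields = {local_name(child.tag): (child.text or "").strip() for child in elem}
        if not fields.get("text") and not fields.get("sha1"):
            missing.append((int(fields["id"]), fields.get("timestamp", "")))
        elem.clear()  # release the children of the processed revision
    return missing


if __name__ == "__main__":
    sample = b"""<mediawiki>
    <page><title>Example</title><id>1</id>
      <revision><id>10</id><timestamp>2009-03-09T21:36:55Z</timestamp>
        <text bytes="0" /><sha1 /></revision>
      <revision><id>11</id><timestamp>2010-01-01T00:00:00Z</timestamp>
        <text bytes="5">hello</text><sha1>abc</sha1></revision>
    </page></mediawiki>"""
    print(revisions_missing_content(io.BytesIO(sample)))
```

Note that stub dumps carry no revision text by design, so a check like this only makes sense against full-content dumps.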

Seeing this again for a different revision/page:

id: AXJ62XSusWch-KE7JvV6

message: Failed to load data blob from tt:9375723: Bad data in text row 9375723.

stacktrace
#0 /srv/mediawiki/php-1.35.0-wmf.34/includes/Revision/RevisionStore.php(1014): MediaWiki\Storage\SqlBlobStore->getBlob(string, integer)
#1 /srv/mediawiki/php-1.35.0-wmf.34/includes/Revision/RevisionStore.php(1247): MediaWiki\Revision\RevisionStore->loadSlotContent(MediaWiki\Revision\SlotRecord, NULL, NULL, NULL, integer)
#2 [internal function]: MediaWiki\Revision\RevisionStore->MediaWiki\Revision\{closure}(MediaWiki\Revision\SlotRecord)
#3 /srv/mediawiki/php-1.35.0-wmf.34/includes/Revision/SlotRecord.php(307): call_user_func(Closure, MediaWiki\Revision\SlotRecord)
#4 /srv/mediawiki/php-1.35.0-wmf.34/includes/Revision/RevisionRecord.php(175): MediaWiki\Revision\SlotRecord->getContent()
#5 /srv/mediawiki/php-1.35.0-wmf.34/includes/page/WikiPage.php(798): MediaWiki\Revision\RevisionRecord->getContent(string, integer, NULL)
#6 /srv/mediawiki/php-1.35.0-wmf.34/includes/page/WikiPage.php(1056): WikiPage->getContent()
#7 /srv/mediawiki/php-1.35.0-wmf.34/includes/page/WikiPage.php(1043): WikiPage->insertRedirect()
#8 /srv/mediawiki/php-1.35.0-wmf.34/includes/page/WikiPage.php(1122): WikiPage->getRedirectTarget()
#9 /srv/mediawiki/php-1.35.0-wmf.34/includes/MediaWiki.php(450): WikiPage->followRedirect()
#10 /srv/mediawiki/php-1.35.0-wmf.34/includes/MediaWiki.php(303): MediaWiki->initializeArticle()
#11 /srv/mediawiki/php-1.35.0-wmf.34/includes/MediaWiki.php(978): MediaWiki->performRequest()
#12 /srv/mediawiki/php-1.35.0-wmf.34/includes/MediaWiki.php(535): MediaWiki->main()
#13 /srv/mediawiki/php-1.35.0-wmf.34/index.php(47): MediaWiki->run()
#14 /srv/mediawiki/w/index.php(3): require(string)
#15 {main}
thcipriani renamed this task from Warning: MediaWiki\Storage\SqlBlobStore::fetchBlob: Bad data in text row 5191150 to Warning: MediaWiki\Storage\SqlBlobStore::fetchBlob: Bad data in text row. Jun 3 2020, 6:23 PM
thcipriani raised the priority of this task from Medium to High. Oct 14 2020, 6:41 PM

Happening on hrwiki as of today (2020-10-14) with wmf.11: Failed to load data blob from tt:1677927: Bad data in text row 1677927. Use findBadBlobs.php to remedy.. If this problem persist, use the findBadBlobs maintenance script to investigate the issue and mark bad blobs.

Are these reports helpful? Should this be an exception if this means we need to run findBadBlobs? Should we just run that job on a schedule?

I'd like eventually to run it across all wikis and get a sense of the bad timeframes. Just my 2 cents.

> Are these reports helpful? Should this be an exception if this means we need to run findBadBlobs? Should we just run that job on a schedule?

We should not run this script automatically in "mark" mode. That would mean we would ignore data corruption, and potentially make it worse. The cause should always be investigated. The script is potentially destructive, it basically tells MediaWiki to give up on the content of the revision. We should only do that if there is no chance of restoring the content that was lost.

As an example, if one of the External Store hosts has a temporary problem, we would be seeing this error. If we ran the script to mark the revisions, we'd permanently mark them as broken, instead of just letting things recover once the ES box has been taken out of rotation or got repaired.
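
The distinction can be made concrete with a small sketch. It assumes a hypothetical `fetch` callable that raises `ConnectionError` when the storage backend is unreachable and `KeyError` when the blob is genuinely absent; neither the callable nor the names below correspond to findBadBlobs.php or any MediaWiki API. The point is only that a checker should separate the two cases and produce a list for human review instead of marking anything itself.

```python
from enum import Enum
from typing import Callable, Dict, Iterable, List


class Outcome(Enum):
    OK = "ok"
    BACKEND_UNAVAILABLE = "backend unavailable"  # may recover on its own
    BLOB_MISSING = "blob missing"                # needs investigation


def probe(fetch: Callable[[int], bytes], text_id: int) -> Outcome:
    """Classify a single fetch attempt without changing any data."""
    try:
        fetch(text_id)
    except ConnectionError:
        return Outcome.BACKEND_UNAVAILABLE
    except KeyError:
        return Outcome.BLOB_MISSING
    return Outcome.OK


def candidates_for_review(
    fetch: Callable[[int], bytes], text_ids: Iterable[int]
) -> List[int]:
    """List ids worth investigating; never marks anything as bad."""
    results: Dict[int, Outcome] = {tid: probe(fetch, tid) for tid in text_ids}
    # If the backend was unreachable at any point, nothing from this run
    # is treated as evidence of corruption.
    if any(o is Outcome.BACKEND_UNAVAILABLE for o in results.values()):
        return []
    return [tid for tid, o in results.items() if o is Outcome.BLOB_MISSING]


if __name__ == "__main__":
    store = {1: b"text of revision one"}

    def healthy_backend(text_id: int) -> bytes:
        return store[text_id]

    def broken_host(text_id: int) -> bytes:
        raise ConnectionError("storage host is down")

    print(candidates_for_review(healthy_backend, [1, 2]))
    print(candidates_for_review(broken_host, [1, 2]))
```

With the host down, both ids fail to load, yet the run reports no candidates, which is the behaviour an automatic "mark" mode would not give us.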

We could run the script over all wikis to find corrupt revisions, but the script is slow. Right now, it can only be run against a known set of revisions or a limited time range.

In any case, MediaWiki should always report an error when reading a revision fails due to data corruption. Ignoring such incidents would be dangerous. We see this error again and again because it is a symptom that can have a variety of causes.

Given that this error may be a symptom of a large variety of problems, from temporary network glitches to deep-rooted logic errors, it probably doesn't make sense to treat all occurrences as instances of the same issue. Every time we see this, we need to check *why* the data can't be loaded. Only after we know that can we decide how the problem can be addressed. This means every instance of this should be tracked separately. Perhaps there should be a note to this effect in the description of this ticket. And perhaps this should become a tracking ticket, with individual occurrences as subtasks.

See T265989 where I have collected a bunch of bad revisions with timestamps across all the wikis. This may let us make some headway.

Krinkle claimed this task.

Marking as resolved because I'm unable to reproduce new entries for this based on the history of the dewiki "Ahmadiyya" article. I've browsed quite far back and none of the revisions yielded warnings. Per Daniel and Ariel above, this is too broad a category of issues to keep a single task about, and we in fact already have dozens of tasks for this, so I'll just close this.

Note that I'm not assuming this issue to have been intermittent; rather, we recently did a sweep to repair and/or mark various known empty/absent text blobs as such, so they no longer cause run-time warnings.