abstracts dumps for dewikiversity fail with MWUnknownContentModelException from ContentHandler.php
Open, High, Public

Description

This is a regression from the March 20th run.

Stack trace:

[ae9f68bbba5be0cbd1d7969d] [no req]   MWUnknownContentModelException from line 265 of /srv/mediawiki/php-1.33.0-wmf.24/includes/content/ContentHandler.php: The content model 'flow-board' is not registered on this wiki.
See https://www.mediawiki.org/wiki/Content_handlers to find out which extensions handle this content model.
Backtrace:
#0 /srv/mediawiki/php-1.33.0-wmf.24/includes/Revision/RevisionStore.php(1470): ContentHandler::getForModelID(string)
#1 /srv/mediawiki/php-1.33.0-wmf.24/includes/Revision/RevisionStore.php(1634): MediaWiki\Revision\RevisionStore->loadSlotContent(MediaWiki\Revision\SlotRecord, NULL, NULL, NULL, integer)
#2 [internal function]: MediaWiki\Revision\RevisionStore->MediaWiki\Revision\{closure}(MediaWiki\Revision\SlotRecord)
#3 /srv/mediawiki/php-1.33.0-wmf.24/includes/Revision/SlotRecord.php(307): call_user_func(Closure, MediaWiki\Revision\SlotRecord)
#4 /srv/mediawiki/php-1.33.0-wmf.24/includes/export/XmlDumpWriter.php(308): MediaWiki\Revision\SlotRecord->getContent()
#5 /srv/mediawiki/php-1.33.0-wmf.24/includes/export/WikiExporter.php(485): XmlDumpWriter->writeRevision(stdClass)
#6 /srv/mediawiki/php-1.33.0-wmf.24/includes/export/WikiExporter.php(445): WikiExporter->outputPageStreamBatch(Wikimedia\Rdbms\ResultWrapper, stdClass)
#7 /srv/mediawiki/php-1.33.0-wmf.24/includes/export/WikiExporter.php(269): WikiExporter->dumpPages(string, boolean)
#8 /srv/mediawiki/php-1.33.0-wmf.24/includes/export/WikiExporter.php(154): WikiExporter->dumpFrom(string, boolean)
#9 /srv/mediawiki/php-1.33.0-wmf.24/maintenance/includes/BackupDumper.php(288): WikiExporter->pagesByRange(integer, integer, boolean)
#10 /srv/mediawiki/php-1.33.0-wmf.24/maintenance/dumpBackup.php(83): BackupDumper->dump(integer, integer)
#11 /srv/mediawiki/php-1.33.0-wmf.24/maintenance/doMaintenance.php(96): DumpBackup->execute()
#12 /srv/mediawiki/php-1.33.0-wmf.24/maintenance/dumpBackup.php(138): require_once(string)
#13 /srv/mediawiki/multiversion/MWScript.php(100): require_once(string)
#14 {main}

More info (how to reproduce, page id, etc.) shortly.

Event Timeline

ArielGlenn triaged this task as High priority. Apr 10 2019, 11:10 AM
ArielGlenn created this task.

Reproduce by:

dumpsgen@snapshot1007:~$ /usr/bin/php7.2 /srv/mediawiki/multiversion/MWScript.php dumpBackup.php --wiki=dewikiversity /srv/mediawiki/php-1.33.0-wmf.24 --plugin=AbstractFilter:/srv/mediawiki/php-1.33.0-wmf.24/extensions/ActiveAbstract/AbstractFilter.php --current --report=1 --output=file:/mnt/dumpsdata/temp/dumpsgen/bad-abstracts-dewv-sigh.xml --filter=namespace:NS_MAIN --filter=noredirect --filter=abstract --skip-header --start=47279 --skip-footer --end 47280
[3b5b23e8293282f8aecb46dc] [no req]   MWUnknownContentModelException from line 265 of /srv/mediawiki/php-1.33.0-wmf.24/includes/content/ContentHandler.php: The content model 'flow-board' is not registered on this wiki.
See https://www.mediawiki.org/wiki/Content_handlers to find out which extensions handle this content model.
Backtrace:
#0 /srv/mediawiki/php-1.33.0-wmf.24/includes/Revision/RevisionStore.php(1470): ContentHandler::getForModelID(string)
#1 /srv/mediawiki/php-1.33.0-wmf.24/includes/Revision/RevisionStore.php(1634): MediaWiki\Revision\RevisionStore->loadSlotContent(MediaWiki\Revision\SlotRecord, NULL, NULL, NULL, integer)
#2 [internal function]: MediaWiki\Revision\RevisionStore->MediaWiki\Revision\{closure}(MediaWiki\Revision\SlotRecord)
#3 /srv/mediawiki/php-1.33.0-wmf.24/includes/Revision/SlotRecord.php(307): call_user_func(Closure, MediaWiki\Revision\SlotRecord)
#4 /srv/mediawiki/php-1.33.0-wmf.24/includes/export/XmlDumpWriter.php(308): MediaWiki\Revision\SlotRecord->getContent()
#5 /srv/mediawiki/php-1.33.0-wmf.24/includes/export/WikiExporter.php(485): XmlDumpWriter->writeRevision(stdClass)
#6 /srv/mediawiki/php-1.33.0-wmf.24/includes/export/WikiExporter.php(445): WikiExporter->outputPageStreamBatch(Wikimedia\Rdbms\ResultWrapper, NULL)
#7 /srv/mediawiki/php-1.33.0-wmf.24/includes/export/WikiExporter.php(269): WikiExporter->dumpPages(string, boolean)
#8 /srv/mediawiki/php-1.33.0-wmf.24/includes/export/WikiExporter.php(154): WikiExporter->dumpFrom(string, boolean)
#9 /srv/mediawiki/php-1.33.0-wmf.24/maintenance/includes/BackupDumper.php(288): WikiExporter->pagesByRange(integer, integer, boolean)
#10 /srv/mediawiki/php-1.33.0-wmf.24/maintenance/dumpBackup.php(83): BackupDumper->dump(integer, integer)
#11 /srv/mediawiki/php-1.33.0-wmf.24/maintenance/doMaintenance.php(96): DumpBackup->execute()
#12 /srv/mediawiki/php-1.33.0-wmf.24/maintenance/dumpBackup.php(138): require_once(string)
#13 /srv/mediawiki/multiversion/MWScript.php(100): require_once(string)
#14 {main}

Info for the page and the current revision (which is what is used for abstracts):

wikiadmin@10.64.16.191(dewikiversity)> select * from page where page_id = 47279;
+---------+----------------+--------------------------+-------------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+
| page_id | page_namespace | page_title               | page_restrictions | page_is_redirect | page_is_new | page_random    | page_touched   | page_links_updated | page_latest | page_len | page_content_model | page_lang |
+---------+----------------+--------------------------+-------------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+
|   47279 |           2600 | Was_wir_hören_und_sehen  |                   |                0 |           1 | 0.649918805112 | 20110807104642 | NULL               |      274772 |      352 | wikitext           | NULL      |
+---------+----------------+--------------------------+-------------------+------------------+-------------+----------------+----------------+--------------------+-------------+----------+--------------------+-----------+
1 row in set (0.01 sec)

wikiadmin@10.64.16.191(dewikiversity)> select * from revision where rev_id = 274772;
+--------+----------+-------------+-------------+----------+---------------+----------------+----------------+-------------+---------+---------------+---------------------------------+-------------------+--------------------+
| rev_id | rev_page | rev_text_id | rev_comment | rev_user | rev_user_text | rev_timestamp  | rev_minor_edit | rev_deleted | rev_len | rev_parent_id | rev_sha1                        | rev_content_model | rev_content_format |
+--------+----------+-------------+-------------+----------+---------------+----------------+----------------+-------------+---------+---------------+---------------------------------+-------------------+--------------------+
| 274772 |    47279 |      269201 |             |    11256 | MartinKurz    | 20110807104642 |              0 |           0 |     352 |             0 | o9mhxk86c2bxcvymcwsiirc2vyczgnp | NULL              | NULL               |
+--------+----------+-------------+-------------+----------+---------------+----------------+----------------+-------------+---------+---------------+---------------------------------+-------------------+--------------------+
1 row in set (0.00 sec)

wikiadmin@10.64.16.191(dewikiversity)> select * from slots where slot_revision_id = 274772;
+------------------+--------------+-----------------+-------------+
| slot_revision_id | slot_role_id | slot_content_id | slot_origin |
+------------------+--------------+-----------------+-------------+
|           274772 |            1 |          234228 |      274772 |
+------------------+--------------+-----------------+-------------+
1 row in set (0.00 sec)

wikiadmin@10.64.16.191(dewikiversity)> select * from content where content_id = 234228;
+------------+--------------+---------------------------------+---------------+-----------------+
| content_id | content_size | content_sha1                    | content_model | content_address |
+------------+--------------+---------------------------------+---------------+-----------------+
|     234228 |          352 | o9mhxk86c2bxcvymcwsiirc2vyczgnp |             4 | tt:269201       |
+------------+--------------+---------------------------------+---------------+-----------------+
1 row in set (0.00 sec)
wikiadmin@10.64.16.191(dewikiversity)> select * from content_models where model_id = 4;
+----------+------------+
| model_id | model_name |
+----------+------------+
|        4 | flow-board |
+----------+------------+
1 row in set (0.00 sec)
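
The lookups above can be collapsed into a single join that follows the page's latest revision through slots and content to content_models. The following is a minimal sketch in Python, using an in-memory SQLite database as a stand-in for the production tables; only the columns needed for the lookup are modelled, and the rows are copied from the query output above. It illustrates the lookup chain only and is not how MediaWiki itself resolves content models.

```python
import sqlite3

# In-memory stand-in for the relevant tables (reduced to the columns used here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER, page_latest INTEGER, page_content_model TEXT);
CREATE TABLE slots (slot_revision_id INTEGER, slot_role_id INTEGER, slot_content_id INTEGER);
CREATE TABLE content (content_id INTEGER, content_model INTEGER, content_address TEXT);
CREATE TABLE content_models (model_id INTEGER, model_name TEXT);
INSERT INTO page VALUES (47279, 274772, 'wikitext');
INSERT INTO slots VALUES (274772, 1, 234228);
INSERT INTO content VALUES (234228, 4, 'tt:269201');
INSERT INTO content_models VALUES (4, 'flow-board');
""")

# Follow page -> slots -> content -> content_models for the latest revision.
QUERY = """
SELECT p.page_id, p.page_content_model, cm.model_name, c.content_address
FROM page p
JOIN slots s ON s.slot_revision_id = p.page_latest
JOIN content c ON c.content_id = s.slot_content_id
JOIN content_models cm ON cm.model_id = c.content_model
WHERE p.page_id = ?
"""

row = conn.execute(QUERY, (47279,)).fetchone()
page_id, page_model, slot_model, address = row
print(row)
print("model mismatch:", page_model != slot_model)
```

For this page the page row says wikitext while the slot's content row resolves to flow-board, which is the mismatch the exception is complaining about.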

I suspect that T207626 plus an interaction with RevisionStore, newly used in dumps, is at the root of this.

https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/499402/ will work around this issue once wmf.25 lands; if there had not been issues with the train it would have already landed for this wiki.

daniel added a subscriber: daniel. Edited Apr 10 2019, 1:34 PM

The fix depends on whether the content of that revision actually is a serialized flow board.

  • if it is, then this is an instance of the general problem of undeploying content handlers. I suppose registering a dummy content handler that passes through any data unchanged but only renders a message would be a viable approach. I have added T220608 for that.
  • if it is not, the correct thing would be to fix the database - that is, manually set the content_model field to the id of the correct content model.
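
The first option, a dummy handler that passes data through unchanged and only renders a message, could look roughly like the sketch below. This is a Python illustration of the idea only: the class names, the registry, and the `allow_fallback` switch are all hypothetical and do not correspond to MediaWiki's ContentHandler API.

```python
class UnknownModelError(LookupError):
    """Raised when no handler is registered and the fallback is disabled."""


class WikitextHandler:
    """Toy handler for a model that is registered."""

    def serialize(self, data: bytes) -> bytes:
        return data

    def render(self, data: bytes) -> str:
        return data.decode("utf-8")


class FallbackHandler:
    """Hypothetical dummy handler: stored data passes through untouched,
    and rendering produces only a notice instead of the content."""

    def __init__(self, model: str) -> None:
        self.model = model

    def serialize(self, data: bytes) -> bytes:
        return data

    def render(self, data: bytes) -> str:
        return f"Content model '{self.model}' is not available on this wiki."


# Hypothetical registry of the models that are still deployed.
HANDLERS = {"wikitext": WikitextHandler()}


def get_handler(model: str, allow_fallback: bool = True):
    """Return the registered handler, or the dummy handler for unknown models."""
    handler = HANDLERS.get(model)
    if handler is not None:
        return handler
    if allow_fallback:
        return FallbackHandler(model)
    raise UnknownModelError(model)


handler = get_handler("flow-board")
print(handler.render(b"anything"))
```

With the fallback disabled, the lookup still fails loudly, which matches the behaviour seen in the stack trace.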

wikiadmin@10.64.0.205(dewikiversity)> select * from text where old_id = 269201;
+--------+---------------+-----------+-----------------------+-------------+----------+---------------+---------------+----------------+---------------------+-------------------+
| old_id | old_namespace | old_title | old_text              | old_comment | old_user | old_user_text | old_timestamp | old_minor_edit | old_flags           | inverse_timestamp |
+--------+---------------+-----------+-----------------------+-------------+----------+---------------+---------------+----------------+---------------------+-------------------+
| 269201 |               |           | DB://cluster22/121006 |             |          |               |               |                | utf-8,gzip,external |                   |
+--------+---------------+-----------+-----------------------+-------------+----------+---------------+---------------+----------------+---------------------+-------------------+
1 row in set (0.00 sec)

I have gone to look at 121006 on cluster 22:

mysql:root@localhost [dewikiversity]> select blob_id, TO_BASE64(blob_text) from blobs_cluster22 where blob_id = 121006;
+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| blob_id | TO_BASE64(blob_text)                                                                                                                                                                                                                                                            |
+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|  121006 | bY2xCsIwFEX3fsUDN7fWrZBBQUS0ohRdxKGaq40JqaQJYtU/8VPc/DFbKWLV7d1zH+cyRo/bGmat
xEZS4YhD01CnibLlwZFTlHGncmLMY4zVFfllboDgG3Rq8Ec/htGFgPoZaL0r8sPPFDRSJ6y8C5gj
NIcFzRxUZXOa0xHCwmwf99RULY2Fli95u357kcrfBME3KFc8b7kcJRa7zAiE8SZ16mCyPaQNp+kp
F5JiyHI0Mbl1W9Dwcj7H8960O+hPulH/el2tng== |
+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Base64-decoding and inflating this gives:

== \xc3\x9cberblick zu den Inhalten des Moduls ==\n=== Inhalt 1 ===\n=== Inhalt 2 ===\n=== Inhalt 3 ===\n== \xc3\x9cberblick zu den Lernzielen des Moduls ==\n# Lernziel 1:\n# Lernziel 2:\n# Lernziel 3:\n== Verwendete Quellen und weiterf\xc3\xbchrende Links ==\n* Quelle Link 1:\n* Quelle Link 2:\n* Quelle Link 3:\n\n\n[[Kategorie:Schulprojekt:Physik Sekundarstufe I|{{SUBPAGENAME}}]]

That looks like raw wikitext to me.
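
The decoding step can be reproduced with a few lines of Python. This is a minimal sketch, assuming the blob is a raw DEFLATE stream (what PHP's gzdeflate produces, with no zlib or gzip header), hence the negative `wbits`. The helper names are made up for illustration, and the sample text is a short stand-in rather than the blob from the query above.

```python
import base64
import zlib


def decode_blob(b64_text: str) -> str:
    """Base64-decode, then inflate a raw DEFLATE stream into UTF-8 text."""
    # b64decode ignores the line breaks that TO_BASE64() inserts.
    raw = base64.b64decode(b64_text)
    # Negative wbits means: raw DEFLATE, no zlib/gzip header or checksum.
    return zlib.decompress(raw, wbits=-zlib.MAX_WBITS).decode("utf-8")


def encode_blob(text: str) -> str:
    """Inverse of decode_blob, used here only to build a test input."""
    compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
    raw = compressor.compress(text.encode("utf-8")) + compressor.flush()
    # encodebytes wraps its output in lines, similar to TO_BASE64().
    return base64.encodebytes(raw).decode("ascii")


sample = "== Überblick zu den Inhalten des Moduls ==\n=== Inhalt 1 ===\n"
print(decode_blob(encode_blob(sample)))
```

Passing the TO_BASE64 output from the query to `decode_blob` should yield the wikitext shown above, provided the compression assumption holds.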

Welp, the try block in that change (https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/499402/) is not large enough: the line where we get the content model, which is the one that fails, comes just before it. It seems to me that a missing content model is a serious enough issue that I want to hear about it by having things break.
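
The try-block problem can be illustrated in isolation. The sketch below is Python, not the XmlDumpWriter code, and every name in it is hypothetical; it only shows why an exception raised on the line before a try block is never caught by it. The wider variant is included for contrast only, since the comment above argues for letting the failure stay visible.

```python
class UnknownContentModel(Exception):
    """Stand-in for the exception raised for an unregistered model."""


KNOWN_MODELS = {"wikitext"}


def get_model(slot: dict) -> str:
    """Fail for models that are not registered."""
    if slot["model"] not in KNOWN_MODELS:
        raise UnknownContentModel(slot["model"])
    return slot["model"]


def write_revision_narrow(slot: dict) -> tuple:
    # The model lookup sits outside the try block, so its exception escapes.
    model = get_model(slot)
    try:
        text = slot["text"]
    except UnknownContentModel:
        text = ""
    return model, text


def write_revision_wide(slot: dict) -> tuple:
    # Here the lookup is inside the try block, so the failure is swallowed.
    try:
        model = get_model(slot)
        text = slot["text"]
    except UnknownContentModel:
        model, text = "unknown", ""
    return model, text


bad_slot = {"model": "flow-board", "text": "== Inhalt =="}
print(write_revision_wide(bad_slot))
try:
    write_revision_narrow(bad_slot)
except UnknownContentModel as exc:
    print("narrow variant still fails for model:", exc)
```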

That leaves us with manually updating the db (which ought to happen in any case); who should I tag for that?

Brave people?....

Maybe ask @tstarling :)
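
For reference, the manual database fix under discussion amounts to pointing the content row at the id of the correct model. The sketch below shows the shape of that update in Python against an in-memory SQLite stand-in. The model id 1 for wikitext is an assumption made for the example (the real id on dewikiversity is not shown in this task), the function name is hypothetical, and nothing here is meant to be run against production.

```python
import sqlite3

# In-memory stand-in; the wikitext model id (1) is assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE content (content_id INTEGER, content_model INTEGER);
CREATE TABLE content_models (model_id INTEGER, model_name TEXT);
INSERT INTO content_models VALUES (1, 'wikitext');
INSERT INTO content_models VALUES (4, 'flow-board');
INSERT INTO content VALUES (234228, 4);
""")


def reassign_model(db: sqlite3.Connection, content_id: int, model_name: str) -> int:
    """Point one content row at the id of the named model; return rows changed."""
    found = db.execute(
        "SELECT model_id FROM content_models WHERE model_name = ?",
        (model_name,),
    ).fetchone()
    if found is None:
        raise LookupError(f"no such content model: {model_name}")
    cursor = db.execute(
        "UPDATE content SET content_model = ? WHERE content_id = ?",
        (found[0], content_id),
    )
    db.commit()
    return cursor.rowcount


changed = reassign_model(conn, 234228, "wikitext")
print("rows changed:", changed)
```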

ArielGlenn moved this task from Backlog to Active on the Dumps-Generation board. Apr 15 2019, 9:54 AM

UnknownContentHandler now exists; it can be used like this:

$wgContentHandlers['xyzzy'] = 'UnknownContentHandler';

Change 614592 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@master] Consistently use UnknownContent to handle unknown content.

https://gerrit.wikimedia.org/r/614592

Change 614592 merged by jenkins-bot:
[mediawiki/core@master] Create fallback for undefined content models.

https://gerrit.wikimedia.org/r/614592