This breaks stubs dumps for a number of wikis.
A tale of two pages:
Page 1 is 8821 on sa wikisource, and here is its info:
wikiadmin@10.64.48.35(sawikisource)> select page_id, page_namespace, page_title, page_latest, page_is_redirect from page where page_id = 8821;
+---------+----------------+--------------------------------------------+-------------+------------------+
| page_id | page_namespace | page_title                                 | page_latest | page_is_redirect |
+---------+----------------+--------------------------------------------+-------------+------------------+
|    8821 |            104 | Kumarasambhavam_-_Mallinatha_-_1888.djvu/5 |      157361 |                0 |
+---------+----------------+--------------------------------------------+-------------+------------------+
Page 2 is 8829 on sa wikisource and here is its info:
wikiadmin@10.64.48.35(sawikisource)> select page_id, page_namespace, page_title, page_latest, page_is_redirect from page where page_id = 8829;
+---------+----------------+------------------------------------------------------------------+-------------+------------------+
| page_id | page_namespace | page_title                                                       | page_latest | page_is_redirect |
+---------+----------------+------------------------------------------------------------------+-------------+------------------+
|    8829 |              0 | पृष्ठम्:Kumarasambhavam_-_Mallinatha_-_1888.djvu/5 |       28418 |                1 |
+---------+----------------+------------------------------------------------------------------+-------------+------------------+
1 row in set (0.00 sec)
Note that the prefix पृष्ठम् is in fact the name of namespace 104. You can check this yourself by fetching https://sa.wikisource.org/w/api.php?action=query&meta=siteinfo&siprop=general%7Cnamespaces%7Cnamespacealiases%7Cstatistics and JSON-decoding the response. (Or maybe there's a faster way.)
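For anyone who wants to script that check, here is a rough Python sketch of the decoding step only. It assumes the response was fetched with `&format=json` appended to the URL above, and that the local namespace name sits under the `"*"` key (or `"name"` with `formatversion=2`). The `SAMPLE` string is a hand-written fragment shaped like the API response, not real output, and `namespace_name` is a made-up helper name.

```python
import json


def namespace_name(siteinfo_json: str, ns_id: int) -> str:
    """Return the local name of a namespace from a siteinfo JSON response."""
    data = json.loads(siteinfo_json)
    # The namespaces map is keyed by the namespace id as a string.
    ns = data["query"]["namespaces"][str(ns_id)]
    # Assumption: "name" is used with formatversion=2, "*" otherwise.
    return ns.get("name", ns.get("*"))


# Hand-written sample shaped like the API response (illustrative only).
SAMPLE = (
    '{"query": {"namespaces": {'
    '"0": {"id": 0, "*": ""}, '
    '"104": {"id": 104, "*": "पृष्ठम्"}'
    "}}}"
)

name_104 = namespace_name(SAMPLE, 104)
print(name_104)
```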
When we dump the stubs, we wind up grabbing a number of pages in a batch rather than asking the db for each one separately. Speed and all that. In the current case we ask for a batch that includes the range 8821 through 8829; these all have very few revisions so it's no burden for the servers. BUT...
We process all the revisions up to the first one for page 8829. Items go into the link cache during processing.
Now we start work on page 8829:
- openPage makes a Title from the selected row.
- If the page is a redirect, it will get a WikiPage object for the Title object and then call that class's getRedirectTarget().
- This method first checks to see if the page is a redirect, logical enough: if ( !$this->mTitle->isRedirect() ) {
- Title->isRedirect() calls getArticleID() on the Title object, with an argument of 0.
- Because at this point the article ID has not been set in the Title object, we wind up at $this->mArticleID = $linkCache->addLinkObj( $this );
- We're going to look up the info in the link cache. Guess what the key is: पृष्ठम्:Kumarasambhavam_-_Mallinatha_-_1888.djvu/5
- And guess what page id it has: 8821.
BOOM.
This gets caught in the constructor of RevisionStoreRecord, which gets passed a revision row for page id 8829 along with a Title that now claims to be for page id 8821.
[7381d9d77b7aef61403caffe] [no req] InvalidArgumentException from line 100 of /srv/mediawiki_atg/php-1.33.0-wmf.23/includes/Revision/RevisionStoreRecord.php: The given Title does not belong to page ID 8829 but actually belongs to 8821
Backtrace:
#0 /srv/mediawiki_atg/php-1.33.0-wmf.23/includes/Revision/RevisionStore.php(1820): MediaWiki\Revision\RevisionStoreRecord->__construct(Title, User, CommentStoreComment, stdClass, MediaWiki\Revision\RevisionSlots, boolean)
#1 /srv/mediawiki_atg/php-1.33.0-wmf.23/includes/export/XmlDumpWriter.php(332): MediaWiki\Revision\RevisionStore->newRevisionFromRow(stdClass, integer, Title)
#2 /srv/mediawiki_atg/php-1.33.0-wmf.23/includes/export/WikiExporter.php(485): XmlDumpWriter->writeRevision(stdClass)
#3 /srv/mediawiki_atg/php-1.33.0-wmf.23/includes/export/WikiExporter.php(445): WikiExporter->outputPageStreamBatch(Wikimedia\Rdbms\ResultWrapper, stdClass)
#4 /srv/mediawiki_atg/php-1.33.0-wmf.23/includes/export/WikiExporter.php(269): WikiExporter->dumpPages(string, boolean)
#5 /srv/mediawiki_atg/php-1.33.0-wmf.23/includes/export/WikiExporter.php(154): WikiExporter->dumpFrom(string, boolean)
#6 /srv/mediawiki_atg/php-1.33.0-wmf.23/maintenance/includes/BackupDumper.php(288): WikiExporter->pagesByRange(integer, integer, boolean)
#7 /srv/mediawiki_atg/php-1.33.0-wmf.23/maintenance/dumpBackup.php(81): BackupDumper->dump(integer, integer)
#8 /srv/mediawiki_atg/php-1.33.0-wmf.23/maintenance/doMaintenance.php(94): DumpBackup->execute()
#9 /srv/mediawiki_atg/php-1.33.0-wmf.23/maintenance/dumpBackup.php(138): require_once(string)
#10 /srv/mediawiki_atg/multiversion/MWScript.php(100): require_once(string)
#11 {main}
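To make the collision easier to see outside of MediaWiki, here is a toy Python model of the sequence described above. It is not the real code: the `Title` and `LinkCache` classes, `get_article_id`, and `check_title_matches_row` are simplified stand-ins written for illustration. The key assumption is that the cache is keyed only by the prefixed title text, which is what the walkthrough above suggests.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Namespace id -> local name, limited to the two namespaces in this report.
NAMESPACE_NAMES = {0: "", 104: "पृष्ठम्"}


@dataclass
class Title:
    namespace: int
    dbkey: str
    article_id: Optional[int] = None  # not yet set, as in the failing case

    def prefixed_dbkey(self) -> str:
        prefix = NAMESPACE_NAMES[self.namespace]
        return f"{prefix}:{self.dbkey}" if prefix else self.dbkey


class LinkCache:
    """Toy cache keyed only by prefixed title text (assumption)."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}

    def add_good_link(self, page_id: int, title: Title) -> None:
        self._ids[title.prefixed_dbkey()] = page_id

    def lookup(self, title: Title) -> Optional[int]:
        return self._ids.get(title.prefixed_dbkey())


def get_article_id(title: Title, cache: LinkCache) -> Optional[int]:
    # Stand-in for the lazy lookup: fall back to the cache if the id is unset.
    if title.article_id is None:
        title.article_id = cache.lookup(title)
    return title.article_id


def check_title_matches_row(title: Title, row_page_id: int, cache: LinkCache) -> None:
    # Stand-in for the sanity check that raised the exception above.
    actual = get_article_id(title, cache)
    if actual != row_page_id:
        raise ValueError(
            f"The given Title does not belong to page ID {row_page_id} "
            f"but actually belongs to {actual}"
        )


cache = LinkCache()

# Page 8821 is processed first and lands in the cache.
page_8821 = Title(104, "Kumarasambhavam_-_Mallinatha_-_1888.djvu/5")
cache.add_good_link(8821, page_8821)

# Page 8829 lives in namespace 0, but its title starts with the ns 104 name.
page_8829 = Title(0, "पृष्ठम्:Kumarasambhavam_-_Mallinatha_-_1888.djvu/5")

same_key = page_8821.prefixed_dbkey() == page_8829.prefixed_dbkey()
resolved_id = get_article_id(page_8829, cache)
```

In this model both titles produce the same cache key, so the Title for page 8829 picks up page 8821's id, and calling `check_title_matches_row(page_8829, 8829, cache)` raises an error with the same wording as the exception above. Setting `article_id=8829` from the row when constructing the Title avoids the lookup in the toy model; whether that is an appropriate fix in the real code is a separate question.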
I have no idea what the right fix is.