MWException from line 974 of WikiImporter.php: Missing text field in import
Open, Needs Triage, Public

Description

When running import-wikitech.sh manually for T292342: Wikitech and wikitech-static out of sync...

23900 (61.26 pages/sec 61.26 revs/sec)
24000 (60.81 pages/sec 60.81 revs/sec)
Done!
You might want to run rebuildrecentchanges.php to regenerate RecentChanges,
and initSiteStats.php to update page and revision counts
100 (40.26 pages/sec 40.26 revs/sec)
200 (69.97 pages/sec 69.97 revs/sec)
MWException from line 974 of /srv/mediawiki/w/includes/import/WikiImporter.php: Missing text field in import.
#0 /srv/mediawiki/w/includes/import/WikiImporter.php(1134): WikiImporter->makeContent(Object(Title), '830', Array)
#1 /srv/mediawiki/w/includes/import/WikiImporter.php(1111): WikiImporter->processUpload(Array, Array)
#2 /srv/mediawiki/w/includes/import/WikiImporter.php(863): WikiImporter->handleUpload(Array)
#3 /srv/mediawiki/w/includes/import/WikiImporter.php(678): WikiImporter->handlePage()
#4 /srv/mediawiki/w/maintenance/importDump.php(353): WikiImporter->doImport()
#5 /srv/mediawiki/w/maintenance/importDump.php(286): BackupReader->importFromHandle(Resource id #741)
#6 /srv/mediawiki/w/maintenance/importDump.php(130): BackupReader->importFromFile('compress.zlib:/...')
#7 /srv/mediawiki/w/maintenance/doMaintenance.php(112): BackupReader->execute()
#8 /srv/mediawiki/w/maintenance/importDump.php(358): require_once('/srv/mediawiki/...')
#9 {main}
Rebuilding $wgRCMaxAge=7776000 seconds (90 days)

I'm guessing it comes from this step: php maintenance/importDump.php --uploads /srv/imports/labswiki-${DATE}.xml.gz

Event Timeline

I don't know who "owns" the wikitech sync setup these days. WMCS nominally, I suppose?

Yeah.

Not sure at this point whether there's something odd in either the export or the import script (i.e. a MW bug, which seems more likely), or whether it's something specific to wikitech/wikitech-static.

I hit this bug after applying https://gerrit.wikimedia.org/r/c/mediawiki/core/+/787858 to REL1_35 (which was merged due to T288423). However, the patch provided in T288423#7268918 didn't trigger this bug.

Workaround on REL1_35 (provided you're running REL1_35 with change 787858 merged):

diff --git a/includes/import/WikiImporter.php b/includes/import/WikiImporter.php
index 3b53a1703d..3908082106 100644
--- a/includes/import/WikiImporter.php
+++ b/includes/import/WikiImporter.php
@@ -1070,12 +1070,13 @@ class WikiImporter {
                $revision = new WikiRevision( $this->config );
                $revId = $pageInfo['id'];
                $title = $pageInfo['_title'];
-               $content = $this->makeContent( $title, $revId, $uploadInfo );
+               // $content = $this->makeContent( $title, $revId, $uploadInfo );
+               $text = $uploadInfo['text'] ?? '';

                $revision->setTitle( $title );
                $revision->setID( $revId );
                $revision->setTimestamp( $uploadInfo['timestamp'] );
-               $revision->setContent( SlotRecord::MAIN, $content );
+               $revision->setContent( SlotRecord::MAIN, ContentHandler::makeContent( $text, $title ) );
                $revision->setFilename( $uploadInfo['filename'] );
                if ( isset( $uploadInfo['archivename'] ) ) {
                        $revision->setArchiveName( $uploadInfo['archivename'] );

What is the impact of this workaround? No revision history on file pages?

PoC (from dumpBackup.php with --include-files and --uploads, images were imported on this wiki from another wiki): P30997

The exception is thrown in WikiImporter->makeContent() because $uploadInfo does not contain a 'text' key. In the past, a 'text' key was added if the <upload> element didn't have a 'text' tag, but this has no longer been the case since 787858. $uploadInfo is constructed in WikiImporter->handleUpload() from the contents of the <upload> element, and since there is no 'text' tag inside the <upload> element, no 'text' key is added to $uploadInfo either.

WikiExporter->writeUpload() adds almost all of the tags defined in $normalFields inside WikiImporter->handleUpload() to the <upload> element, except for 'text'. Allegedly, the 'text' tag should have existed since the introduction of image imports (2008), but even back then it wasn't there. I haven't been able to find a MediaWiki version where the <upload> element contained a 'text' tag. The "set text to '' if it's not found" code was a workaround that was introduced in r83233.
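
To make the failure path described above concrete, here is a minimal standalone sketch. It does not use MediaWiki: buildUploadInfoSketch() and makeContentSketch() are hypothetical stand-ins for handleUpload() and makeContent(), the field list is an assumed subset of $normalFields, and the upload values are made up.

```php
<?php
// Illustrative sketch only; these functions are hypothetical stand-ins,
// not the real WikiImporter methods.

/**
 * Copy recognised tags of an <upload> element into an array, roughly the
 * way handleUpload() is described as building $uploadInfo.
 * $tags maps tag name => tag content for the tags present in the dump.
 */
function buildUploadInfoSketch( array $tags ): array {
	// Assumed subset of recognised fields; the real $normalFields may differ.
	$normalFields = [ 'timestamp', 'comment', 'filename', 'text', 'src', 'size' ];
	$uploadInfo = [];
	foreach ( $tags as $name => $value ) {
		if ( in_array( $name, $normalFields, true ) ) {
			$uploadInfo[$name] = $value;
		}
	}
	return $uploadInfo;
}

/** Refuse to build content when no 'text' key is present. */
function makeContentSketch( array $contentInfo ): string {
	if ( !isset( $contentInfo['text'] ) ) {
		throw new RuntimeException( 'Missing text field in import.' );
	}
	return $contentInfo['text'];
}

// An <upload> entry as the exporter writes it: no 'text' tag (made-up values).
$tagsFromDump = [
	'timestamp' => '2022-01-01T00:00:00Z',
	'filename' => 'Example.png',
	'size' => '1234',
];

$uploadInfo = buildUploadInfoSketch( $tagsFromDump );

try {
	makeContentSketch( $uploadInfo );
	$outcome = 'imported';
} catch ( RuntimeException $e ) {
	$outcome = $e->getMessage();
}
echo $outcome, "\n";
```

Since the dump never supplies a 'text' tag, the key is never set, and the check in the content step is the first place that notices.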

Workaround
Setting the 'text' key to an empty string avoids errors. This is a partial restore of this patch.

diff --git a/includes/import/WikiImporter.php b/includes/import/WikiImporter.php
index d40a39c..ce58c0a 100644
--- a/includes/import/WikiImporter.php
+++ b/includes/import/WikiImporter.php
@@ -1209,6 +1209,7 @@ class WikiImporter {
 		$revision = new WikiRevision( $this->config );
 		$revId = $pageInfo['id'];
 		$title = $pageInfo['_title'];
+		$uploadInfo['text'] = $uploadInfo['text'] ?? '';
 		$content = $this->makeContent( $title, $revId, $uploadInfo );
 
 		$revision->setTitle( $title );
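
As a rough illustration of what the added line does, the sketch below applies the same null-coalescing default to a plain array. withDefaultText() is a hypothetical helper, not part of MediaWiki, and the array contents are made up.

```php
<?php
// Hypothetical helper, not part of MediaWiki: applies the same default
// as the one-line workaround in the diff above.
function withDefaultText( array $uploadInfo ): array {
	// ?? falls back to '' when the key is missing or null,
	// and leaves any existing string untouched.
	$uploadInfo['text'] = $uploadInfo['text'] ?? '';
	return $uploadInfo;
}

$missing = withDefaultText( [ 'filename' => 'Example.png' ] );
$present = withDefaultText( [ 'filename' => 'Example.png', 'text' => 'File description' ] );
$wasNull = withDefaultText( [ 'filename' => 'Example.png', 'text' => null ] );

// isset() is true for an empty string, so a "missing text" check passes.
var_dump( isset( $missing['text'] ) );
var_dump( $present['text'] );
```

An upload entry that does carry text keeps it; only entries without a 'text' key (or with a null one) end up with an empty string.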

Change 812521 had a related patch set uploaded (by Southparkfan; author: Southparkfan):

[mediawiki/core@master] WikiImporter: do not fail if upload entry in dump misses 'text' tag

https://gerrit.wikimedia.org/r/812521

Change 812521 merged by jenkins-bot:

[mediawiki/core@master] WikiImporter: do not fail if upload entry in dump lacks 'text' tag

https://gerrit.wikimedia.org/r/812521

Change 890451 had a related patch set uploaded (by Reedy; author: Southparkfan):

[mediawiki/core@REL1_39] WikiImporter: do not fail if upload entry in dump lacks 'text' tag

https://gerrit.wikimedia.org/r/890451

Change 890452 had a related patch set uploaded (by Reedy; author: Southparkfan):

[mediawiki/core@REL1_38] WikiImporter: do not fail if upload entry in dump lacks 'text' tag

https://gerrit.wikimedia.org/r/890452

Change 890453 had a related patch set uploaded (by Reedy; author: Southparkfan):

[mediawiki/core@REL1_35] WikiImporter: do not fail if upload entry in dump lacks 'text' tag

https://gerrit.wikimedia.org/r/890453

Change 890453 merged by jenkins-bot:

[mediawiki/core@REL1_35] WikiImporter: do not fail if upload entry in dump lacks 'text' tag

https://gerrit.wikimedia.org/r/890453

Change 890452 merged by jenkins-bot:

[mediawiki/core@REL1_38] WikiImporter: do not fail if upload entry in dump lacks 'text' tag

https://gerrit.wikimedia.org/r/890452

Change 890451 merged by jenkins-bot:

[mediawiki/core@REL1_39] WikiImporter: do not fail if upload entry in dump lacks 'text' tag

https://gerrit.wikimedia.org/r/890451