PageTriage not truncating data correctly before inserting into ptrpt_value
Closed, Resolved, Public

Description

We are seeing:

2020-05-29 09:56:16 jobrunner1 kakukokkawiki: [c05d6c15f744c23b4c079eda] [no req]   Wikimedia\Rdbms\DBQueryError from line 1603 of /srv/mediawiki/w/includes/libs/rdbms/database/Database.php: A database query error has occurred. Did you forget to run your application's database schema updater after upgrading? 
Query: REPLACE INTO `pagetriage_page_tags` (ptrpt_page_id,ptrpt_tag_id,ptrpt_value) VALUES ('5948','9','ロゴ サイト 運営 説明 団体 代表 管理者 百科 wikiwiki 新旧Wiki運営班 なかなか 通称旧Wiki。当Wikiへの移行が推進されているが、実際は新Wiki文法の困難さにより停滞気味でもある。 Miraheze 想像地図研究所 コミュニティ jimdo 日本架空国家協会(JAINA)…')
Function: MediaWiki\Extension\PageTriage\ArticleCompile\ArticleCompileProcessor::save
Error: 1406 Data too long for column 'ptrpt_value' at row 1 (dbt1.miraheze.org)

#0 /srv/mediawiki/w/includes/libs/rdbms/database/Database.php(1574): Wikimedia\Rdbms\Database->getQueryExceptionAndLog(string, integer, string, string)
#1 /srv/mediawiki/w/includes/libs/rdbms/database/Database.php(1152): Wikimedia\Rdbms\Database->reportQueryError(string, integer, string, string, boolean)
#2 /srv/mediawiki/w/includes/libs/rdbms/database/Database.php(2924): Wikimedia\Rdbms\Database->query(string, string)
#3 /srv/mediawiki/w/includes/libs/rdbms/database/DatabaseMysqlBase.php(452): Wikimedia\Rdbms\Database->nativeReplace(string, array, string)
#4 /srv/mediawiki/w/includes/libs/rdbms/database/DBConnRef.php(68): Wikimedia\Rdbms\DatabaseMysqlBase->replace(string, array, array, string)
#5 /srv/mediawiki/w/includes/libs/rdbms/database/DBConnRef.php(490): Wikimedia\Rdbms\DBConnRef->__call(string, array)
#6 /srv/mediawiki/w/extensions/PageTriage/includes/ArticleCompile/ArticleCompileProcessor.php(352): Wikimedia\Rdbms\DBConnRef->replace(string, array, array, string)
#7 /srv/mediawiki/w/extensions/PageTriage/includes/ArticleCompile/ArticleCompileProcessor.php(230): MediaWiki\Extension\PageTriage\ArticleCompile\ArticleCompileProcessor->save()
#8 /srv/mediawiki/w/extensions/PageTriage/includes/CompileArticleMetadataJob.php(48): MediaWiki\Extension\PageTriage\ArticleCompile\ArticleCompileProcessor->compileMetadata()
#9 /srv/mediawiki/w/includes/jobqueue/JobRunner.php(299): MediaWiki\Extension\PageTriage\CompileArticleMetadataJob->run()
#10 /srv/mediawiki/w/includes/jobqueue/JobRunner.php(192): JobRunner->executeJob(MediaWiki\Extension\PageTriage\CompileArticleMetadataJob, Wikimedia\Rdbms\LBFactoryMulti, BufferingStatsdDataFactory, integer)
#11 /srv/mediawiki/w/maintenance/runJobs.php(92): JobRunner->run(array)
#12 /srv/mediawiki/w/maintenance/doMaintenance.php(99): RunJobs->execute()
#13 /srv/mediawiki/w/maintenance/runJobs.php(129): require_once(string)
#14 {main}

in the logs.

This is because we have strict mode enabled in MySQL, so a value that is too long for the column is rejected with an error instead of being silently truncated on insert.
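
A minimal sketch of why a character-based limit can still overflow the column, assuming `ptrpt_value` is limited in bytes rather than characters (the column definition is not shown in this task) and using a made-up 150-character snippet:

```php
<?php
// Hypothetical snippet: 150 copies of U+3042 (HIRAGANA LETTER A),
// which takes 3 bytes in UTF-8.
$snippet = str_repeat( "\u{3042}", 150 );

// Character count: what a limit expressed in characters keeps under control.
$chars = mb_strlen( $snippet, 'UTF-8' );

// Byte count: what a byte-limited column checks on insert.
$bytes = strlen( $snippet );

echo "characters: $chars, bytes: $bytes\n";
```

With ASCII-only text the two counts are identical, which would explain why the error only shows up on wikis with multi-byte content.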

Event Timeline

Restricted Application added subscribers: RhinosF1, Reception123, Aklapper.

Change 599947 had a related patch set uploaded (by Paladox; owner: Paladox):
[mediawiki/extensions/PageTriage@master] Convert ptrpt_value to a MEDIUMBLOB

https://gerrit.wikimedia.org/r/599947

Reedy renamed this task from "Error: 1406 Data too long for column 'ptrpt_value' at row 1" to "PageTriage not truncating data correctly before inserting into ptrpt_value". May 29 2020, 8:53 PM

I note the truncation doesn't seem to be working for other charsets?

At the end of ArticleCompileSnippet::generateArticleSnippet there is

		return $wgLang->truncateForVisual( $text, 150 );

But it doesn't seem to be called anywhere...

Ah, I see: it's called indirectly, through excessive abstraction, by instantiating a class from a stringified class name...

				$compClass = 'MediaWiki\Extension\PageTriage\ArticleCompile\ArticleCompile' . $key;
				/** @var ArticleCompileInterface $comp */
				$comp = new $compClass( $this->pageIds, $this->componentDb[$key], $this->articles,
					$this->linksUpdates
				);
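
To illustrate why a plain search for the class name finds no call site, here is a small self-contained sketch of the same pattern. The `Demo` namespace, the `Component` interface and both classes are invented for this example and are not part of PageTriage:

```php
<?php
namespace Demo;

interface Component {
	public function name(): string;
}

class ComponentSnippet implements Component {
	public function name(): string {
		return 'Snippet';
	}
}

class ComponentBasicData implements Component {
	public function name(): string {
		return 'BasicData';
	}
}

$names = [];
foreach ( [ 'Snippet', 'BasicData' ] as $key ) {
	// The class name only exists as a concatenated string, so the literal
	// "ComponentSnippet" never appears at the point of instantiation.
	// A string passed to `new` must be fully qualified; it is not resolved
	// relative to the current namespace.
	$class = 'Demo\Component' . $key;
	/** @var Component $comp */
	$comp = new $class();
	$names[] = $comp->name();
}

echo implode( ', ', $names ), "\n";
```

Searching for the shared prefix plus the concatenation operator, rather than for the full class name, is one way to locate call sites written like this.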

Change 601461 had a related patch set uploaded (by Kaldari; owner: Kaldari):
[mediawiki/extensions/PageTriage@master] Prevent "Data too long for column" error due to multi-byte characters

https://gerrit.wikimedia.org/r/601461
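
The patch itself is not reproduced in this task. As a rough illustration of the general idea in its title, a byte-aware truncation could look like the sketch below. The helper name `truncateToBytes` and the 255-byte limit are assumptions made for this example, not taken from the patch:

```php
<?php
/**
 * Hypothetical helper: cut a UTF-8 string to at most $maxBytes bytes
 * without splitting a multi-byte character.
 */
function truncateToBytes( string $text, int $maxBytes ): string {
	// mb_strcut() takes its offset and length in bytes, but adjusts the
	// cut so that it falls on a character boundary.
	return mb_strcut( $text, 0, $maxBytes, 'UTF-8' );
}

// One ASCII byte followed by 150 three-byte characters: 451 bytes in total.
$snippet = 'a' . str_repeat( "\u{3042}", 150 );
$stored = truncateToBytes( $snippet, 255 );

echo strlen( $stored ), " bytes, ", mb_strlen( $stored, 'UTF-8' ), " characters\n";
```

MediaWiki's Language class also has a byte-oriented `truncateForDatabase()` alongside `truncateForVisual()`; whether the merged change uses it is not stated here.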

Change 601461 merged by jenkins-bot:
[mediawiki/extensions/PageTriage@master] Prevent "Data too long for column" error due to multi-byte characters

https://gerrit.wikimedia.org/r/601461

Change 601700 had a related patch set uploaded (by Reedy; owner: Kaldari):
[mediawiki/extensions/PageTriage@REL1_34] Prevent "Data too long for column" error due to multi-byte characters

https://gerrit.wikimedia.org/r/601700

Change 599947 abandoned by Reedy:
Convert ptrpt_value to a MEDIUMBLOB

Reason:
In favour of https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/PageTriage/+/601700/

https://gerrit.wikimedia.org/r/599947

Change 601700 merged by jenkins-bot:
[mediawiki/extensions/PageTriage@REL1_34] Prevent "Data too long for column" error due to multi-byte characters

https://gerrit.wikimedia.org/r/601700

kaldari claimed this task.