
Invariant failed: Bad UTF-8 at end of string (2 byte sequence)
Closed, Resolved | Public | PRODUCTION ERROR

Description

Error

MediaWiki version: 1.35.0-wmf.14

message
Invariant failed: Bad UTF-8 at end of string (2 byte sequence)

Impact

Notes

Maybe should have been addressed as part of T237318

Details

Request ID
XhUcygpAAEYAAEyA0vQAAABJ
Request URL
https://sr.wiktionary.org/w/rest.php/sr.wiktionary.org/v3/page/pagebundle/%D1%80%D0%B8%D0%B1%D0%B8/582328
Stack Trace
exception.trace
#0 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Utils/PHPUtils.php(238): Wikimedia\Assert\Assert::invariant(boolean, string)
#1 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/PP/Processors/WrapSections.php(33): Parsoid\Utils\PHPUtils::safeSubstr(string, integer, integer)
#2 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/PP/Processors/WrapSections.php(329): Parsoid\Wt2Html\PP\Processors\WrapSections->getSrc(Parsoid\Wt2Html\PageConfigFrame, integer, integer)
#3 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/PP/Processors/WrapSections.php(452): Parsoid\Wt2Html\PP\Processors\WrapSections->resolveTplExtSectionConflicts(array)
#4 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/DOMPostProcessor.php(151): Parsoid\Wt2Html\PP\Processors\WrapSections->run(DOMElement, Parsoid\Config\Env, array, boolean)
#5 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/DOMPostProcessor.php(829): Parsoid\Wt2Html\DOMPostProcessor->Parsoid\Wt2Html\{closure}(DOMElement, Parsoid\Config\Env, array, boolean)
#6 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/DOMPostProcessor.php(882): Parsoid\Wt2Html\DOMPostProcessor->doPostProcess(DOMDocument)
#7 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/DOMPostProcessor.php(899): Parsoid\Wt2Html\DOMPostProcessor->process(DOMDocument)
#8 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/ParserPipeline.php(148): Parsoid\Wt2Html\DOMPostProcessor->processChunkily(string, array)
#9 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/ParserPipeline.php(198): Parsoid\Wt2Html\ParserPipeline->parseChunkily(string, array)
#10 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/ParserPipelineFactory.php(299): Parsoid\Wt2Html\ParserPipeline->parseToplevelDoc(string, array)
#11 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/WikitextContentModelHandler.php(78): Parsoid\Wt2Html\ParserPipelineFactory->parse(string)
#12 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Parsoid.php(96): Parsoid\WikitextContentModelHandler->toHTML(Parsoid\Config\Env)
#13 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Parsoid.php(127): Parsoid\Parsoid->parseWikitext(MWParsoid\Config\PageConfig, array)
#14 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/extension/src/Rest/Handler/ParsoidHandler.php(586): Parsoid\Parsoid->wikitext2html(MWParsoid\Config\PageConfig, array, NULL)
#15 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/extension/src/Rest/Handler/PageHandler.php(52): MWParsoid\Rest\Handler\ParsoidHandler->wt2html(Parsoid\Config\Env, array)
#16 /srv/mediawiki/php-1.35.0-wmf.14/includes/Rest/Router.php(314): MWParsoid\Rest\Handler\PageHandler->execute()
#17 /srv/mediawiki/php-1.35.0-wmf.14/includes/Rest/Router.php(285): MediaWiki\Rest\Router->executeHandler(MWParsoid\Rest\Handler\PageHandler)
#18 /srv/mediawiki/php-1.35.0-wmf.14/includes/Rest/EntryPoint.php(111): MediaWiki\Rest\Router->execute(MediaWiki\Rest\RequestFromGlobals)
#19 /srv/mediawiki/php-1.35.0-wmf.14/includes/Rest/EntryPoint.php(78): MediaWiki\Rest\EntryPoint->execute()
#20 /srv/mediawiki/php-1.35.0-wmf.14/rest.php(31): MediaWiki\Rest\EntryPoint::main()
#21 /srv/mediawiki/w/rest.php(3): require(string)
#22 {main}

Event Timeline

ssastry triaged this task as Medium priority. Apr 10 2020, 7:57 PM
hashar added a subscriber: hashar.

This regularly spams the log in short bursts, and it seems to always be for sr.wiktionary.org, though on different pagebundles.

exception.trace
#0 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Utils/PHPUtils.php(218): Wikimedia\Assert\Assert::invariant(boolean, string)
#1 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/PP/Processors/WrapSections.php(29): Wikimedia\Parsoid\Utils\PHPUtils::safeSubstr(string, integer, integer)
#2 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/PP/Processors/WrapSections.php(382): Wikimedia\Parsoid\Wt2Html\PP\Processors\WrapSections->getSrc(Wikimedia\Parsoid\Wt2Html\PageConfigFrame, integer, integer)
#3 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/PP/Processors/WrapSections.php(447): Wikimedia\Parsoid\Wt2Html\PP\Processors\WrapSections->resolveTplExtSectionConflicts(array)
#4 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/DOMPostProcessor.php(156): Wikimedia\Parsoid\Wt2Html\PP\Processors\WrapSections->run(Wikimedia\Parsoid\Config\Env, DOMElement, array, boolean)
#5 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/DOMPostProcessor.php(856): Wikimedia\Parsoid\Wt2Html\DOMPostProcessor->Wikimedia\Parsoid\Wt2Html\{closure}(DOMElement, array, boolean)
#6 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/DOMPostProcessor.php(905): Wikimedia\Parsoid\Wt2Html\DOMPostProcessor->doPostProcess(DOMDocument)
#7 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/DOMPostProcessor.php(922): Wikimedia\Parsoid\Wt2Html\DOMPostProcessor->process(DOMDocument)
#8 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/ParserPipeline.php(152): Wikimedia\Parsoid\Wt2Html\DOMPostProcessor->processChunkily(string, array)
#9 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/ParserPipeline.php(202): Wikimedia\Parsoid\Wt2Html\ParserPipeline->parseChunkily(string, array)
#10 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/ParserPipelineFactory.php(299): Wikimedia\Parsoid\Wt2Html\ParserPipeline->parseToplevelDoc(string, array)
#11 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Core/WikitextContentModelHandler.php(78): Wikimedia\Parsoid\Wt2Html\ParserPipelineFactory->parse(string)
#12 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Parsoid.php(152): Wikimedia\Parsoid\Core\WikitextContentModelHandler->toDOM(Wikimedia\Parsoid\Config\Env)
#13 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Parsoid.php(184): Wikimedia\Parsoid\Parsoid->parseWikitext(MWParsoid\Config\PageConfig, array)
#14 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/extension/src/Rest/Handler/ParsoidHandler.php(533): Wikimedia\Parsoid\Parsoid->wikitext2html(MWParsoid\Config\PageConfig, array, NULL)
#15 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/extension/src/Rest/Handler/PageHandler.php(66): MWParsoid\Rest\Handler\ParsoidHandler->wt2html(MWParsoid\Config\PageConfig, array)
#16 /srv/mediawiki/php-1.35.0-wmf.39/includes/Rest/Router.php(362): MWParsoid\Rest\Handler\PageHandler->execute()
#17 /srv/mediawiki/php-1.35.0-wmf.39/includes/Rest/Router.php(317): MediaWiki\Rest\Router->executeHandler(MWParsoid\Rest\Handler\PageHandler)
#18 /srv/mediawiki/php-1.35.0-wmf.39/includes/Rest/EntryPoint.php(139): MediaWiki\Rest\Router->execute(MediaWiki\Rest\RequestFromGlobals)
#19 /srv/mediawiki/php-1.35.0-wmf.39/includes/Rest/EntryPoint.php(106): MediaWiki\Rest\EntryPoint->execute()
#20 /srv/mediawiki/php-1.35.0-wmf.39/rest.php(31): MediaWiki\Rest\EntryPoint::main()
#21 /srv/mediawiki/w/rest.php(3): require(string)
#22 {main}

A common cause of errors like these is bad truncation logic that isn't UTF-8 aware. Historically, it's the main way that invalid UTF-8 has been generated in MediaWiki.
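
To make the truncation failure concrete, here is a minimal Python sketch (not the Parsoid or MediaWiki code, which is PHP). It cuts the UTF-8 bytes of the page title from the request URL, "риби", at a raw byte offset and compares that with a cut that backs up to a character boundary. The function names are hypothetical and exist only for this illustration.

```python
def truncate_bytes_naive(text: str, max_bytes: int) -> bytes:
    """Cut the UTF-8 encoding at a raw byte offset, ignoring character boundaries."""
    return text.encode("utf-8")[:max_bytes]


def truncate_bytes_utf8_aware(text: str, max_bytes: int) -> bytes:
    """Cut at or before max_bytes, backing up so that no character is split."""
    data = text.encode("utf-8")
    if max_bytes >= len(data):
        return data
    end = max(max_bytes, 0)
    # UTF-8 continuation bytes look like 10xxxxxx. If the byte at the cut point
    # is one of them, the cut falls inside a character, so move it back.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Page title from the request URL: 4 Cyrillic characters, 2 bytes each in UTF-8.
word = "риби"
naive = truncate_bytes_naive(word, 3)      # ends with half of a 2-byte sequence
safe = truncate_bytes_utf8_aware(word, 3)  # backs up to the previous boundary
```

Here `naive` ends in the lead byte of a two-byte sequence with no continuation byte after it, which is the condition the invariant message describes, while `safe` holds only the first complete character.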

Yes. In Parsoid it is also caused by bad range information in the DSR/TSR/selective serialization code. Bogus offsets cause bad truncation. This didn't happen much in Parsoid/JS because our offsets were all UTF-16, so they would only trigger errors when they split a UTF-16 surrogate pair (very rare!). They happen much more often in Parsoid/PHP because PHP keeps all offsets in UTF-8, so a bogus offset has about a 50% or 66% chance of causing an encoding error on a non-Latin-script wiki.
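
The rough percentages in the comment above can be checked with a small Python sketch. It assumes "a bogus offset" means an offset chosen uniformly among the code units of the text; that is a simplification for illustration, not a model of how Parsoid actually miscomputes DSR values. The function names are hypothetical.

```python
def utf8_split_fraction(text: str) -> float:
    """Fraction of UTF-8 byte offsets that land inside a multi-byte character."""
    data = text.encode("utf-8")
    if not data:
        return 0.0
    # An offset is unsafe when the byte it points at is a continuation byte.
    inside = sum(1 for b in data if (b & 0xC0) == 0x80)
    return inside / len(data)


def utf16_split_fraction(text: str) -> float:
    """Fraction of 16-bit code-unit offsets that land inside a surrogate pair."""
    data = text.encode("utf-16-le")
    if not data:
        return 0.0
    units = [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]
    # An offset is unsafe when the unit it points at is a low (trailing) surrogate.
    inside = sum(1 for u in units if 0xDC00 <= u <= 0xDFFF)
    return inside / len(units)


cyrillic = "риби"  # 2 bytes per character in UTF-8
cjk = "語語"        # 3 bytes per character in UTF-8

two_byte_risk = utf8_split_fraction(cyrillic)
three_byte_risk = utf8_split_fraction(cjk)
utf16_risk = utf16_split_fraction(cyrillic)
```

Under that assumption, text made of two-byte characters gives one unsafe offset in two, and text made of three-byte characters gives two in three. The same Cyrillic text has no unsafe offsets when measured in 16-bit code units, because only characters outside the Basic Multilingual Plane need surrogate pairs.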

Change 637590 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/services/parsoid@master] Fix buggy DSR computation in section-template conflict resolution code

https://gerrit.wikimedia.org/r/637590

Change 637590 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Fix buggy DSR computation in section-template conflict resolution code

https://gerrit.wikimedia.org/r/637590

Change 646892 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.13.0-a19

https://gerrit.wikimedia.org/r/646892

Change 646892 merged by jenkins-bot:
[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.13.0-a19

https://gerrit.wikimedia.org/r/646892

Change 646771 had a related patch set uploaded (by C. Scott Ananian; owner: Subramanya Sastry):
[mediawiki/vendor@wmf/1.36.0-wmf.21] Bump wikimedia/parsoid to 0.13.0-a19

https://gerrit.wikimedia.org/r/646771

Change 646771 merged by jenkins-bot:
[mediawiki/vendor@wmf/1.36.0-wmf.21] Bump wikimedia/parsoid to 0.13.0-a19

https://gerrit.wikimedia.org/r/646771