Error
MediaWiki version: 1.35.0-wmf.14
Invariant failed: Bad UTF-8 at end of string (2 byte sequence)
Impact
Notes
This should perhaps have been addressed as part of T237318.
#0 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Utils/PHPUtils.php(238): Wikimedia\Assert\Assert::invariant(boolean, string)
#1 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/PP/Processors/WrapSections.php(33): Parsoid\Utils\PHPUtils::safeSubstr(string, integer, integer)
#2 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/PP/Processors/WrapSections.php(329): Parsoid\Wt2Html\PP\Processors\WrapSections->getSrc(Parsoid\Wt2Html\PageConfigFrame, integer, integer)
#3 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/PP/Processors/WrapSections.php(452): Parsoid\Wt2Html\PP\Processors\WrapSections->resolveTplExtSectionConflicts(array)
#4 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/DOMPostProcessor.php(151): Parsoid\Wt2Html\PP\Processors\WrapSections->run(DOMElement, Parsoid\Config\Env, array, boolean)
#5 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/DOMPostProcessor.php(829): Parsoid\Wt2Html\DOMPostProcessor->Parsoid\Wt2Html\{closure}(DOMElement, Parsoid\Config\Env, array, boolean)
#6 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/DOMPostProcessor.php(882): Parsoid\Wt2Html\DOMPostProcessor->doPostProcess(DOMDocument)
#7 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/DOMPostProcessor.php(899): Parsoid\Wt2Html\DOMPostProcessor->process(DOMDocument)
#8 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/ParserPipeline.php(148): Parsoid\Wt2Html\DOMPostProcessor->processChunkily(string, array)
#9 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/ParserPipeline.php(198): Parsoid\Wt2Html\ParserPipeline->parseChunkily(string, array)
#10 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Wt2Html/ParserPipelineFactory.php(299): Parsoid\Wt2Html\ParserPipeline->parseToplevelDoc(string, array)
#11 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/WikitextContentModelHandler.php(78): Parsoid\Wt2Html\ParserPipelineFactory->parse(string)
#12 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Parsoid.php(96): Parsoid\WikitextContentModelHandler->toHTML(Parsoid\Config\Env)
#13 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/src/Parsoid.php(127): Parsoid\Parsoid->parseWikitext(MWParsoid\Config\PageConfig, array)
#14 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/extension/src/Rest/Handler/ParsoidHandler.php(586): Parsoid\Parsoid->wikitext2html(MWParsoid\Config\PageConfig, array, NULL)
#15 /srv/deployment/parsoid/deploy-cache/revs/45a4245d5f122ee190adf5465b36c4741c8bd330/src/extension/src/Rest/Handler/PageHandler.php(52): MWParsoid\Rest\Handler\ParsoidHandler->wt2html(Parsoid\Config\Env, array)
#16 /srv/mediawiki/php-1.35.0-wmf.14/includes/Rest/Router.php(314): MWParsoid\Rest\Handler\PageHandler->execute()
#17 /srv/mediawiki/php-1.35.0-wmf.14/includes/Rest/Router.php(285): MediaWiki\Rest\Router->executeHandler(MWParsoid\Rest\Handler\PageHandler)
#18 /srv/mediawiki/php-1.35.0-wmf.14/includes/Rest/EntryPoint.php(111): MediaWiki\Rest\Router->execute(MediaWiki\Rest\RequestFromGlobals)
#19 /srv/mediawiki/php-1.35.0-wmf.14/includes/Rest/EntryPoint.php(78): MediaWiki\Rest\EntryPoint->execute()
#20 /srv/mediawiki/php-1.35.0-wmf.14/rest.php(31): MediaWiki\Rest\EntryPoint::main()
#21 /srv/mediawiki/w/rest.php(3): require(string)
#22 {main}
This regularly spams the logs in short bursts, and it always seems to be for sr.wiktionary.org, though on different pagebundles.
#0 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Utils/PHPUtils.php(218): Wikimedia\Assert\Assert::invariant(boolean, string)
#1 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/PP/Processors/WrapSections.php(29): Wikimedia\Parsoid\Utils\PHPUtils::safeSubstr(string, integer, integer)
#2 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/PP/Processors/WrapSections.php(382): Wikimedia\Parsoid\Wt2Html\PP\Processors\WrapSections->getSrc(Wikimedia\Parsoid\Wt2Html\PageConfigFrame, integer, integer)
#3 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/PP/Processors/WrapSections.php(447): Wikimedia\Parsoid\Wt2Html\PP\Processors\WrapSections->resolveTplExtSectionConflicts(array)
#4 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/DOMPostProcessor.php(156): Wikimedia\Parsoid\Wt2Html\PP\Processors\WrapSections->run(Wikimedia\Parsoid\Config\Env, DOMElement, array, boolean)
#5 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/DOMPostProcessor.php(856): Wikimedia\Parsoid\Wt2Html\DOMPostProcessor->Wikimedia\Parsoid\Wt2Html\{closure}(DOMElement, array, boolean)
#6 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/DOMPostProcessor.php(905): Wikimedia\Parsoid\Wt2Html\DOMPostProcessor->doPostProcess(DOMDocument)
#7 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/DOMPostProcessor.php(922): Wikimedia\Parsoid\Wt2Html\DOMPostProcessor->process(DOMDocument)
#8 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/ParserPipeline.php(152): Wikimedia\Parsoid\Wt2Html\DOMPostProcessor->processChunkily(string, array)
#9 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/ParserPipeline.php(202): Wikimedia\Parsoid\Wt2Html\ParserPipeline->parseChunkily(string, array)
#10 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Wt2Html/ParserPipelineFactory.php(299): Wikimedia\Parsoid\Wt2Html\ParserPipeline->parseToplevelDoc(string, array)
#11 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Core/WikitextContentModelHandler.php(78): Wikimedia\Parsoid\Wt2Html\ParserPipelineFactory->parse(string)
#12 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Parsoid.php(152): Wikimedia\Parsoid\Core\WikitextContentModelHandler->toDOM(Wikimedia\Parsoid\Config\Env)
#13 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/src/Parsoid.php(184): Wikimedia\Parsoid\Parsoid->parseWikitext(MWParsoid\Config\PageConfig, array)
#14 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/extension/src/Rest/Handler/ParsoidHandler.php(533): Wikimedia\Parsoid\Parsoid->wikitext2html(MWParsoid\Config\PageConfig, array, NULL)
#15 /srv/mediawiki/php-1.35.0-wmf.39/vendor/wikimedia/parsoid/extension/src/Rest/Handler/PageHandler.php(66): MWParsoid\Rest\Handler\ParsoidHandler->wt2html(MWParsoid\Config\PageConfig, array)
#16 /srv/mediawiki/php-1.35.0-wmf.39/includes/Rest/Router.php(362): MWParsoid\Rest\Handler\PageHandler->execute()
#17 /srv/mediawiki/php-1.35.0-wmf.39/includes/Rest/Router.php(317): MediaWiki\Rest\Router->executeHandler(MWParsoid\Rest\Handler\PageHandler)
#18 /srv/mediawiki/php-1.35.0-wmf.39/includes/Rest/EntryPoint.php(139): MediaWiki\Rest\Router->execute(MediaWiki\Rest\RequestFromGlobals)
#19 /srv/mediawiki/php-1.35.0-wmf.39/includes/Rest/EntryPoint.php(106): MediaWiki\Rest\EntryPoint->execute()
#20 /srv/mediawiki/php-1.35.0-wmf.39/rest.php(31): MediaWiki\Rest\EntryPoint::main()
#21 /srv/mediawiki/w/rest.php(3): require(string)
#22 {main}
A common cause of errors like these is truncation logic that isn't UTF-8 aware. Historically, that has been the main way invalid UTF-8 gets generated in MediaWiki.
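For illustration, here is a minimal Python sketch of what UTF-8-aware truncation could look like on a byte string. It is not MediaWiki or Parsoid code; the names `is_continuation` and `truncate_utf8` are hypothetical, and the sketch assumes the input is valid UTF-8 to begin with.

```python
def is_continuation(byte: int) -> bool:
    """UTF-8 continuation bytes have the bit pattern 0b10xxxxxx."""
    return (byte & 0xC0) == 0x80


def truncate_utf8(data: bytes, max_bytes: int) -> bytes:
    """Cut `data` to at most `max_bytes` bytes without splitting a character.

    Hypothetical helper for illustration; assumes `data` is valid UTF-8.
    """
    if max_bytes <= 0:
        return b""
    if len(data) <= max_bytes:
        return data
    end = max_bytes
    # If the first dropped byte is a continuation byte, the cut falls inside
    # a multi-byte character, so back up to that character's lead byte.
    while end > 0 and is_continuation(data[end]):
        end -= 1
    return data[:end]
```

With the two-character Cyrillic string "аб" (4 bytes), a naive `data[:3]` leaves a dangling lead byte that no longer decodes as UTF-8, whereas `truncate_utf8(data, 3)` backs up and returns the 2 bytes of "а".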
Yes. In Parsoid it is also caused by bad range information in the DSR/TSR/selective serialization code: bogus offsets cause bad truncation. This didn't happen much in Parsoid/JS because our offsets were all in UTF-16 code units, so they would only trigger errors when they split a UTF-16 surrogate pair (very rare!). It happens much more often in Parsoid/PHP because PHP keeps all offsets in UTF-8 bytes, so a bogus offset has about a 50% chance (in text made of 2-byte characters) or a 66% chance (3-byte characters) of causing an encoding error on a non-Latin-script wiki.
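The 50% and 66% figures can be checked with a rough Python sketch. It counts what fraction of offsets into a string land inside a character, first as UTF-8 byte offsets and then as UTF-16 code-unit offsets. The function names are hypothetical and the sample strings are only assumptions chosen to show the 2-byte and 3-byte cases; none of this is Parsoid code.

```python
import struct


def utf8_split_fraction(text: str) -> float:
    """Fraction of byte offsets in [0, len) that fall inside a character."""
    data = text.encode("utf-8")
    if not data:
        return 0.0
    # An offset splits a character when the byte at that offset is a
    # continuation byte (bit pattern 0b10xxxxxx).
    bad = sum(1 for b in data if (b & 0xC0) == 0x80)
    return bad / len(data)


def utf16_split_fraction(text: str) -> float:
    """Fraction of UTF-16 code-unit offsets that fall inside a surrogate pair."""
    raw = text.encode("utf-16-le")
    if not raw:
        return 0.0
    units = struct.unpack("<%dH" % (len(raw) // 2), raw)
    # An offset splits a pair when the unit at that offset is a low surrogate.
    bad = sum(1 for u in units if 0xDC00 <= u <= 0xDFFF)
    return bad / len(units)
```

For Cyrillic text such as "реч" (2 bytes per character), `utf8_split_fraction` gives 0.5, and for 3-byte characters such as "語" it gives 2/3, while `utf16_split_fraction` is 0.0 for both because neither needs surrogate pairs.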
Change 637590 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/services/parsoid@master] Fix buggy DSR computation in section-template conflict resolution code
Change 637590 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Fix buggy DSR computation in section-template conflict resolution code
Change 646892 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.13.0-a19
Change 646892 merged by jenkins-bot:
[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.13.0-a19
Change 646771 had a related patch set uploaded (by C. Scott Ananian; owner: Subramanya Sastry):
[mediawiki/vendor@wmf/1.36.0-wmf.21] Bump wikimedia/parsoid to 0.13.0-a19
Change 646771 merged by jenkins-bot:
[mediawiki/vendor@wmf/1.36.0-wmf.21] Bump wikimedia/parsoid to 0.13.0-a19