
UTF-8 validity assertion failure
Closed, Resolved (Public)

Description

See transcript below:

[subbu@earth:~/work/wmf/parsoid] cat /tmp/wt
<poem>
{{lang-my-Mymr|ၶိူဝ်းႁဝ်ၶိူဝ်းရႃႇၸႃႇ}}
</poem>

[subbu@earth:~/work/wmf/parsoid] php bin/parse.php < /tmp/wt
Wikimedia\Assert\InvariantException from line 159 of /home/subbu/work/wmf/parsoid/vendor/wikimedia/assert/src/Assert.php: Invariant failed: Bad UTF-8 at end of string (3 byte sequence)
#0 /home/subbu/work/wmf/parsoid/src/Utils/PHPUtils.php(206): Wikimedia\Assert\Assert::invariant(false, 'Bad UTF-8 at en...')
#1 /home/subbu/work/wmf/parsoid/src/Tokens/SourceRange.php(81): Parsoid\Utils\PHPUtils::safeSubstr('<poem>\n{{lang-m...', 21, 63)
#2 /home/subbu/work/wmf/parsoid/src/Wt2Html/TT/TemplateHandler.php(891): Parsoid\Tokens\SourceRange->substr('<poem>\n{{lang-m...')
#3 /home/subbu/work/wmf/parsoid/src/Wt2Html/TT/TemplateHandler.php(760): Parsoid\Wt2Html\TT\TemplateHandler->getArgInfo(Array)
...

Without the <poem> extension wrapper, the assertion does not trigger.
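
The failure mode named in the exception message can be reproduced in isolation. The following is a minimal Python sketch, not Parsoid's code: `safe_substr` is a hypothetical stand-in for a byte-offset substring helper that checks its result is valid UTF-8, and the sample string and offsets are made up for illustration. It shows how a byte range that ends partway through a 3-byte character yields a string with bad UTF-8 at its end.

```python
def safe_substr(data: bytes, start: int, length: int) -> str:
    """Hypothetical helper: slice by BYTE offset, then require valid UTF-8.

    This is an illustrative analogue only; it does not reproduce the
    behavior or signature of Parsoid's PHPUtils::safeSubstr.
    """
    chunk = data[start:start + length]
    try:
        return chunk.decode("utf-8")
    except UnicodeDecodeError as err:
        raise ValueError(
            f"Bad UTF-8 in byte range [{start}, {start + length}): {err.reason}"
        ) from err


# U+1076 (a Myanmar-block letter) encodes to three bytes in UTF-8,
# so this 5-character string is 7 bytes long.
data = "ab\u1076cd".encode("utf-8")

# A range that ends on a character boundary decodes cleanly.
ok = safe_substr(data, 0, 5)

# A range that ends one byte early cuts the 3-byte sequence in half.
try:
    safe_substr(data, 0, 4)
    truncated_error = None
except ValueError as err:
    truncated_error = str(err)

print(repr(ok))
print(truncated_error)
```

If the offsets handed to such a helper were computed against a different string than the one being sliced, they can land inside a multi-byte character even though the text itself is valid. Whether that is what happened in this task is not established by the transcript alone.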

Event Timeline

ssastry triaged this task as Medium priority. Sep 3 2019, 9:52 PM
ssastry moved this task from Backlog to Bugs, Notices, Crashers on the Parsoid-PHP board.

Change 536316 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/services/parsoid@master] WIP: clear invalid DSRs in <poem>

https://gerrit.wikimedia.org/r/536316

Change 536316 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Clear invalid DSRs in <poem>

https://gerrit.wikimedia.org/r/536316