Page MenuHomePhabricator

RemexHTML crasher
Closed, ResolvedPublic

Description

On https://www.mediawiki.org/wiki/Topic:U453n0lb207odpid, someone reported a test case that crashes Remex. Here is a slightly different test case:

<div><i><b>foo<b>
<b>foo</i>

When run locally, I get this stack trace:

[535b3bb0d1bfa6a399c0f92d] [no req]   RemexHtml\TreeBuilder\TreeBuilderError from line 304 of /home/subbu/work/wmf/mediawiki/vendor/wikimedia/remex-html/RemexHtml/TreeBuilder/CachingStack.php: RemexHtml\TreeBuilder\CachingStack::replace: cannot find old element in scope cache
Backtrace:
#0 /home/subbu/work/wmf/mediawiki/vendor/wikimedia/remex-html/RemexHtml/TreeBuilder/TreeBuilder.php(489): RemexHtml\TreeBuilder\CachingStack->replace(RemexHtml\TreeBuilder\Element, RemexHtml\TreeBuilder\Element)
#1 /home/subbu/work/wmf/mediawiki/vendor/wikimedia/remex-html/RemexHtml/TreeBuilder/InBody.php(621): RemexHtml\TreeBuilder\TreeBuilder->adoptionAgency(string, integer, integer)
#2 /home/subbu/work/wmf/mediawiki/vendor/wikimedia/remex-html/RemexHtml/TreeBuilder/Dispatcher.php(392): RemexHtml\TreeBuilder\InBody->endTag(string, integer, integer)
#3 /home/subbu/work/wmf/mediawiki/vendor/wikimedia/remex-html/RemexHtml/Tokenizer/Tokenizer.php(1305): RemexHtml\TreeBuilder\Dispatcher->endTag(string, integer, integer)
#4 /home/subbu/work/wmf/mediawiki/vendor/wikimedia/remex-html/RemexHtml/Tokenizer/Tokenizer.php(488): RemexHtml\Tokenizer\Tokenizer->handleAttribsAndClose(integer, string, boolean, integer)
#5 /home/subbu/work/wmf/mediawiki/vendor/wikimedia/remex-html/RemexHtml/Tokenizer/Tokenizer.php(310): RemexHtml\Tokenizer\Tokenizer->dataState(boolean)
#6 /home/subbu/work/wmf/mediawiki/vendor/wikimedia/remex-html/RemexHtml/Tokenizer/Tokenizer.php(152): RemexHtml\Tokenizer\Tokenizer->executeInternal(boolean)
#7 /home/subbu/work/wmf/mediawiki/includes/tidy/RemexDriver.php(53): RemexHtml\Tokenizer\Tokenizer->execute(array)
#8 /home/subbu/work/wmf/mediawiki/includes/parser/MWTidy.php(52): MediaWiki\Tidy\RemexDriver->tidy(string)
#9 /home/subbu/work/wmf/mediawiki/includes/parser/Parser.php(1381): MWTidy::tidy(string)
#10 /home/subbu/work/wmf/mediawiki/includes/parser/Parser.php(454): Parser->internalParseHalfParsed(string, boolean, boolean)
#11 /home/subbu/work/wmf/mediawiki/maintenance/parse.php(138): Parser->parse(string, Title, ParserOptions)
#12 /home/subbu/work/wmf/mediawiki/maintenance/parse.php(85): CLIParser->parse(string)
#13 /home/subbu/work/wmf/mediawiki/maintenance/parse.php(77): CLIParser->render(string)
#14 /home/subbu/work/wmf/mediawiki/maintenance/doMaintenance.php(94): CLIParser->execute()
#15 /home/subbu/work/wmf/mediawiki/maintenance/parse.php(144): require_once(string)
#16 {main}

Relatedly, on itwiki, this oldid crashes: https://it.wikipedia.org/wiki/Discussioni_utente:Nicelotus~itwiki?oldid=72298949 which looks like an instance of the same crasher above.

Event Timeline

ssastry triaged this task as High priority.Dec 20 2017, 3:51 PM

https://logstash.wikimedia.org/goto/d34dc5529f03feea26b0d80d8e68e2d0 is the logstash dashboard snapshot for last 30 days .. it looks like all the crasher instances are from last 24 hours and primarily from sandbox testing and the itwiki page.

I took a quick peek at the code. This looks like a bug in the CachingScopeStack possibly. If I use the SimpleStack (or if I comment out the exception throw in CachingScopeStack), the code runs to completion and generates the same output as domino or what I can see in the browser.

Change 399760 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/libs/RemexHtml@master] Fix linked list manipulation in CachedScopeStack

https://gerrit.wikimedia.org/r/399760

Change 399760 merged by jenkins-bot:
[mediawiki/libs/RemexHtml@master] Fix linked list manipulation in CachedScopeStack

https://gerrit.wikimedia.org/r/399760

Change 401412 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/libs/RemexHtml@master] Add test case for T183379

https://gerrit.wikimedia.org/r/401412

Change 401412 merged by jenkins-bot:
[mediawiki/libs/RemexHtml@master] Add test case for T183379

https://gerrit.wikimedia.org/r/401412

Change 401416 had a related patch set uploaded (by Tim Starling; owner: Tim Starling):
[mediawiki/vendor@master] Update wikimedia/remex-html to 1.0.2

https://gerrit.wikimedia.org/r/401416

Change 401416 merged by jenkins-bot:
[mediawiki/vendor@master] Update wikimedia/remex-html to 1.0.2

https://gerrit.wikimedia.org/r/401416

This will go out with the mediawiki train this week.