
Parsoid crashes when wikilink is embedded in a wikilink that in turn comes from a template
Closed, Resolved | Public | PRODUCTION ERROR

Description

Stack Trace:
from /srv/mediawiki/php-1.38.0-wmf.17/vendor/wikimedia/parsoid/src/Wt2Html/TT/WikiLinkHandler.php(332)
#0 /srv/mediawiki/php-1.38.0-wmf.17/vendor/wikimedia/parsoid/src/Wt2Html/TT/WikiLinkHandler.php(332): MWExceptionHandler::handleError(integer, string, string, integer, array)
#1 /srv/mediawiki/php-1.38.0-wmf.17/vendor/wikimedia/parsoid/src/Wt2Html/TT/WikiLinkHandler.php(1720): Wikimedia\Parsoid\Wt2Html\TT\WikiLinkHandler->onWikiLink(Wikimedia\Parsoid\Tokens\TagTk)
#2 /srv/mediawiki/php-1.38.0-wmf.17/vendor/wikimedia/parsoid/src/Wt2Html/TT/TokenHandler.php(154): Wikimedia\Parsoid\Wt2Html\TT\WikiLinkHandler->onTag(Wikimedia\Parsoid\Tokens\TagTk)
#3 /srv/mediawiki/php-1.38.0-wmf.17/vendor/wikimedia/parsoid/src/Wt2Html/TokenTransformManager.php(109): Wikimedia\Parsoid\Wt2Html\TT\TokenHandler->process(array)
#4 /srv/mediawiki/php-1.38.0-wmf.17/vendor/wikimedia/parsoid/src/Wt2Html/TokenTransformManager.php(153): Wikimedia\Parsoid\Wt2Html\TokenTransformManager->processChunk(array)
#5 /srv/mediawiki/php-1.38.0-wmf.17/vendor/wikimedia/parsoid/src/Wt2Html/TokenTransformManager.php(151): Wikimedia\Parsoid\Wt2Html\TokenTransformManager->processChunkily(string, array)
#6 /srv/mediawiki/php-1.38.0-wmf.17/vendor/wikimedia/parsoid/src/Wt2Html/TreeBuilder/TreeBuilderStage.php(487): Wikimedia\Parsoid\Wt2Html\TokenTransformManager->processChunkily(string, array)
#7 [internal function]: Wikimedia\Parsoid\Wt2Html\TreeBuilder\TreeBuilderStage->processChunkily(string, array)
#8 /srv/mediawiki/php-1.38.0-wmf.17/vendor/wikimedia/parsoid/src/Wt2Html/DOMPostProcessor.php(1041): Generator->current()
#9 /srv/mediawiki/php-1.38.0-wmf.17/vendor/wikimedia/parsoid/src/Wt2Html/ParserPipeline.php(180): Wikimedia\Parsoid\Wt2Html\DOMPostProcessor->processChunkily(string, array)
#10 /srv/mediawiki/php-1.38.0-wmf.17/vendor/wikimedia/parsoid/src/Wt2Html/ParserPipelineFactory.php(308): Wikimedia\Parsoid\Wt2Html\ParserPipeline->parseChunkily(string, array)
#11 /srv/mediawiki/php-1.38.0-wmf.17/vendor/wikimedia/parsoid/src/Core/WikitextContentModelHandler.php(106): Wikimedia\Parsoid\Wt2Html\ParserPipelineFactory->parse(string)
#12 /srv/mediawiki/php-1.38.0-wmf.17/vendor/wikimedia/parsoid/src/Parsoid.php(166): Wikimedia\Parsoid\Core\WikitextContentModelHandler->toDOM(Wikimedia\Parsoid\Config\Env)
#13 /srv/mediawiki/php-1.38.0-wmf.17/vendor/wikimedia/parsoid/src/Parsoid.php(198): Wikimedia\Parsoid\Parsoid->parseWikitext(MWParsoid\Config\PageConfig, array)
#14 /srv/mediawiki/php-1.38.0-wmf.17/vendor/wikimedia/parsoid/extension/src/Rest/Handler/ParsoidHandler.php(584): Wikimedia\Parsoid\Parsoid->wikitext2html(MWParsoid\Config\PageConfig, array, NULL)
#15 /srv/mediawiki/php-1.38.0-wmf.17/vendor/wikimedia/parsoid/extension/src/Rest/Handler/PageHandler.php(88): MWParsoid\Rest\Handler\ParsoidHandler->wt2html(MWParsoid\Config\PageConfig, array)
#16 /srv/mediawiki/php-1.38.0-wmf.17/includes/Rest/Router.php(414): MWParsoid\Rest\Handler\PageHandler->execute()
#17 /srv/mediawiki/php-1.38.0-wmf.17/includes/Rest/Router.php(338): MediaWiki\Rest\Router->executeHandler(MWParsoid\Rest\Handler\PageHandler)
#18 /srv/mediawiki/php-1.38.0-wmf.17/includes/Rest/EntryPoint.php(167): MediaWiki\Rest\Router->execute(MediaWiki\Rest\RequestFromGlobals)
#19 /srv/mediawiki/php-1.38.0-wmf.17/includes/Rest/EntryPoint.php(132): MediaWiki\Rest\EntryPoint->execute()
#20 /srv/mediawiki/php-1.38.0-wmf.17/rest.php(31): MediaWiki\Rest\EntryPoint::main()
#21 /srv/mediawiki/w/rest.php(3): require(string)
#22 {main}

Details

Request URL
https://en.wikipedia.org/w/rest.php/en.wikipedia.org/v3/page/pagebundle/Wikipedia%3AWikiProject_Military_history%2FAssessment%2F2009%2FFailed/405916441

Event Timeline

Krinkle renamed this task from PHP Notice: Trying to get property 'v' of non-object to PHP Notice: Trying to get property 'v' of non-object (from Parsoid WikiLinkHandler). Jan 14 2022, 4:09 PM
Krinkle updated the task description.
Krinkle moved this task from Untriaged to Jan 2022 on the Wikimedia-production-error board.

The following works just fine:

[subbu@earth:~/work/wmf/parsoid] php bin/parse.php --pageName 'Wikipedia:WikiProject Military history/Assessment/K. Subrahmanyam' < /dev/null > /dev/null

But, the crasher is reproducible with:

[subbu@earth:~/work/wmf/parsoid] echo '{{Wikipedia:WikiProject Military history/Assessment/K. Subrahmanyam}}' | php bin/parse.php
PHP Notice:  Trying to get property 'v' of non-object in /home/subbu/work/wmf/parsoid/src/Wt2Html/TT/WikiLinkHandler.php on line 332
Wikimedia\Assert\InvariantException from line 224 of /home/subbu/work/wmf/parsoid/vendor/wikimedia/assert/src/Assert.php: Invariant failed: No nulls expected.
#0 /home/subbu/work/wmf/parsoid/src/Utils/PHPUtils.php(390): Wikimedia\Assert\Assert::invariant()
...

819630e57 is the first patch that triggers this, but that might just be exposing a latent bug.

This non-crashing snippet demonstrates the problem.

[subbu@earth:~/work/wmf/parsoid] echo '[[http://www.alexa.com/topsites/countries/IN| [[Rediff.com]]]]' | php bin/parse.php
<p data-parsoid='{"dsr":[0,62,0,0]}'>[<a rel="mw:ExtLink" href="http://www.alexa.com/topsites/countries/IN%7C" class="external text" data-parsoid='{"a":{"href":"http://www.alexa.com/topsites/countries/IN%7C"},"sa":{"href":"http://www.alexa.com/topsites/countries/IN|"},"dsr":[1,61,45,1]}'><wikilink data-parsoid='{"src":"[[Rediff.com]]","a":{"href":null},"sa":{"href":"Rediff.com"},"dsr":[46,60,null,null]}'></wikilink></a>]</p>

The wikilink-in-wikilink embedding messes up the pipeline, and the embedded wikilink doesn't get expanded. That then trips up later passes when this markup is embedded in a transclusion:

[subbu@earth:~/work/wmf/parsoid] echo '{{1x|[[http://www.alexa.com/topsites/countries/IN| [[Rediff.com]]]]}}' | php bin/parse.php
PHP Notice:  Trying to get property 'v' of non-object in /home/subbu/work/wmf/parsoid/src/Wt2Html/TT/WikiLinkHandler.php on line 332
...

This is triggered only after 819630e57 because the changes to the treebuilder seem to emit unknown tokens in XML form. So that patch just exposed an existing bug.
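
As a rough illustration of the failure mode only, here is a toy model in Python rather than Parsoid's PHP. `KV`, `Token`, `get_attribute_kv`, `href_unguarded` and `href_guarded` are all hypothetical names invented for this sketch; it assumes the notice comes from reading `v` on the result of an attribute lookup that found nothing, which is what the `"href":null` in the output above hints at but does not prove.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class KV:
    """Toy key/value attribute pair (hypothetical, not Parsoid's class)."""
    k: str
    v: object


@dataclass
class Token:
    """Toy tag token carrying a list of attributes (hypothetical)."""
    name: str
    attribs: List[KV] = field(default_factory=list)

    def get_attribute_kv(self, key: str) -> Optional[KV]:
        # Return the first matching attribute, or None when it is absent.
        for kv in self.attribs:
            if kv.k == key:
                return kv
        return None


def href_unguarded(token: Token) -> object:
    # Reads .v without checking the lookup result; raises AttributeError
    # when the token has no 'href' attribute.
    return token.get_attribute_kv("href").v


def href_guarded(token: Token) -> Optional[object]:
    # Checks the lookup result first and reports a missing href as None.
    kv = token.get_attribute_kv("href")
    if kv is None:
        return None
    return kv.v
```

A guard like `href_guarded` would only silence the notice in this toy model; it would not address the underlying problem that the embedded wikilink token never gets expanded.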

ssastry renamed this task from PHP Notice: Trying to get property 'v' of non-object (from Parsoid WikiLinkHandler) to Parsoid crashes when wikilink is embedded in a wikilink that in turn comes from a template. Jan 14 2022, 7:21 PM

This is broken markup of course, but Parsoid shouldn't crash.

ssastry triaged this task as Medium priority. Jan 14 2022, 7:22 PM

Change 755039 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/services/parsoid@master] WIP: Ensure tokens are always fully processed

https://gerrit.wikimedia.org/r/755039

Change 755039 merged by jenkins-bot:

[mediawiki/services/parsoid@master] WikiLinkHandler: Reprocess incorrect tokens to the right stage

https://gerrit.wikimedia.org/r/755039

Change 756638 had a related patch set uploaded (by Sbailey; author: Sbailey):

[mediawiki/vendor@master] Bump Parsoid to 0.15.0-a17

https://gerrit.wikimedia.org/r/756638

Change 756638 merged by jenkins-bot:

[mediawiki/vendor@master] Bump Parsoid to 0.15.0-a17

https://gerrit.wikimedia.org/r/756638

ssastry claimed this task.