html2wt: TypeError in ConstrainedText.php
Closed, Resolved (Public)

Description

Found just a single instance of this error in round-trip (rt) testing.

Exception msg:

Argument 1 passed to Parsoid\Html2Wt\ConstrainedText\ConstrainedText::fromSelSer() must be of the type string, boolean given, called in /srv/deployment/parsoid/deploy/src/src/Html2Wt/ConstrainedText/ConstrainedText.php on line 292

Exception url:

/w/rest.php/ar.wikipedia.org/v3/transform/pagebundle/to/wikitext/%D8%B1%D8%A7%D9%85%D8%B2%20%D8%AC%D9%84%D8%A7%D9%84/38325396

Exception trace:

#0 /srv/deployment/parsoid/deploy/src/src/Html2Wt/ConstrainedText/ConstrainedText.php(292): Parsoid\Html2Wt\ConstrainedText\ConstrainedText::fromSelSer(boolean, DOMElement, stdClass, Parsoid\Config\Env, array)
#1 [internal function]: Parsoid\Html2Wt\ConstrainedText\ConstrainedText::fromSelSerImpl(string, DOMElement, stdClass, Parsoid\Config\Env, array)
#2 /srv/deployment/parsoid/deploy/src/src/Html2Wt/ConstrainedText/ConstrainedText.php(208): call_user_func(array, string, DOMElement, stdClass, Parsoid\Config\Env, array)
#3 /srv/deployment/parsoid/deploy/src/src/Html2Wt/WikitextSerializer.php(1233): Parsoid\Html2Wt\ConstrainedText\ConstrainedText::fromSelSer(string, DOMElement, stdClass, Parsoid\Config\Env)
#4 [internal function]: Parsoid\Html2Wt\WikitextSerializer->serializeDOMNode(DOMElement, Parsoid\Html2Wt\DOMHandlers\ListHandler)
#5 /srv/deployment/parsoid/deploy/src/src/Html2Wt/WikitextSerializer.php(1373): call_user_func(array, DOMElement, Parsoid\Html2Wt\DOMHandlers\ListHandler)
#6 /srv/deployment/parsoid/deploy/src/src/Html2Wt/SerializerState.php(667): Parsoid\Html2Wt\WikitextSerializer->serializeNode(DOMElement)
#7 /srv/deployment/parsoid/deploy/src/src/Html2Wt/SerializerState.php(690): Parsoid\Html2Wt\SerializerState->serializeChildren(DOMElement, NULL)
#8 /srv/deployment/parsoid/deploy/src/src/Html2Wt/WikitextSerializer.php(1646): Parsoid\Html2Wt\SerializerState->kickOffSerialize(DOMElement)
#9 /srv/deployment/parsoid/deploy/src/src/Html2Wt/SelectiveSerializer.php(117): Parsoid\Html2Wt\WikitextSerializer->serializeDOM(DOMElement, boolean)
#10 /srv/deployment/parsoid/deploy/src/src/WikitextContentModelHandler.php(100): Parsoid\Html2Wt\SelectiveSerializer->serializeDOM(DOMElement)
#11 /srv/deployment/parsoid/deploy/src/src/Parsoid.php(149): Parsoid\WikitextContentModelHandler->fromHTML(Parsoid\Config\Env, DOMDocument, Parsoid\SelserData)
#12 /srv/deployment/parsoid/deploy/src/extension/src/Rest/Handler/ParsoidHandler.php(692): Parsoid\Parsoid->html2wikitext(MWParsoid\Config\PageConfig, Parsoid\PageBundle, array, Parsoid\SelserData)
#13 /srv/deployment/parsoid/deploy/src/extension/src/Rest/Handler/TransformHandler.php(78): MWParsoid\Rest\Handler\ParsoidHandler->html2wt(Parsoid\Config\Env, array, string)
#14 /srv/mediawiki/php-1.34.0-wmf.23/includes/Rest/Router.php(307): MWParsoid\Rest\Handler\TransformHandler->execute()
#15 /srv/mediawiki/php-1.34.0-wmf.23/includes/Rest/Router.php(286): MediaWiki\Rest\Router->executeHandler(MWParsoid\Rest\Handler\TransformHandler)
#16 /srv/mediawiki/php-1.34.0-wmf.23/includes/Rest/EntryPoint.php(112): MediaWiki\Rest\Router->execute(MediaWiki\Rest\RequestFromGlobals)
#17 /srv/mediawiki/php-1.34.0-wmf.23/includes/Rest/EntryPoint.php(79): MediaWiki\Rest\EntryPoint->execute()
#18 /srv/mediawiki/php-1.34.0-wmf.23/rest.php(31): MediaWiki\Rest\EntryPoint::main()
#19 /srv/mediawiki/w/rest.php(3): require(string)
#20 {main}

Event Timeline

ssastry triaged this task as Medium priority. Sep 25 2019, 2:06 PM
ssastry moved this task from Backlog to Bugs, Notices, Crashers on the Parsoid-PHP board.
			DOMUtils::assertElt( $lastChild ); // implied by $lastChildDp
			$len = $lastChildDp->dsr->length();
			$suffixChunks = self::fromSelSer(
				substr( $text, -$len ), $lastChild, $lastChildDp, $env,
				// this child node's left context will be protected:
				[ 'ignorePrefix' => true ]
			);

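In PHP versions before 8.0, the substr() function returns false on failure (for example, when the start offset lies beyond the end of the string), which seems to be what's happening here: the false value is then passed as the first argument to fromSelSer(), which requires a string. Probably ultimately a bad DSR in $lastChildDp->dsr.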
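
To make the failure mode concrete, here is a minimal sketch of a guard that validates the length before slicing. The helper name suffixOrNull() is hypothetical and is not part of Parsoid, and this is not necessarily what the merged fix does. It only assumes that a bad DSR shows up as a length that is negative or larger than the text. Checking the range explicitly avoids relying on substr()'s return value, which differs between PHP versions (false before 8.0, an empty string from 8.0 onward).

```php
<?php
declare( strict_types = 1 );

/**
 * Hypothetical helper (not part of Parsoid): return the last $len bytes of
 * $text, or null when $len cannot describe a suffix of $text.
 */
function suffixOrNull( string $text, int $len ): ?string {
	// A negative or oversized length is treated as a sign of a bad DSR;
	// reject it up front instead of inspecting substr()'s return value.
	if ( $len < 0 || $len > strlen( $text ) ) {
		return null;
	}
	// substr( $text, -0 ) would return the whole string, so handle 0 explicitly.
	return $len === 0 ? '' : substr( $text, -$len );
}
```

With this sketch, suffixOrNull( 'abc', 2 ) yields 'bc', while suffixOrNull( 'abc', -1 ) and suffixOrNull( 'abc', 4 ) yield null. A caller could then skip the suffix handling, or fall back to non-selective serialization, when it gets null.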

Arlolra subscribed.

/w/rest.php/es.wikipedia.org/v3/transform/pagebundle/to/wikitext/Esterilizaci%C3%B3n_(microbiolog%C3%ADa)/123778395

ssastry raised the priority of this task from Medium to High. May 13 2020, 8:16 PM

In case an enwiki page makes it easier to debug, here is one from rt testing:

ssastry@scandium:/srv/parsoid-testing$ node bin/roundtrip-test.js --proxyURL http://scandium.eqiad.wmnet:80 --parsoidURL http://DOMAIN/w/rest.php --domain en.wikipedia.org "1970 FA Cup Final" --oldid 956878467
Parser failure!

----------------------------------------------------------------------
Error: Got status code: 500; ...

https://en.wikipedia.org/w/index.php?title=User:SSastry_(WMF)/sandbox&oldid=958030712 is the minimal test that reproduces the bug:

{|
*x
*y
|}

On scandium:

ssastry@scandium:/srv/parsoid-testing$ node bin/roundtrip-test.js --proxyURL http://scandium.eqiad.wmnet:80 --parsoidURL http://DOMAIN/w/rest.php --domain en.wikipedia.org "User:SSastry (WMF)/sandbox" --oldid 958030712
Parser failure!

----------------------------------------------------------------------
Error: Got status code: 500;  ...

Change 597889 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/services/parsoid@master] WIP: Don't crash on bad DSR

https://gerrit.wikimedia.org/r/597889

Change 597889 merged by jenkins-bot:
[mediawiki/services/parsoid@master] html2wt in selser mode: Don't crash on bad DSR

https://gerrit.wikimedia.org/r/597889

Change 603571 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/vendor@master] Bump Parsoid to v0.12.0-a16

https://gerrit.wikimedia.org/r/603571

Change 603571 merged by jenkins-bot:
[mediawiki/vendor@master] Bump Parsoid to v0.12.0-a16

https://gerrit.wikimedia.org/r/603571