PHP Notice: Trying to get property 'parsoid' of non-object
Closed, Resolved · Public · PRODUCTION ERROR



It looks like $out['pb'] is coming out as null or some other non-object value, which causes other errors downstream.

PHP Notice: Trying to get property 'parsoid' of non-object


Request ID
Request URL
Stack Trace
#0 /srv/deployment/parsoid/deploy-cache/revs/aa59ce3d0aa035504666a63c99667398d0ea1928/src/src/Parsoid.php(130): MWExceptionHandler::handleError(integer, string, string, integer, array)
#1 /srv/deployment/parsoid/deploy-cache/revs/aa59ce3d0aa035504666a63c99667398d0ea1928/src/extension/src/Rest/Handler/ParsoidHandler.php(591): Parsoid\Parsoid->wikitext2html(MWParsoid\Config\PageConfig, array, array)
#2 /srv/deployment/parsoid/deploy-cache/revs/aa59ce3d0aa035504666a63c99667398d0ea1928/src/extension/src/Rest/Handler/PageHandler.php(47): MWParsoid\Rest\Handler\ParsoidHandler->wt2html(Parsoid\Config\Env, array)
#3 /includes/Rest/Router.php(315): MWParsoid\Rest\Handler\PageHandler->execute()
#4 /includes/Rest/Router.php(285): MediaWiki\Rest\Router->executeHandler(MWParsoid\Rest\Handler\PageHandler)
#5 /includes/Rest/EntryPoint.php(116): MediaWiki\Rest\Router->execute(MediaWiki\Rest\RequestFromGlobals)
#6 /includes/Rest/EntryPoint.php(83): MediaWiki\Rest\EntryPoint->execute()
#7 /rest.php(31): MediaWiki\Rest\EntryPoint::main()
#8 /srv/mediawiki/w/rest.php(3): require(string)
#9 {main}

Event Timeline

ssastry created this task. Oct 30 2019, 3:26 AM
Restricted Application added a subscriber: Aklapper. Oct 30 2019, 3:26 AM
ssastry triaged this task as Medium priority. Oct 30 2019, 3:27 AM
Arlolra claimed this task. Nov 15 2019, 7:10 PM

If I add JSON_THROW_ON_ERROR to PHPUtils::jsonEncode, we get the following when trying to stringify the pagebundle:

src/Utils/PHPUtils.php: Malformed UTF-8 characters, possibly incorrectly encoded

An isolated test case is [[|[[파일:인스타그램 아이콘.png|width=24]]]]
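
A minimal sketch of why the flag matters, assuming PHP 7.3 or later (where JSON_THROW_ON_ERROR and JsonException are available). By default json_encode returns false on invalid UTF-8, which a caller can easily pass along as if it were a missing value; with the flag the same failure is thrown. The $sa array and the truncated string below are illustrative stand-ins, not Parsoid's actual pagebundle.

```php
<?php
// Illustrative only: cut a UTF-8 string in the middle of a multibyte character.
// "파일" is two Hangul syllables, three bytes each in UTF-8.
$truncated = substr( '파일', 0, 4 ); // one full syllable plus one stray lead byte

// Hypothetical stand-in for a data-parsoid entry.
$sa = [ 'href' => $truncated ];

// Default behaviour: the failure is silent and the result is false.
$silent = json_encode( $sa );
var_dump( $silent );              // bool(false)
echo json_last_error_msg(), "\n"; // Malformed UTF-8 characters, possibly incorrectly encoded

// With JSON_THROW_ON_ERROR the same failure surfaces as an exception.
$caught = null;
try {
	json_encode( $sa, JSON_THROW_ON_ERROR );
} catch ( JsonException $e ) {
	$caught = $e;
	echo 'JsonException: ', $e->getMessage(), "\n";
}
```

A silent false here would fit the original notice, since later code would then try to read a property from something that is not an object.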

cscott added a subscriber: cscott. Edited Nov 15 2019, 8:40 PM

That test case is interesting:

$ echo '[[|[[파일:인스타그램 아이콘.png|width=24]]]]' | php bin/parse.php  --domain --body_only
<p data-parsoid='{"dsr":[0,93,0,0]}'>[<a rel="mw:ExtLink" href="파일:인스타그램" class="external text" data-parsoid="">아이콘.png</a>]</p>
$ echo '[[|[[파일:인스타그램 아이콘.png|width=24]]]]' | bin/parse.js  --domain --body_only
<p data-parsoid='{"dsr":[0,73,0,0]}'>[<a rel="mw:ExtLink" href="파일:인스타그램" class="external text" data-parsoid='{"a":{"href":"파일:인스타그램"},"sa":{"href":"|[[파일:인스타"},"dsr":[1,59,50,1]}'>아이콘.png</a>]</p>

From legacy parser, with $wgLanguageCode='ko':

$ echo '[[|[[파일:인스타그램 아이콘.png|width=24]]]]' | php maintenance/parse.php 
parse.php: warning: reading wikitext from STDIN. Press CTRL+D to parse.

<p>[<a rel="nofollow" class="external text" href=""></a><a href="/~cananian/mediawiki/index.php?title=%ED%8A%B9%EC%88%98:%EC%98%AC%EB%A6%AC%EA%B8%B0&amp;wpDestFile=%EC%9D%B8%EC%8A%A4%ED%83%80%EA%B7%B8%EB%9E%A8_%EC%95%84%EC%9D%B4%EC%BD%98.png" class="new" title="파일:인스타그램 아이콘.png">width=24</a>]

So data-parsoid in Parsoid/PHP is being omitted, presumably due to an exception during JSON serialization, which @Arlolra found. The root cause, though, is that the data-parsoid sa property is being inappropriately truncated; the a property seems to be correct.
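
To illustrate how a truncated value can turn into a serialization failure, here is a small sketch; it is not Parsoid's actual code. PHP strings are byte strings, so a length counted in characters and then applied with the byte-based substr can cut a multibyte character in half. The $href value is taken from the test case; using its character count as a byte length is a hypothetical mistake chosen for demonstration, and the mb_* functions assume the mbstring extension is loaded.

```php
<?php
// Value taken from the test case; everything else here is illustrative.
$href = '|[[파일:인스타그램';

$chars = mb_strlen( $href, 'UTF-8' ); // length in characters
$bytes = strlen( $href );             // length in bytes (larger, due to Hangul)
echo "$chars characters, $bytes bytes\n";

// Mixing the two units: a character count used as a byte length.
$cut = substr( $href, 0, $chars );
var_dump( mb_check_encoding( $cut, 'UTF-8' ) ); // false: the cut lands mid-character

// Staying in one unit keeps the string intact.
$safe = mb_substr( $href, 0, $chars, 'UTF-8' );
var_dump( $safe === $href ); // true
```

The same byte-versus-character gap would be consistent with the dsr end offsets above (93 in PHP, 73 in JS): the test case contains 10 Hangul syllables, each taking three bytes in UTF-8 but one UTF-16 code unit, for a difference of 20.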

Both of them differ from the legacy parser, but that might be because it appears that just setting $wgLanguageCode on my localhost is not enough to get it to recognize the localized namespace? Could be something else going on, too...

// NOTE: Tokenizing this as src seems little suspect


And, indeed, it is.

You should be able to use the TSR to get the appropriate region of the original wikitext and re-tokenize that, instead of trying to reconstruct it.
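
A rough sketch of that suggestion, under stated assumptions: the TSR is modelled as a plain [start, end] pair of byte offsets into the original wikitext, and sourceForTsr is a hypothetical helper, not a Parsoid API. The point is only that slicing the source with offsets in the same unit as the string returns the original text verbatim, so nothing has to be rebuilt from tokens.

```php
<?php
/**
 * Hypothetical helper: return the wikitext covered by a [start, end) byte range.
 */
function sourceForTsr( string $wikitext, array $tsr ): string {
	[ $start, $end ] = $tsr;
	return substr( $wikitext, $start, $end - $start );
}

$wikitext = '[[|[[파일:인스타그램 아이콘.png|width=24]]]]';

// Illustrative offsets: in a real tokenizer these would come from the token
// itself; here they are found by searching the source for the link target.
$needle = '파일:인스타그램 아이콘.png';
$start = strpos( $wikitext, $needle ); // byte offset
$tsr = [ $start, $start + strlen( $needle ) ];

$src = sourceForTsr( $wikitext, $tsr );
var_dump( $src === $needle );                   // true
var_dump( mb_check_encoding( $src, 'UTF-8' ) ); // true
```

The recovered region can then be handed back to the tokenizer as-is.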

Change 551272 had a related patch set uploaded (by Arlolra; owner: Arlolra):
[mediawiki/services/parsoid@master] [WIP] Ko

Change 551272 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Use frame source instead of stringifying tokens

Arlolra closed this task as Resolved. Nov 15 2019, 11:08 PM