Failed invariant in Linter
Closed, Resolved | Public | PRODUCTION ERROR

Description

Error
normalized_message
[{reqId}] {exception_url}   Wikimedia\Assert\InvariantException: Invariant failed: Expected an element
exception.trace
from /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/assert/src/Assert.php(231)
#0 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Utils/DOMUtils.php(136): Wikimedia\Assert\Assert::invariant(boolean, string)
#1 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Ext/DOMUtils.php(64): Wikimedia\Parsoid\Utils\DOMUtils::assertElt(NULL)
#2 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Ext/Cite/Ref.php(73): Wikimedia\Parsoid\Ext\DOMUtils::assertElt(NULL)
#3 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Wt2Html/PP/Processors/Linter.php(1321): Wikimedia\Parsoid\Ext\Cite\Ref->lintHandler(Wikimedia\Parsoid\Ext\ParsoidExtensionAPI, Wikimedia\Parsoid\DOM\Element, Closure)
#4 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Wt2Html/PP/Processors/Linter.php(1334): Wikimedia\Parsoid\Wt2Html\PP\Processors\Linter->findLints(Wikimedia\Parsoid\DOM\Element, Wikimedia\Parsoid\Config\Env, stdClass)
#5 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Wt2Html/PP/Processors/Linter.php(1334): Wikimedia\Parsoid\Wt2Html\PP\Processors\Linter->findLints(Wikimedia\Parsoid\DOM\Element, Wikimedia\Parsoid\Config\Env, stdClass)
#6 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Wt2Html/PP/Processors/Linter.php(1334): Wikimedia\Parsoid\Wt2Html\PP\Processors\Linter->findLints(Wikimedia\Parsoid\DOM\Element, Wikimedia\Parsoid\Config\Env, stdClass)
#7 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Wt2Html/PP/Processors/Linter.php(1334): Wikimedia\Parsoid\Wt2Html\PP\Processors\Linter->findLints(Wikimedia\Parsoid\DOM\Element, Wikimedia\Parsoid\Config\Env, stdClass)
#8 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Wt2Html/PP/Processors/Linter.php(1334): Wikimedia\Parsoid\Wt2Html\PP\Processors\Linter->findLints(Wikimedia\Parsoid\DOM\Element, Wikimedia\Parsoid\Config\Env, stdClass)
#9 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Wt2Html/PP/Processors/Linter.php(1334): Wikimedia\Parsoid\Wt2Html\PP\Processors\Linter->findLints(Wikimedia\Parsoid\DOM\Element, Wikimedia\Parsoid\Config\Env, stdClass)
#10 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Wt2Html/PP/Processors/Linter.php(1334): Wikimedia\Parsoid\Wt2Html\PP\Processors\Linter->findLints(Wikimedia\Parsoid\DOM\Element, Wikimedia\Parsoid\Config\Env, stdClass)
#11 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Wt2Html/PP/Processors/Linter.php(1368): Wikimedia\Parsoid\Wt2Html\PP\Processors\Linter->findLints(Wikimedia\Parsoid\DOM\Element, Wikimedia\Parsoid\Config\Env)
#12 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Wt2Html/DOMPostProcessor.php(159): Wikimedia\Parsoid\Wt2Html\PP\Processors\Linter->run(Wikimedia\Parsoid\Config\Env, Wikimedia\Parsoid\DOM\Element, array, boolean)
#13 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Wt2Html/DOMPostProcessor.php(986): Wikimedia\Parsoid\Wt2Html\DOMPostProcessor->Wikimedia\Parsoid\Wt2Html\{closure}(Wikimedia\Parsoid\DOM\Element, array, boolean)
#14 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Wt2Html/DOMPostProcessor.php(1027): Wikimedia\Parsoid\Wt2Html\DOMPostProcessor->doPostProcess(Wikimedia\Parsoid\DOM\Element)
#15 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Wt2Html/DOMPostProcessor.php(1045): Wikimedia\Parsoid\Wt2Html\DOMPostProcessor->process(Wikimedia\Parsoid\DOM\Element)
#16 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Wt2Html/ParserPipeline.php(180): Wikimedia\Parsoid\Wt2Html\DOMPostProcessor->processChunkily(string, array)
#17 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Wt2Html/ParserPipelineFactory.php(308): Wikimedia\Parsoid\Wt2Html\ParserPipeline->parseChunkily(string, array)
#18 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Core/WikitextContentModelHandler.php(105): Wikimedia\Parsoid\Wt2Html\ParserPipelineFactory->parse(string)
#19 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Parsoid.php(166): Wikimedia\Parsoid\Core\WikitextContentModelHandler->toDOM(Wikimedia\Parsoid\Config\Env)
#20 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/src/Parsoid.php(198): Wikimedia\Parsoid\Parsoid->parseWikitext(MWParsoid\Config\PageConfig, array)
#21 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/extension/src/Rest/Handler/ParsoidHandler.php(584): Wikimedia\Parsoid\Parsoid->wikitext2html(MWParsoid\Config\PageConfig, array, NULL)
#22 /srv/mediawiki/php-1.38.0-wmf.20/vendor/wikimedia/parsoid/extension/src/Rest/Handler/PageHandler.php(88): MWParsoid\Rest\Handler\ParsoidHandler->wt2html(MWParsoid\Config\PageConfig, array)
#23 /srv/mediawiki/php-1.38.0-wmf.20/includes/Rest/Router.php(414): MWParsoid\Rest\Handler\PageHandler->execute()
#24 /srv/mediawiki/php-1.38.0-wmf.20/includes/Rest/Router.php(338): MediaWiki\Rest\Router->executeHandler(MWParsoid\Rest\Handler\PageHandler)
#25 /srv/mediawiki/php-1.38.0-wmf.20/includes/Rest/EntryPoint.php(167): MediaWiki\Rest\Router->execute(MediaWiki\Rest\RequestFromGlobals)
#26 /srv/mediawiki/php-1.38.0-wmf.20/includes/Rest/EntryPoint.php(132): MediaWiki\Rest\EntryPoint->execute()
#27 /srv/mediawiki/php-1.38.0-wmf.20/rest.php(31): MediaWiki\Rest\EntryPoint::main()
#28 /srv/mediawiki/w/rest.php(3): require(string)
#29 {main}

Event Timeline

This is still present on f34bf1865f2d6df9d0878a7bdad656b12c237953

Smaller reproducer that doesn't take ages to parse (and that uses one less template):

{|
|-
|{{Herbarium of Baltimore Woods specimen
|species-name=Humulus lupulus
|author-name=L.
|var-name=lupulus<ref>Could possibly be ''Humulus lupulus'' var. ''lupuloides'' (native northeastern American hops), but that's less likely.</ref>
|common-name=European hops, common hop
|nativity=Introduced from Eurasia
|ntz=Cultivated, naturalized
|nwi=FACU
|fny-link=Rhamnaceae_…_Urticaceae#Humulus
|image-name=Humulus lupulus var. lupulus BW-1979-1001-0438.jpg
|orig-species=Humulus lupulus
|orig-author=L.
|orig-common=Common hop
|orig-habitat=Openings
|orig-date=Oct. 1, 1979
}}
|}

And reducing the "Herbarium of Baltimore Woods specimen" template to

bgcolor="#fadada"|[[link|{{{1}}}]]

is also enough to reproduce the issue.

The corresponding smaller wikitext is:

{|
|-
|{{Herbarium of Baltimore Woods specimen|aaa<ref>bbb</ref>}}
|}

However, while the larger example also leads to a crash when the linter is off, this smaller example does not.

Current understanding of the issue:

  • The issue is introduced during the fixups phase. Pre-fixup, the [1] is contained in an <a> tag inside a <sup> tag; post-fixup, that link has been extracted from the <sup> into a later span and the <sup> is left empty, which is what the lintHandler is complaining about.
  • This happens during the reparseTemplatedAttributes method of the TableFixups, more specifically at https://gerrit.wikimedia.org/g/mediawiki/services/parsoid/+/4b1ae7dce10ad5928745a0b956abeabc5cdfebea/src/Wt2Html/PP/Handlers/TableFixups.php#429. We go through that path because we do happen to be in a template that does have attributes.
  • My current hypothesis is that something unexpected happens with the re-parsing of the content of the cell before setting the new innerHTML of the cell. To be continued: entering the Remex realm of debugging now ;)

It turns out that the <sup> tag containing the [1] is itself in a link - which is probably the initial source of the issue.

Since the spec https://html.spec.whatwg.org/multipage/text-level-semantics.html#the-a-element says that there shall be no "interactive" content inside a link, Remex splitting the new link out of the initial link when it encounters nested links seems reasonable. Maybe there's a way to hoist the <sup> along with it, which would solve the issue 🤔
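
To make the nested-link behaviour above concrete, here is a rough Python sketch. It is a deliberately simplified model of one tree-building rule (an <a> start tag seen while another <a> is open closes the open one first). It is not Remex's implementation and not the full adoption agency algorithm from the HTML spec; the parse and serialize helpers are hypothetical names used only for this illustration.

```python
import re

# Tokenizer for attribute-free tags and text only (enough for this illustration).
TOKEN = re.compile(r"<(/?)(\w+)>|([^<]+)")


def parse(html):
    """Build a tiny tree, modelling only the 'no <a> inside <a>' rule."""
    root = {"tag": None, "children": []}
    stack = [root]  # stack of open elements
    for close, tag, text in TOKEN.findall(html):
        names = [node["tag"] for node in stack]
        if text:
            stack[-1]["children"].append(text)
        elif close:
            # End tags that match nothing open are ignored;
            # otherwise pop back to (and including) the nearest match.
            if tag in names:
                idx = len(names) - 1 - names[::-1].index(tag)
                del stack[idx:]
        else:
            if tag == "a" and "a" in names:
                # An <a> is already open: close it, and everything
                # nested inside it, before opening the new one.
                del stack[names.index("a"):]
            node = {"tag": tag, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
    return root


def serialize(node):
    """Serialize the tiny tree back to markup."""
    if isinstance(node, str):
        return node
    inner = "".join(serialize(child) for child in node["children"])
    if node["tag"] is None:
        return inner
    return f"<{node['tag']}>{inner}</{node['tag']}>"


print(serialize(parse("<p><a>x<sup><a>[1]</a></sup>y</a></p>")))
# prints: <p><a>x<sup></sup></a><a>[1]</a>y</p>
```

In this model the inner link ends up as a sibling after the outer one and the <sup> is left empty, which matches the empty <sup> that the lintHandler trips over.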

Change 768063 had a related patch set uploaded (by Isabelle Hurbain-Palatin; author: Isabelle Hurbain-Palatin):

[mediawiki/services/parsoid@master] WIP - fiddling with the case "reference within a link"

https://gerrit.wikimedia.org/r/768063

Aha ... so, that is the real bug here then.

[subbu@earth:~/work/wmf/parsoid] echo "[[Foo|x<ref>y</ref>y]]" | php bin/parse.php
<p data-parsoid='{"dsr":[0,22,0,0]}'><a rel="mw:WikiLink" href="./Foo" title="Foo" class="mw-redirect" data-parsoid='{"stx":"piped","a":{"href":"./Foo"},"sa":{"href":"Foo"},"dsr":[0,22,6,2]}'>x<sup about="#mwt2" class="mw-ref reference" id="cite_ref-1" rel="dc:references" typeof="mw:Extension/ref" data-parsoid='{"dsr":[7,19,5,6]}' data-mw='{"name":"ref","attrs":{},"body":{"id":"mw-reference-text-cite_note-1"}}'><a href="./Main_Page#cite_note-1" style="counter-reset: mw-Ref 1;" data-parsoid="{}"><span class="mw-reflink-text" data-parsoid="{}">[1]</span></a></sup>y</a></p>

<div class="mw-references-wrap" typeof="mw:Extension/references" about="#mwt3" data-parsoid='{"dsr":[23,23,0,0]}' data-mw='{"name":"references","attrs":{},"autoGenerated":true}'><ol class="mw-references references" data-parsoid="{}"><li about="#cite_note-1" id="cite_note-1" data-parsoid="{}"><a href="./Main_Page#cite_ref-1" rel="mw:referencedBy" data-parsoid="{}"><span class="mw-linkback-text" data-parsoid="{}">↑ </span></a> <span id="mw-reference-text-cite_note-1" class="mw-reference-text" data-parsoid="{}">y</span></li></ol></div>

See how Parsoid blindly embeds an <a> tag in an <a> tag, which will then get broken up in the user's browser. It just happens that in the table fixups case, it gets handled internally.

So, we have a lint category for links within links ... maybe this should be flagged there as a lint error ... plus unpackDOMFragments needs to recognize this as a link-in-link scenario.
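
As a rough illustration of what flagging this could look like, the Python sketch below scans serialized output for an <a> opened while another <a> is still open. It is not Parsoid's Linter code: NestedLinkFinder and find_nested_links are hypothetical names, and it relies on Python's html.parser, which only tokenizes and does not restructure the markup, so the nesting in the output is seen exactly as emitted.

```python
from html.parser import HTMLParser


class NestedLinkFinder(HTMLParser):
    """Record every <a> that is opened while another <a> is still open."""

    def __init__(self):
        super().__init__()
        self.open_links = []  # hrefs of the currently open <a> elements
        self.nested = []      # (outer href, inner href) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if self.open_links:
                self.nested.append((self.open_links[-1], href))
            self.open_links.append(href)

    def handle_endtag(self, tag):
        if tag == "a" and self.open_links:
            self.open_links.pop()


def find_nested_links(html):
    finder = NestedLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.nested


# Abbreviated version of the parse.php output above.
html = (
    '<a rel="mw:WikiLink" href="./Foo">x'
    '<sup class="mw-ref reference"><a href="./Main_Page#cite_note-1">'
    '<span class="mw-reflink-text">[1]</span></a></sup>y</a>'
)
print(find_nested_links(html))
# prints: [('./Foo', './Main_Page#cite_note-1')]
```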

ssastry triaged this task as Medium priority. (Mar 9 2022, 1:04 AM)

Change 787508 had a related patch set uploaded (by Isabelle Hurbain-Palatin; author: Isabelle Hurbain-Palatin):

[mediawiki/services/parsoid@master] WIP - Experiment with pipeline for references in links

https://gerrit.wikimedia.org/r/787508

Change 787550 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/services/parsoid@master] Only lint content defined by a specific ref

https://gerrit.wikimedia.org/r/787550

Change 787550 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Only lint content defined by a specific ref

https://gerrit.wikimedia.org/r/787550

Change 787508 abandoned by Isabelle Hurbain-Palatin:

[mediawiki/services/parsoid@master] WIP - Experiment with pipeline for references in links

Reason:

https://gerrit.wikimedia.org/r/787508

Change 768063 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Hoisting references outside of links

https://gerrit.wikimedia.org/r/768063

Change 792236 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/vendor@master] Bump parsoid to 0.16.0-a8

https://gerrit.wikimedia.org/r/792236

Change 792236 merged by jenkins-bot:

[mediawiki/vendor@master] Bump parsoid to 0.16.0-a8

https://gerrit.wikimedia.org/r/792236