
Parsoid's Fragment Mode support doesn't process strip markers recursively inside StripMarker::split
Closed, Resolved | Public | BUG REPORT

Description

Not totally certain whether this is an issue with Cite('s Parsoid implementation) or with Parsoid itself, but see this discussion.

Steps to replicate the issue (include links if applicable):

What happens?:

image.png (295×931 px, 32 KB)

What should have happened instead?:
https://en.wikipedia.org/w/index.php?title=2024_Men%27s_T20_World_Cup_Super_8_stage&oldid=1278076374&useparsoid=0#Teams :

image.png (230×829 px, 24 KB)

Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):

Other information (browser name/version, screenshots, etc.):

This concerns wikitext that is essentially of the form {{#tag:ref|Text<ref>Name</ref>}}. In this case it goes through {{efn}}, which has a module underneath.


Event Timeline

Confirmed on a test server that this patch is the cause of that -- we'll investigate and fix OR revert that patch on Monday.

I can reproduce this locally with this wikitext:

{{#tag:poem|Text<ref>Name</ref>}}

{{#tag:poem|Text<syntaxhighlight lang='php'>$a=1</syntaxhighlight>}}

{{#tag:indicator|<templatestyles src="GeoData/styles.css" />''foo''|name=bar}}

If I set $wgParsoidFragmentSupport = 'v2'; in my LocalSettings.php, I get the HTML output:

<div class="poem" typeof="mw:Extension/poem mw:Transclusion" about="#mwt2" id="mwAg" data-mw="{&quot;name&quot;:&quot;poem&quot;,&quot;attrs&quot;:{},&quot;body&quot;:{&quot;extsrc&quot;:&quot;Text'\&quot;`UNIQ--ref-00000000-QINU`\&quot;'&quot;},&quot;parts&quot;:[{&quot;template&quot;:{&quot;target&quot;:{&quot;wt&quot;:&quot;#tag:poem&quot;,&quot;function&quot;:&quot;tag&quot;},&quot;params&quot;:{&quot;1&quot;:{&quot;wt&quot;:&quot;Text<ref>Name</ref>&quot;}},&quot;i&quot;:0}}]}"><p id="mwAw">Text'"`UNIQ--ref-00000000-QINU`"'</p></div>

<div class="poem" typeof="mw:Extension/poem mw:Transclusion" about="#mwt6" id="mwBA" data-mw="{&quot;name&quot;:&quot;poem&quot;,&quot;attrs&quot;:{},&quot;body&quot;:{&quot;extsrc&quot;:&quot;Text'\&quot;`UNIQ--syntaxhighlight-00000002-QINU`\&quot;'&quot;},&quot;parts&quot;:[{&quot;template&quot;:{&quot;target&quot;:{&quot;wt&quot;:&quot;#tag:poem&quot;,&quot;function&quot;:&quot;tag&quot;},&quot;params&quot;:{&quot;1&quot;:{&quot;wt&quot;:&quot;Text<syntaxhighlight lang='php'>$a=1</syntaxhighlight>&quot;}},&quot;i&quot;:0}}]}"><p id="mwBQ">Text'"`UNIQ--syntaxhighlight-00000002-QINU`"'</p></div>

<meta typeof="mw:Extension/indicator mw:Transclusion" about="#mwt10" id="mwBg" data-mw="{&quot;name&quot;:&quot;indicator&quot;,&quot;attrs&quot;:{&quot;name&quot;:&quot;bar&quot;},&quot;body&quot;:{&quot;extsrc&quot;:&quot;'\&quot;`UNIQ--templatestyles-00000004-QINU`\&quot;'''foo''&quot;},&quot;html&quot;:&quot;'\&quot;`UNIQ--templatestyles-00000004-QINU`\&quot;'<i data-parsoid=\&quot;{}\&quot;>foo</i>&quot;,&quot;parts&quot;:[{&quot;template&quot;:{&quot;target&quot;:{&quot;wt&quot;:&quot;#tag:indicator&quot;,&quot;function&quot;:&quot;tag&quot;},&quot;params&quot;:{&quot;1&quot;:{&quot;wt&quot;:&quot;<templatestyles src=\&quot;GeoData/styles.css\&quot; />''foo''&quot;},&quot;name&quot;:{&quot;wt&quot;:&quot;bar&quot;}},&quot;i&quot;:0}}]}">

I think the fragment support code probably has some missing support for the #tag parser function.
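For anyone checking other pages for the same symptom, here is a minimal Python sketch that scans rendered HTML for leaked strip markers. The marker shape is inferred from the output above ('"`UNIQ--<tag>-<8 hex digits>-QINU`"'); the optional \x7f delimiters and the helper name find_leaked_markers are assumptions for illustration, not part of MediaWiki or Parsoid.

```python
import re

# Assumption: markers look like '"`UNIQ--<tag>-<8 hex digits>-QINU`"',
# optionally wrapped in \x7f (DEL) characters that are invisible in a paste.
MARKER_RE = re.compile(
    r"\x7f?'\"`UNIQ--([A-Za-z0-9]+)-([0-9A-Fa-f]{8})-QINU`\"'\x7f?"
)


def find_leaked_markers(html: str) -> list[tuple[str, str]]:
    """Return (tag name, hex id) for every strip marker left in the HTML."""
    return [(m.group(1), m.group(2)) for m in MARKER_RE.finditer(html)]


# Element content taken from the first <div class="poem"> output above.
sample = '<p id="mwAw">Text\'"`UNIQ--ref-00000000-QINU`"\'</p>'
print(find_leaked_markers(sample))
```

This only matches markers in plain element content. The copies inside data-mw are entity- and JSON-escaped (\&quot;), so they would need decoding first.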

This is a bug in the StripState.php change at https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1088365/8/includes/parser/StripState.php, which doesn't process strip markers recursively the way StripState::unstripType does.
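To illustrate what "recursively" means here, below is a toy Python model of a strip-state split. It is not the MediaWiki PHP code: the state dictionary, the marker() helper, the piece format and the depth limit are all assumptions made for the sketch. The point is only that the content stored for one marker (the poem) can itself contain another marker (the ref), so the split has to descend into marker content rather than stop at the top level.

```python
import re

# Assumption: same visible marker shape as in the HTML output above.
MARKER_RE = re.compile(r"('\"`UNIQ--[a-z]+-[0-9a-f]{8}-QINU`\"')")


def marker(tag: str, n: int) -> str:
    """Build a marker string for the toy model (hypothetical helper)."""
    return f"'\"`UNIQ--{tag}-{n:08x}-QINU`\"'"


def split(text, state, depth=0, max_depth=20):
    """Split text into string/marker pieces, expanding nested markers."""
    if depth > max_depth:
        raise RecursionError("strip marker nesting too deep")
    pieces = []
    # re.split with a capturing group alternates plain text and markers.
    for i, part in enumerate(MARKER_RE.split(text)):
        if i % 2 == 0:
            if part:
                pieces.append({"type": "string", "content": part})
        elif part in state:
            kind, content = state[part]
            # The recursive step: marker content may hold further markers.
            nested = split(content, state, depth + 1, max_depth)
            pieces.append({"type": kind, "content": nested})
        else:
            # Unknown marker: keep it as literal text.
            pieces.append({"type": "string", "content": part})
    return pieces


# Toy state for {{#tag:poem|Text<ref>Name</ref>}}: the inner <ref> is
# stripped first, then the poem body is stored with the ref marker inside.
REF = marker("ref", 0)
POEM = marker("poem", 1)
state = {REF: ("ref", "Name"), POEM: ("poem", "Text" + REF)}

result = split(POEM, state)
print(result)
```

Without the recursive call, the poem piece would carry the raw text "Text" followed by the ref marker, which resembles the extsrc value in the output above.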

Change #1123779 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/core@master] WIP: Process strip markers recursively in split

https://gerrit.wikimedia.org/r/1123779

Change #1123815 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[operations/mediawiki-config@master] Revert "Turn on Parsoid fragment support everywhere"

https://gerrit.wikimedia.org/r/1123815

Change #1123815 merged by jenkins-bot:

[operations/mediawiki-config@master] Revert "Turn on Parsoid fragment support everywhere"

https://gerrit.wikimedia.org/r/1123815

Mentioned in SAL (#wikimedia-operations) [2025-03-03T14:42:57Z] <ihurbain@deploy2002> Started scap sync-world: Backport for [[gerrit:1123495|Change license for Russian Wikinews to CC-BY-4.0 (T387279)]], [[gerrit:1123815|Revert "Turn on Parsoid fragment support everywhere" (T387608)]]

Mentioned in SAL (#wikimedia-operations) [2025-03-03T14:46:43Z] <ihurbain@deploy2002> matmarex, ssastry, ihurbain: Backport for [[gerrit:1123495|Change license for Russian Wikinews to CC-BY-4.0 (T387279)]], [[gerrit:1123815|Revert "Turn on Parsoid fragment support everywhere" (T387608)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2025-03-03T14:54:37Z] <ihurbain@deploy2002> Finished scap sync-world: Backport for [[gerrit:1123495|Change license for Russian Wikinews to CC-BY-4.0 (T387279)]], [[gerrit:1123815|Revert "Turn on Parsoid fragment support everywhere" (T387608)]] (duration: 11m 39s)

A similar issue was also reported at https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#List-defined_references_not_working_on_Android_app.

The article rendered like this in Parsoid at the time:

Screenshot 2025-03-03 at 16-29-39 Apollonius quadrilateral - Wikipedia.png (1×1 px, 205 KB)

Now it looks fixed:

Screenshot 2025-03-03 at 16-29-45 Apollonius quadrilateral - Wikipedia.png (1×1 px, 214 KB)

ssastry renamed this task from "Refs inside {{efn}} are now outputting strip markers in Parsoid" to "Parsoid's Fragment Mode support doesn't process strip markers recursively inside StripMarker::split". (Mar 3 2025, 6:55 PM)
ssastry triaged this task as High priority.
ssastry updated Other Assignee, added: ssastry.

Change #1127076 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/core@wmf/1.44.0-wmf.20] Process strip markers recursively in split

https://gerrit.wikimedia.org/r/1127076

Change #1127076 abandoned by Subramanya Sastry:

[mediawiki/core@wmf/1.44.0-wmf.20] Process strip markers recursively in split

Reason:

The updated patch in master is now failing CI -- need to address that first.

https://gerrit.wikimedia.org/r/1127076

Change #1123779 merged by jenkins-bot:

[mediawiki/core@master] Fixes to "Parsoid Fragment Support v2"

https://gerrit.wikimedia.org/r/1123779

Change #1127076 restored by Jforrester:

[mediawiki/core@wmf/1.44.0-wmf.20] Process strip markers recursively in split

https://gerrit.wikimedia.org/r/1127076

Change #1127076 abandoned by Subramanya Sastry:

[mediawiki/core@wmf/1.44.0-wmf.20] Fixes to "Parsoid Fragment Support v2"

Reason:

We won't backport this. https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1127630 is also needed. So, we will just let them both ride the train. Then on Thursday, run rt-testing in Parsoid with this enabled, and then turn on v2 fragment mode the following week. So, we'll lose a week, but this has given us enough trouble that it is better to do a bit more thorough testing.

https://gerrit.wikimedia.org/r/1127076

Change #1130343 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[operations/mediawiki-config@master] Turn on Parsoid fragment support everywhere (take 2)

https://gerrit.wikimedia.org/r/1130343

Change #1130343 merged by jenkins-bot:

[operations/mediawiki-config@master] Turn on Parsoid fragment support everywhere (take 2)

https://gerrit.wikimedia.org/r/1130343

Mentioned in SAL (#wikimedia-operations) [2025-03-24T13:33:27Z] <tgr@deploy1003> Started scap sync-world: Backport for [[gerrit:1130343|Turn on Parsoid fragment support everywhere (take 2) (T374661 T380758 T389545 T387608)]], [[gerrit:1130560|Do not throw an exception after shared-domain login with no token (T362715)]], [[gerrit:1130561|Do not start central login from the shared domain (T362715)]]

Mentioned in SAL (#wikimedia-operations) [2025-03-24T13:37:04Z] <tgr@deploy1003> tgr, cscott: Backport for [[gerrit:1130343|Turn on Parsoid fragment support everywhere (take 2) (T374661 T380758 T389545 T387608)]], [[gerrit:1130560|Do not throw an exception after shared-domain login with no token (T362715)]], [[gerrit:1130561|Do not start central login from the shared domain (T362715)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2025-03-24T13:54:09Z] <tgr@deploy1003> Finished scap sync-world: Backport for [[gerrit:1130343|Turn on Parsoid fragment support everywhere (take 2) (T374661 T380758 T389545 T387608)]], [[gerrit:1130560|Do not throw an exception after shared-domain login with no token (T362715)]], [[gerrit:1130561|Do not start central login from the shared domain (T362715)]] (duration: 20m 42s)

Change #1135499 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/core@master] Avoid strip markers in HTML when using {{#tag}}

https://gerrit.wikimedia.org/r/1135499

Change #1135499 merged by jenkins-bot:

[mediawiki/core@master] Prevent strip markers in HTML when using {{#tag}}

https://gerrit.wikimedia.org/r/1135499