Parsoid incorporates text after a balanced template into the template if immediately preceded by an image
Open, Needs Triage · Public · BUG REPORT

Description

The following wikitext renders correctly in Parsoid as a <p> tag with two children: a span carrying the template metadata for the transclusion, followed by a text node:

{{1x|1=<span></span>}}Content afterwards that should be independent

However, adding an image right before causes the paragraph to be treated as a part of the template:

[[File:Image.svg|thumb]]{{1x|1=<span></span>}}Content afterwards that should be independent

(see the HTML Parsoid gives at https://en.wikipedia.org/w/rest.php/v1/page/User:Vahurzpu%2FSandbox_for_debugging_Parsoid/html)

Seen in the wild by another user who was trying to edit the first paragraph of https://en.wikipedia.org/wiki/Pelican_Lake_Indian_Residential_School, which VisualEditor treats as being entirely a child of {{Coord missing}}.

Event Timeline

This is an edge case introduced by Parsoid's attempt to be bug-compatible with T134469.

$ echo '{{1x|1=<span></span>}}Content afterwards that should be independent' | php bin/parse.php --trace html
0-[HTML]       | {"type":"TagTk","name":"p","attribs":[],"dataParsoid":{"tmp":{"tagId":1,"bits":0}}}
0-[HTML]       | {"type":"SelfclosingTagTk","name":"meta","attribs":[{"k":"typeof","v":"mw:Transclusion"},{"k":"about","v":"#mwt1"}],"dataParsoid":{"tmp":{"tagId":2,"bits":0,"tplarginfo":{"targetWt":"1x","func":null,"href":"./Template:1x","paramInfos":{"1":{"k":"1","named":true}}}},"tsr":[0,22],"src":"{{1x|1=<span></span>}}"}}
0-[HTML]       | {"type":"TagTk","name":"span","attribs":[],"dataParsoid":{"tmp":{"tagId":3,"bits":256},"stx":"html"}}
0-[HTML]       | {"type":"EndTagTk","name":"span","attribs":[],"dataParsoid":{"tmp":{"tagId":null,"bits":256},"stx":"html"}}
0-[HTML]       | {"type":"SelfclosingTagTk","name":"meta","attribs":[{"k":"typeof","v":"mw:Transclusion/End"},{"k":"about","v":"#mwt1"}],"dataParsoid":{"tmp":{"tagId":4,"bits":256},"tsr":[null,22]}}
0-[HTML]       | "Content afterwards that should be independent"
0-[HTML]       | {"type":"EndTagTk","name":"p","attribs":[],"dataParsoid":{"tmp":{"tagId":null,"bits":0}}}
0-[HTML]       | {"type":"NlTk","dataParsoid":{"tmp":{"tagId":null,"bits":0},"tsr":[67,68]}}
0-[HTML]       | {"type":"EOFTk"}

Here, p-wrapping has already happened before the DOM is built: the p tags are visible in the token trace above.
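For anyone comparing the two traces by hand, a small script can check whether an opening p token shows up in a `--trace html` dump. This is only a sketch: the helper names (`token_summary`, `has_open_p`) are hypothetical, and the sample lines are abbreviated copies of lines from the traces in this comment with most fields dropped.

```python
import json


def token_summary(trace_lines):
    """Return (type, name) pairs for the object tokens in a --trace html dump.

    Assumes each trace line looks like '0-[HTML]       | <json>', as in the
    traces quoted in this task. Plain-string tokens (text) are skipped.
    """
    pairs = []
    for line in trace_lines:
        _, sep, payload = line.partition("| ")
        if not sep:
            continue  # not a trace line
        try:
            token = json.loads(payload)
        except json.JSONDecodeError:
            continue  # e.g. a dump line rather than a token
        if isinstance(token, dict):
            pairs.append((token.get("type"), token.get("name")))
    return pairs


def has_open_p(trace_lines):
    """True if the token stream already contains an opening <p> tag."""
    return ("TagTk", "p") in token_summary(trace_lines)


# Abbreviated sample lines (most fields removed for brevity).
template_only_trace = [
    '0-[HTML]       | {"type":"TagTk","name":"p","attribs":[]}',
    '0-[HTML]       | {"type":"SelfclosingTagTk","name":"meta","attribs":[{"k":"typeof","v":"mw:Transclusion"}]}',
    '0-[HTML]       | "Content afterwards that should be independent"',
    '0-[HTML]       | {"type":"EOFTk"}',
]
image_first_trace = [
    '0-[HTML]       | {"type":"TagTk","name":"figure","attribs":[]}',
    '0-[HTML]       | {"type":"SelfclosingTagTk","name":"meta","attribs":[{"k":"typeof","v":"mw:Transclusion"}]}',
    '0-[HTML]       | "Content afterwards that should be independent"',
    '0-[HTML]       | {"type":"EOFTk"}',
]

print(has_open_p(template_only_trace))  # True
print(has_open_p(image_first_trace))    # False
```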

$ echo '[[File:Image.svg|thumb]]{{1x|1=<span></span>}}Content afterwards that should be independent' | php bin/parse.php --trace html  --dump dom:post-pwrap
0-[HTML]       | {"type":"TagTk","name":"figure","attribs":[{"k":"class","v":"mw-default-size"},{"k":"typeof","v":"mw:File/Thumb"}],"dataParsoid":{"tmp":{"tagId":1,"bits":0},"tsr":[0,24],"optList":[{"ck":"thumbnail","ak":"thumb"}]}}
0-[HTML]       | {"type":"TagTk","name":"a","attribs":[{"k":"href","v":"./Special:FilePath/Image.svg"}],"dataParsoid":{"tmp":{"tagId":2,"bits":0}}}
0-[HTML]       | {"type":"TagTk","name":"span","attribs":[{"k":"class","v":"mw-file-element mw-broken-media"},{"k":"resource","v":"./File:Image.svg"},{"k":"data-width","v":"220"}],"dataParsoid":{"tmp":{"tagId":3,"bits":0},"a":{"resource":"./File:Image.svg"},"sa":{"resource":"File:Image.svg"}}}
0-[HTML]       | "File:Image.svg"
0-[HTML]       | {"type":"EndTagTk","name":"span","attribs":[],"dataParsoid":{"tmp":{"tagId":null,"bits":0}}}
0-[HTML]       | {"type":"EndTagTk","name":"a","attribs":[],"dataParsoid":{"tmp":{"tagId":null,"bits":0}}}
0-[HTML]       | {"type":"TagTk","name":"figcaption","attribs":[],"dataParsoid":{"tmp":{"tagId":4,"bits":0},"tsr":null}}
0-[HTML]       | {"type":"EndTagTk","name":"figcaption","attribs":[],"dataParsoid":{"tmp":{"tagId":null,"bits":0}}}
0-[HTML]       | {"type":"EndTagTk","name":"figure","attribs":[],"dataParsoid":{"tmp":{"tagId":null,"bits":0}}}
0-[HTML]       | {"type":"SelfclosingTagTk","name":"meta","attribs":[{"k":"typeof","v":"mw:Transclusion"},{"k":"about","v":"#mwt1"}],"dataParsoid":{"tmp":{"tagId":5,"bits":0,"tplarginfo":{"targetWt":"1x","func":null,"href":"./Template:1x","paramInfos":{"1":{"k":"1","named":true}}}},"tsr":[24,46],"src":"{{1x|1=<span></span>}}"}}
0-[HTML]       | {"type":"TagTk","name":"span","attribs":[],"dataParsoid":{"tmp":{"tagId":6,"bits":256},"stx":"html"}}
0-[HTML]       | {"type":"EndTagTk","name":"span","attribs":[],"dataParsoid":{"tmp":{"tagId":null,"bits":256},"stx":"html"}}
0-[HTML]       | {"type":"SelfclosingTagTk","name":"meta","attribs":[{"k":"typeof","v":"mw:Transclusion/End"},{"k":"about","v":"#mwt1"}],"dataParsoid":{"tmp":{"tagId":7,"bits":256},"tsr":[null,46]}}
0-[HTML]       | "Content afterwards that should be independent"
0-[HTML]       | {"type":"NlTk","dataParsoid":{"tmp":{"tagId":null,"bits":0},"tsr":[91,92]}}
0-[HTML]       | {"type":"EOFTk"}

[dump] ----- DOM: post-pwrap -----
<body><figure class="mw-default-size" typeof="mw:File/Thumb" data-parsoid='{"tmp":{"tagId":1,"bits":0},"tsr":[0,24],"optList":[{"ck":"thumbnail","ak":"thumb"}]}'><a href="./Special:FilePath/Image.svg" data-parsoid='{"tmp":{"tagId":2,"bits":0}}'><span class="mw-file-element mw-broken-media" resource="./File:Image.svg" data-width="220" data-parsoid='{"tmp":{"tagId":3,"bits":0},"a":{"resource":"./File:Image.svg"},"sa":{"resource":"File:Image.svg"}}'>File:Image.svg</span></a><figcaption data-parsoid='{"tmp":{"tagId":4,"bits":0},"tsr":null}'></figcaption></figure><meta typeof="mw:Transclusion" about="#mwt1" data-parsoid='{"tmp":{"tagId":5,"bits":0,"tplarginfo":{"targetWt":"1x","func":null,"href":"./Template:1x","paramInfos":{"1":{"k":"1","named":true}}}},"tsr":[24,46],"src":"{{1x|1=&lt;span>&lt;/span>}}"}'/><p><span data-parsoid='{"tmp":{"tagId":6,"bits":256},"stx":"html"}'></span><meta typeof="mw:Transclusion/End" about="#mwt1" data-parsoid='{"tmp":{"tagId":7,"bits":256},"tsr":[null,46]}'/>Content afterwards that should be independent
</p></body>

In this case, p-wrapping is not done before the DOM is built (note the missing p tags in the token trace). So the PWrap DOM pass kicks in after the fact to fix things up, and it isn't as careful about avoiding possible extensions of template boundaries. Instead of having the paragraph wrap the template-start meta tag, it leaves the meta tag outside the paragraph (see the post-pwrap DOM dump), and that causes the downstream effects.
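To illustrate why a start meta left outside the paragraph drags the trailing text into the template, here is a simplified model. It is not Parsoid's actual range-computation code: the `Node` class and `wrapped_siblings` helper are hypothetical, and the only assumption is that a template range has to cover whole sibling nodes under the lowest common ancestor of the start and end metas.

```python
class Node:
    """Tiny stand-in for a DOM node (hypothetical, not Parsoid's DOM classes)."""

    def __init__(self, name, typeof="", children=()):
        self.name = name
        self.typeof = typeof
        self.parent = None
        self.children = []
        for child in children:
            child.parent = self
            self.children.append(child)


def ancestors(node):
    """Return [node, parent, grandparent, ...] up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain


def find(root, typeof):
    """Depth-first search for the first node whose typeof matches exactly."""
    if root.typeof == typeof:
        return root
    for child in root.children:
        hit = find(child, typeof)
        if hit is not None:
            return hit
    return None


def wrapped_siblings(root):
    """Names of the sibling nodes a template range must cover in this model."""
    start_chain = ancestors(find(root, "mw:Transclusion"))
    end_chain = ancestors(find(root, "mw:Transclusion/End"))
    # Lowest common ancestor of the two metas.
    common = next(n for n in start_chain if n in end_chain)
    # Children of that ancestor which contain (or are) the start and end metas.
    first = start_chain[start_chain.index(common) - 1]
    last = end_chain[end_chain.index(common) - 1]
    siblings = common.children
    lo, hi = siblings.index(first), siblings.index(last)
    return [n.name for n in siblings[lo:hi + 1]]


def template_nodes():
    """Fresh start meta, span, end meta, and trailing text for each tree."""
    return (
        Node("meta", "mw:Transclusion"),
        Node("span"),
        Node("meta", "mw:Transclusion/End"),
        Node("#text"),
    )


# Without the image: both metas sit inside the same <p>.
start, span, end, text = template_nodes()
clean = Node("body", children=[Node("p", children=[start, span, end, text])])

# With the image: the start meta is a sibling of the <p> that holds the end meta.
start, span, end, text = template_nodes()
buggy = Node("body", children=[
    Node("figure", "mw:File/Thumb"),
    start,
    Node("p", children=[span, end, text]),
])

print(wrapped_siblings(clean))  # ['meta', 'span', 'meta']
print(wrapped_siblings(buggy))  # ['meta', 'p']
```

In the second tree the range can only end at the whole `p`, so the text node inside it becomes part of the template content, which matches what VisualEditor shows.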

Low priority bug, but shouldn't be too hard to fix up.