
Templated lists swallow following category links
Open, Medium, Public

Description

[subbu@earth:~/work/wmf/parsoid] cat /tmp/wt
{{1x|*x}}




[[Category:Bar]]

[subbu@earth:~/work/wmf/parsoid] php bin/parse.php  < /tmp/wt 
<span about="#mwt1" typeof="mw:Transclusion" data-parsoid='{"pi":[[{"k":"1"}]],"dsr":[0,30,null,null]}' data-mw='{"parts":[{"template":{"target":{"wt":"1x","href":"./Template:1x"},"params":{"1":{"wt":"*x"}},"i":0}},"\n\n\n\n\n[[Category:Bar]]"]}'>
</span><ul about="#mwt1"><li>x




<link rel="mw:PageProp/Category" href="./Category:Bar"/></li></ul>

If you rerun it with --trace html, you will see that the category link has been swallowed into the list item already. So, this is likely a list token handler bug.
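One way to check for this symptom in the output is to look for a category link nested inside a list item. The following is a minimal sketch using only Python's standard `html.parser`; it is not part of Parsoid. The class and function names (`SwallowedCategoryFinder`, `swallowed_categories`) are made up for illustration, and the sample HTML is the output above with the `data-parsoid` and `data-mw` attributes trimmed.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not tracked on the stack.
VOID_TAGS = {"link", "meta", "br", "hr", "img", "input"}


class SwallowedCategoryFinder(HTMLParser):
    """Collect hrefs of category <link> elements that sit inside an <li>."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open non-void tags
        self.swallowed = []  # hrefs of category links found inside a list item

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == "mw:PageProp/Category":
            if "li" in self.open_tags:
                self.swallowed.append(attrs.get("href"))
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray or void end tags.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass


def swallowed_categories(html):
    """Return the hrefs of all category links nested inside list items."""
    finder = SwallowedCategoryFinder()
    finder.feed(html)
    finder.close()
    return finder.swallowed


# Trimmed version of the output shown in the description.
buggy_html = '''<span about="#mwt1" typeof="mw:Transclusion">
</span><ul about="#mwt1"><li>x

<link rel="mw:PageProp/Category" href="./Category:Bar"/></li></ul>'''

# Hypothetical shape with the category link outside the list.
separate_html = '''<ul><li>x</li></ul>

<link rel="mw:PageProp/Category" href="./Category:Bar"/>'''

print(swallowed_categories(buggy_html))
print(swallowed_categories(separate_html))
```

The first call reports `./Category:Bar` as swallowed; the second finds nothing because the link sits after the closing `</ul>`.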


Original Bug Report below:
On this page: https://ko.wikipedia.org/wiki/위키백과:질문방?veaction=edit&uselang=en, the big transclusion covering most of the page unexpectedly leaks to the content following it (a category link).


I can't figure out why (the page doesn't seem to have any unclosed tags). Might it be a bug?

Event Timeline

ssastry renamed this task from "Transclusion unexpectedly leaks to the following content" to "Templated lists swallow following category links". Aug 10 2020, 11:09 PM
ssastry triaged this task as Medium priority.
ssastry updated the task description.
ssastry updated the task description.
ssastry moved this task from Needs Triage to Bugs & Crashers on the Parsoid board.

Change 673289 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/services/parsoid@master] WIP: Handle mw:EmptyLine tokens as NlTk in ListHandler

https://gerrit.wikimedia.org/r/673289

Change 673289 abandoned by Subramanya Sastry:
[mediawiki/services/parsoid@master] WIP: Handle mw:EmptyLine tokens as NlTk in ListHandler

Reason:
Meh .. looks like some other tests explicitly depend on current behavior because of how categories and newlines are handled. I thought this was a "quick fix" but it looks like it is not. Since I won't be able to get to this soon, abandoning.

https://gerrit.wikimedia.org/r/673289

ssastry added a subscriber: ssastry.

Meh .. looks like some other tests explicitly depend on current behavior because of how categories and newlines are handled. I thought this was a "quick fix" but it looks like it is not. Since I won't be able to get to this soon, abandoning WIP and unassigning.

https://github.com/wikimedia/parsoid/blob/master/src/Wt2Html/PP/Handlers/LiFixups.php#L155-L179

	 * Earlier in the parsing pipeline, we suppress all newlines
	 * and other whitespace before categories which causes category
	 * links to be swallowed into preceding paragraphs and list items.

However, migrateTrailingCategories returns early when the list item is templated, since the newlines are already captured in the data-mw parts and there's a risk of migrating a category out of the template boundary and introducing the dirty diffs we're trying to prevent.
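To make the interaction concrete, here is a toy Python model of the behavior described above. It is a sketch under stated assumptions, not the Parsoid implementation: the `Node` class, the `is_templated` check (any element with an `about` attribute) and this `migrate_trailing_categories` function are hypothetical stand-ins, and the real logic in LiFixups.php may differ in its details.

```python
from dataclasses import dataclass, field

CATEGORY_REL = "mw:PageProp/Category"


@dataclass
class Node:
    """Tiny stand-in for a DOM element; children may be Nodes or text strings."""
    name: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def is_category_link(node):
    return (
        isinstance(node, Node)
        and node.name == "link"
        and node.attrs.get("rel") == CATEGORY_REL
    )


def is_templated(node):
    # Assumption for this sketch: an "about" id marks template-generated output.
    return "about" in node.attrs


def migrate_trailing_categories(parent, list_node):
    """Move category links trailing the last <li> to just after the list.

    Returns the number of links moved. Templated lists are left untouched,
    which mirrors the early return described above.
    """
    if not list_node.children:
        return 0
    last_li = list_node.children[-1]
    if is_templated(list_node) or is_templated(last_li):
        return 0  # early return: do not move content across a template boundary

    moved = []
    while last_li.children and is_category_link(last_li.children[-1]):
        moved.insert(0, last_li.children.pop())

    # Locate the list by identity and splice the links in right after it.
    idx = next(i for i, c in enumerate(parent.children) if c is list_node) + 1
    parent.children[idx:idx] = moved
    return len(moved)


def build(list_attrs):
    """Build body > ul > li > ["x", category link] with the given <ul> attributes."""
    cat = Node("link", {"rel": CATEGORY_REL, "href": "./Category:Bar"})
    li = Node("li", dict(list_attrs), ["x", cat])
    ul = Node("ul", dict(list_attrs), [li])
    body = Node("body", {}, [ul])
    return body, ul, li, cat


plain_body, plain_ul, plain_li, plain_cat = build({})
plain_moved = migrate_trailing_categories(plain_body, plain_ul)

tpl_body, tpl_ul, tpl_li, tpl_cat = build({"about": "#mwt1"})
tpl_moved = migrate_trailing_categories(tpl_body, tpl_ul)
```

In this model the plain list gives up its trailing category link, which ends up as a sibling after the `ul`. The list carrying `about="#mwt1"` keeps the link inside its `li`, which matches the output in the task description.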