When testing T250937, I noticed that <meta typeof="mw:Includes/..."> nodes always seem to be serialized to wikitext with newlines around them, even if there weren't any in the original wikitext.
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
Bump Parsoid to v0.12.0-a15 | mediawiki/vendor | master | +1 K -641
<*include*> tags don't need newlines before/after | mediawiki/services/parsoid | master | +6 -6
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | matmarex | T250937 Noinclude with the VisualEditor
Resolved | | ssastry | T253703 <meta typeof="mw:Includes/..."> nodes are always serialized to wikitext with newlines around them
Event Timeline
Minimal test case:
Input HTML:
<p>a<meta typeof="mw:Includes/NoInclude">b<meta typeof="mw:Includes/NoInclude/End">c</p>
Expected output wikitext:
a<noinclude>b</noinclude>c
Actual output wikitext:
a
<noinclude>
b
</noinclude>
c
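The expected mapping in this test case can be sketched with a toy converter. This is only an illustration, not Parsoid's serializer: `toy_html2wt` and `META_RE` are hypothetical names, and the sketch handles only the NoInclude marker pair and the `<p>` wrapper used above. It adds no whitespace of its own, so any newlines in the output come from the input HTML.

```python
import re

# Toy stand-in for the html -> wikitext step; NOT Parsoid's implementation.
# Matches the opening and closing NoInclude marker <meta> tags.
META_RE = re.compile(r'<meta typeof="mw:Includes/NoInclude(/End)?"\s*/?>')


def toy_html2wt(html: str) -> str:
    """Replace NoInclude marker metas with <noinclude> tags, adding no whitespace."""
    # Drop the paragraph wrapper used in the minimal test case.
    body = re.sub(r'^<p>|</p>$', '', html.strip())
    # Group 1 is set only for the ".../End" marker, which closes the tag.
    return META_RE.sub(
        lambda m: '</noinclude>' if m.group(1) else '<noinclude>', body
    )


html = ('<p>a<meta typeof="mw:Includes/NoInclude">b'
        '<meta typeof="mw:Includes/NoInclude/End">c</p>')
print(toy_html2wt(html))  # a<noinclude>b</noinclude>c
```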
Change 598888 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/services/parsoid@master] WIP: <*include*> tags don't need newlines before/after
Change 598888 merged by jenkins-bot:
[mediawiki/services/parsoid@master] <*include*> tags don't need newlines before/after
Even with this patch, I still see the issue locally when editing pages with VisualEditor. Am I misunderstanding something?
Change 601428 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/vendor@master] Bump Parsoid to v0.12.0-a15
Change 601428 merged by jenkins-bot:
[mediawiki/vendor@master] Bump Parsoid to v0.12.0-a15
I imagine VE is sending along newlines and Parsoid uses those. In the html->wt code path, Parsoid's constraints either set up min/max newlines or are absent, in which case the newlines in the HTML are transferred over. See line 123 in https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/598888/4/src/Html2Wt/DOMHandlers/MetaHandler.php
[subbu@earth:~/work/wmf/parsoid] echo '<p>a<meta typeof="mw:Includes/NoInclude">b<meta typeof="mw:Includes/NoInclude/End">c</p>' | php bin/parse.php --html2wt
a<noinclude>b</noinclude>c
[subbu@earth:~/work/wmf/parsoid] echo -e '<p>a\n<meta typeof="mw:Includes/NoInclude">\nb\n<meta typeof="mw:Includes/NoInclude/End">\nc</p>' | php bin/parse.php --html2wt
a
<noinclude>
b
</noinclude>
c
But if we just want to drop all newlines from the HTML around noincludes, I could set the constraints to always force newlines to zero.
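A simplified model of the behavior described in these comments, for illustration only. It assumes a separator constraint is either absent or a (min, max) pair of newline counts; `emitted_newlines` is a hypothetical helper, and the real logic in Parsoid's html->wt code is more involved than this.

```python
from typing import Optional, Tuple


def emitted_newlines(html_newlines: int,
                     constraints: Optional[Tuple[int, int]]) -> int:
    """Newlines to emit between two nodes in this toy model.

    html_newlines: newlines found in the HTML between the nodes.
    constraints:   (min, max) newline counts, or None when unconstrained.
    """
    if constraints is None:
        # No constraints: whatever the client sent is transferred over.
        return html_newlines
    lo, hi = constraints
    # Clamp the HTML's newline count into the [min, max] range.
    return max(lo, min(hi, html_newlines))


# Unconstrained: the output depends on the HTML the client sends.
print(emitted_newlines(0, None))    # 0
print(emitted_newlines(1, None))    # 1
# Forcing zero: newlines around the tag are always dropped.
print(emitted_newlines(1, (0, 0)))  # 0
```

In this model, the option of forcing newlines to zero corresponds to the `(0, 0)` constraint, which makes the output independent of what the editing client sends.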
I think I must have been testing with the wrong version of Parsoid, as I can't reproduce the problem I was seeing any more.
It is possible ... But note that, as my command-line reproduction showed, this issue exists in Parsoid, and the output depends on the HTML that editing clients like VE send along. If we need to always forcibly suppress those newlines, we can, but I am not certain that is necessary.