We have three passes (migrateTemplateMetas, migrateTrailingNLs, migrateTrailingCategories) that are effectively doing something identical .. moving rendering transparent content from beginning / end of some subset of nodes.
Since moving paragraph wrapping to the dom in https://gerrit.wikimedia.org/r/#/c/mediawiki/services/parsoid/+/443744/
we've had to make numerous adjustments to these passes that most likely apply to all three.
https://gerrit.wikimedia.org/r/#/c/mediawiki/services/parsoid/+/446985/
https://gerrit.wikimedia.org/r/#/c/mediawiki/services/parsoid/+/446208/
https://gerrit.wikimedia.org/r/#/c/mediawiki/services/parsoid/+/446322/
The latter of which probably needs another tweak for,
Seems like the call to DU.lastNonSepChild in Subbu's fix should also account for DU.isSolTransparentLink (or just categories).
Similar to the above.'uzwiki/Pablo Neruda',
The following is a bunch of notes from investigating new semantic failures after 443744.
This is another case where sol transparent node are being included at the end of a paragraph where previously
they were excluded. The old algorithm would search the line in reverse order for where to put the end tag and
it would skip this content. It's unclear whether we should try to replicate that. This particular instance was broken wt though,
https://de.wikipedia.org/w/index.php?title=Seetaktische_Lehrgruppe&type=revision&diff=179240127&oldid=165339287
https://sv.wikipedia.org/w/index.php?title=Ulrik_Plesner&type=revision&diff=43289184&oldid=32227738
https://it.wikipedia.org/w/index.php?title=Cattedrale_di_Metz&type=revision&diff=98534211&oldid=98272937
https://ko.wikipedia.org/w/index.php?title=%ED%99%8D%EC%88%98%EC%A0%84&type=revision&diff=21808118&oldid=21387860
https://ko.wikipedia.org/w/index.php?title=%EB%B0%9C%ED%95%B4_%EB%AC%B4%EC%99%95&type=revision&diff=21808146&oldid=21308398
https://nl.wikipedia.org/w/index.php?title=Good_hearted_woman_(lied)&type=revision&diff=51968655&oldid=46721681&diffmode=source'dewiki/Seetaktische_Lehrgruppe',
'svwiki/Ulrik_Plesner',
'itwiki/Cattedrale di Metz',
'kowiki/홍수전',
'kowiki/발해 무왕',
'nlwiki/Good hearted woman (lied)',
Another case where the quick semantic check now fails, this time because of trailing space included in data-mw parts.
These things roundtrip and render the same though. The semantic diff it points out is a rendering failure that's
filed as https://phabricator.wikimedia.org/T199845'frwiki/Union des démocrates et indépendants'
This case is an argument for having a migrate leading (as opposed to traililng) newlines. For example,
test 123<div>hi</div>Here the leading newlines will be included in the paragraph with "123". Of course, this only happens at the
edges between doBlockLevels and Remex p-wrapping. This is because the newlines are included in the text node.
Finding a way to keep the emitted nltks as separate textnodes and then moving the call to normalize the body before
pwrapping would also prevent it.'ruwiki/Союз Советских Социалистических Республик',