Description
For example, the output of https://en.wikipedia.org/wiki/Hampi_(town)?useparsoid=1 contains inline data-parsoid attributes in the indicator HTML. They are not stripped when the pagebundle format is requested.
Status | Subtype | Assigned | Task
---|---|---|---
Open | | None | T342464 inline data-parsoid found in indicator HTML
Open | | cscott | T348161 Parsoid Rich Attributes phase 1b
Event Timeline
As it turns out, this is a known issue. See this FIXME, which I added in an earlier patch. To deal with this automatically, I had a patch, https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/932037, but @Arlolra found that objectionable.
So we will either need to resolve that argument or write custom traversal code for saveDataParsoid.
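To make the "custom traversal" option concrete, here is a rough standalone sketch in Python (not Parsoid's PHP, and not the actual saveDataParsoid code). It assumes the embedded HTML lives as a string under an `html` key inside a JSON-valued `data-mw` attribute. That layout, and the names `strip_data_parsoid`, `_strip_in_json` and `_Stripper`, are illustrative assumptions only. The idea is that the traversal has to descend into attribute-embedded HTML, not just walk the top-level DOM.

```python
import json
from html import escape
from html.parser import HTMLParser


def _strip_in_json(value):
    """Walk a decoded JSON value; clean any string stored under an "html" key.

    ASSUMPTION: embedded fragments are stored as {"html": "<...>"} entries.
    """
    if isinstance(value, dict):
        return {
            k: strip_data_parsoid(v) if k == "html" and isinstance(v, str)
            else _strip_in_json(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [_strip_in_json(v) for v in value]
    return value


class _Stripper(HTMLParser):
    """Re-serializes tags and text, dropping data-parsoid attributes.

    Toy serializer: comments and doctypes are dropped, and raw-text
    elements such as <script> are not special-cased.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def _emit_start(self, tag, attrs, closer=">"):
        parts = [tag]
        for name, val in attrs:
            if name == "data-parsoid":
                continue  # the attribute we want gone
            if name == "data-mw" and val:
                try:
                    # Descend into the HTML embedded in the JSON payload.
                    cleaned = _strip_in_json(json.loads(val))
                    val = json.dumps(cleaned, separators=(",", ":"),
                                     ensure_ascii=False)
                except ValueError:
                    pass  # not JSON; leave the value untouched
            if val is None:
                parts.append(name)
            else:
                parts.append(f'{name}="{escape(val, quote=True)}"')
        self.out.append("<" + " ".join(parts) + closer)

    def handle_starttag(self, tag, attrs):
        self._emit_start(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._emit_start(tag, attrs, closer="/>")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def strip_data_parsoid(html):
    """Return html with data-parsoid removed, including in embedded HTML."""
    parser = _Stripper()
    parser.feed(html)
    parser.close()  # flushes any buffered text
    return "".join(parser.out)
```

The recursion between `strip_data_parsoid` and `_strip_in_json` is the point of the sketch: every level of embedding needs its own pass, which is exactly the bookkeeping a generic mechanism would avoid.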
This would be fixed by the rich attributes patch: T339927: Rich Attribute Support in Parsoid. By keeping the document fragment in the attribute live, the data-parsoid for the embedded content is handled the same way all the other data-parsoid is handled.
This also applies to LanguageConverter markup, which has similar inline data-parsoid. It might apply to gallery captions as well, although Arlo might have independently hacked around that.