
Ensure <meta typeof="..."> in Parser/Parsoid HTML can't be spoofed from wikitext
Open, Medium, Public

Description

The core sanitizer generally allows <meta> tags in wikitext as long as they have content and itemprop attributes. That means the following might pass intact from wikitext into HTML and then confuse the Parsoid html2wt code:

<meta typeof="mw:Annotation/translate" itemprop="foo" content="bar">
<meta property="mw:PageProp/toc" itemprop="foo" content="bar">
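
To make the concern concrete, here is a minimal sketch in Python. It assumes a deliberately simplified model of the rule described above (a <meta> is kept when it has both itemprop and content); it is not the real core Sanitizer logic, and the names `naive_allows`, `reserved_values`, and `audit` are hypothetical. It shows that both example tags satisfy the simplified rule while still carrying mw:-prefixed values.

```python
from html.parser import HTMLParser

# Simplified, hypothetical model of the rule described above: a <meta> is
# accepted when it carries both itemprop and content. This is NOT the real
# core Sanitizer logic, only an illustration of why such a rule is too loose.
REQUIRED = {"itemprop", "content"}


class MetaCollector(HTMLParser):
    """Collect the attributes of every <meta> start tag."""

    def __init__(self):
        super().__init__()
        self.metas = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag == "meta":
            self.metas.append(dict(attrs))


def naive_allows(attrs):
    """Return True if the simplified rule would keep this <meta>."""
    return REQUIRED.issubset(attrs)


def reserved_values(attrs):
    """Return (attribute, token) pairs whose token starts with 'mw:'."""
    found = []
    for name in ("typeof", "property"):
        for token in (attrs.get(name) or "").split():
            if token.lower().startswith("mw:"):
                found.append((name, token))
    return found


def audit(wikitext):
    """For each <meta>, report (allowed by simplified rule, reserved tokens)."""
    parser = MetaCollector()
    parser.feed(wikitext)
    parser.close()
    return [(naive_allows(a), reserved_values(a)) for a in parser.metas]
```

Any entry where the first element is true and the second is non-empty is a tag that the simplified rule lets through even though it uses a prefix reserved for Parsoid's own markup.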

Parsoid's tokenizer contains code to remap typeof attributes, but it's not clear that the new Annotation code uses those pathways. The page property metas use property rather than typeof, so typeof remapping alone does nothing to prevent ToC spoofing. And of course, none of the remapping done in Parsoid's copy of the Sanitizer is done in core's copy of the Sanitizer (yet); see T248211: One Sanitizer to Rule Them All and T247804: Move Sanitizer from core into Parsoid.

The MediaWiki DOM Spec briefly mentions that "User-supplied RDFa with the mw prefix is moved to a non-clashing prefix in Parsoid", but I don't think we document anywhere (except in code) exactly how that mapping is done.

So this task is to:

  • decide on a uniform attribute sanitization/remapping process to ensure that Parsoid <meta> tags aren't spoofable from wikitext content, while allowing wikitext content maximum flexibility for authoring non-conflicting <meta> tags (see T48826: Sanitizer breaks microdata).
  • implement it in both the core Sanitizer and Parsoid (or in the unified sanitizer from T248211: One Sanitizer to Rule Them All)
  • document this remapping clearly in the MediaWiki DOM Spec
  • add test cases to show that spoofing isn't possible and protect against future regressions
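
As a starting point for discussion, one possible shape for such a remapping is sketched below in Python. Everything specific here is an assumption for illustration: the attribute list in `RDFA_ATTRS`, the replacement prefix `user-mw:`, and the function names are hypothetical and are not the mapping Parsoid currently implements. The sketch rewrites mw:-prefixed tokens in RDFa-style attributes and leaves itemprop and content untouched, so microdata authoring (T48826) is not restricted.

```python
# Assumptions for illustration only: the attribute list and the replacement
# prefix are hypothetical, not the mapping Parsoid actually uses.
RDFA_ATTRS = ("typeof", "property", "rel")
RESERVED_PREFIX = "mw:"
SAFE_PREFIX = "user-mw:"  # hypothetical non-clashing prefix


def remap_token(token):
    """Move a reserved-prefix token to the non-clashing prefix."""
    if token.lower().startswith(RESERVED_PREFIX):
        return SAFE_PREFIX + token[len(RESERVED_PREFIX):]
    return token


def remap_attrs(attrs):
    """Return a copy of attrs with reserved tokens remapped.

    Only attributes listed in RDFA_ATTRS are touched. Their values are
    treated as whitespace-separated token lists, so runs of whitespace
    are collapsed to single spaces as a side effect.
    """
    out = {}
    for name, value in attrs.items():
        if name.lower() in RDFA_ATTRS and value is not None:
            value = " ".join(remap_token(t) for t in value.split())
        out[name] = value
    return out


def check_no_spoofing():
    """Regression-style checks modelled on the two examples in this task."""
    spoofed = [
        {"typeof": "mw:Annotation/translate", "itemprop": "foo", "content": "bar"},
        {"property": "mw:PageProp/toc", "itemprop": "foo", "content": "bar"},
    ]
    for attrs in spoofed:
        cleaned = remap_attrs(attrs)
        for name in RDFA_ATTRS:
            for token in (cleaned.get(name) or "").split():
                assert not token.lower().startswith(RESERVED_PREFIX)
        # Microdata attributes must survive unchanged.
        assert cleaned["itemprop"] == "foo"
        assert cleaned["content"] == "bar"


if __name__ == "__main__":
    check_no_spoofing()
```

Matching the prefix case-insensitively and per token is a design choice in this sketch: it covers values such as `MW:PageProp/toc` and multi-token attributes, and applying the function twice gives the same result because the remapped prefix no longer starts with `mw:`. Whether the real implementation should remap, strip, or reject such tokens is exactly the decision this task asks for.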

See comments on https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/702996 for some places which likely need attention.
