We seem to have a number of slightly-different sanitizers floating around:
- There's Sanitizer.php in core
- There's an almost-identical-but-not-quite version of this in Parsoid as well (T247804: Move Sanitizer from core into Parsoid will tackle this)
- VisualEditor uses DOMPurify for something -- possibly paste fix-up?
- Machine Translation uses DOMPurify: https://gerrit.wikimedia.org/r/363156
- And the mobile apps team wants to use DOMPurify as well: https://github.com/fgnass/domino/issues/102
I'd like to better understand the different use cases here and, assuming we're all trying to do the same thing, try to come up with one or two implementations we can all agree on. I'd like to avoid divergence and corner-case bugs where some sanitizers let things through which others don't; in the worst case that could result in security issues. Further, we periodically allow new 'safe' attributes through into wikitext, as in T247910: MediaWiki should allow setting tabindex="0" on elements in wikitext. We need to determine the appropriate mechanism for keeping the various sanitizers in sync when that happens.
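
To make the "keeping in sync" concern concrete, here is a minimal sketch of one possible mechanism: a single shared attribute allowlist, plus a check that reports where sanitizers disagree. Everything in it is an assumption for illustration. The names `SHARED_ALLOWED_ATTRS`, `make_filter` and `divergences` are hypothetical, the allowlist is not the real one from Sanitizer.php, and the toy filters only stand in for the actual sanitizers (they look at attribute names only, not values or tags).

```python
from typing import Callable, Dict, FrozenSet

Attrs = Dict[str, str]
AttrFilter = Callable[[Attrs], Attrs]

# Hypothetical shared allowlist; NOT the real list from Sanitizer.php.
SHARED_ALLOWED_ATTRS: FrozenSet[str] = frozenset(
    {"class", "id", "title", "tabindex"}
)


def make_filter(allowed: FrozenSet[str]) -> AttrFilter:
    """Build a toy sanitizer that keeps only attributes named in `allowed`."""
    def apply(attrs: Attrs) -> Attrs:
        return {k: v for k, v in attrs.items() if k.lower() in allowed}
    return apply


def divergences(
    filters: Dict[str, AttrFilter], attrs: Attrs
) -> Dict[str, FrozenSet[str]]:
    """For each filter, list attributes it keeps that some other filter drops."""
    kept = {name: frozenset(f(attrs)) for name, f in filters.items()}
    common = frozenset.intersection(*kept.values())
    return {name: keys - common for name, keys in kept.items() if keys - common}


if __name__ == "__main__":
    # One sanitizer picked up the new attribute, the other was not updated.
    up_to_date = make_filter(SHARED_ALLOWED_ATTRS)
    stale = make_filter(SHARED_ALLOWED_ATTRS - {"tabindex"})
    sample = {"class": "x", "tabindex": "0", "onclick": "alert(1)"}
    print(divergences({"up_to_date": up_to_date, "stale": stale}, sample))
```

In this sketch the two filters agree on dropping `onclick` but disagree on `tabindex`, which is the kind of drift a shared list (or a shared test corpus run against every sanitizer) would be meant to catch.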