Page MenuHomePhabricator

Hooks needed in Sanitizer to control acceptable tags and attributes
Open, LowPublicFeature


Author: wilson.jim.r

There is currently no way in MediaWiki to alter the list of acceptable HTML tags or attributes thereto. This information is hard coded into Sanitizer.php, and loaded into static variables - making them even harder to get to and modify.

I propose new hooks to allow extensions to modify the lists of acceptable HTML tags and attributes, or at least provide alternate lists in certain circumstances.

Any solution would also have to be sensitive to the need for extensions to be able to conditionally allow or disallow certain tags or attributes based on the context of the parse. Meaning that it is not sufficient to merely provide a window into the initial whitelist creation, but then continue to store said lists statically (unless the static allocation will be revisited as a result of extension execution).

A tentative patch will be supplied shortly.

Version: unspecified
Severity: enhancement



Event Timeline

bzimport raised the priority of this task from to Low.Nov 21 2014, 9:52 PM
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz10180.
bzimport added a subscriber: Unknown Object (MLST).

wilson.jim.r wrote:

first-draft at patch to satisfy requirements


ayg wrote:

Well, statically modifying the list would be a simple solution to a very common request (how do I allow tag X?). It might be easiest to just make them globals, from that perspective, like other config options. But it's pretty ugly to encourage extensions to change around globals dynamically, whether they're in the form of actual globals or static variables, and your approach (a hook run once) does avoid that.

Alternatively, we could perhaps convert the Sanitizer into an actual class of objects, not just a namespace of functions, and so allow per-instance modification of the attributes via mutators. That would, of course, require some rearchitecturing and bunches of reverse-compatibility stuff.

wilson.jim.r wrote:

Thanks for taking a look, Simetrical.

Yeah - my first thought was also to create new globals, but then I realized that it wouldn't allow extensions to _easily_ modify the Sanitizer on a per-request basis.

Also, just a note, since my first hook passes the &$staticInitialized var as one of the hook args, it's possible to have an extension re-set this to false so that the hook gets run every time the Sanitizer is invoked (allowing the same or other extensions to apply different "what tag is allowed" in a contextual manner).

Reforamtting the Sanitizer may be a good idea - it would certainly help if more information about the Parser or ParserOptions were available to the Sanitizer at runtime (that way there'd be less hackery necessary to determine the "context" of the parse). However, as you note, this is much more work and requires backwards compatibility tests.

Soroush83 wrote:

(In reply to comment #1)

Created an attachment (id=3730) [details]
first-draft at patch to satisfy requirements

Looks nicer than global variables for html tags and attributes. I think such change in sanitizer.php is required.

theevilipaddress wrote:

I can completely double the request. I wanted to create some very simple like:

$htmlpairsStatic[] = 'address';
$whitelist['address'] = $common;

to allow the <address> tag in MediaWiki, but this seems to be very difficult.

*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*

sumanah wrote:

Jim, I'm adding the "reviewed" keyword since you did get a review on your patch from Aryeh.

Do people still have the need that this patch would address, or have we addressed it in subsequent versions of MediaWiki?

Also, If you're interested in seeing the work that Gabriel and Brion are doing, rewriting the parser, check out the parser mailing list:

Aklapper changed the subtype of this task from "Task" to "Feature Request".Feb 4 2022, 11:01 AM
Aklapper removed a subscriber: wikibugs-l-list.