Page MenuHomePhabricator

Removing inline CSS/JS from MediaWiki
Closed, DeclinedPublic

Description

One of the future security goals of MediaWiki is to implement Content Security Policy. This is an HTTP header that disallows inline JavaScript and CSS as well as scripts and styles from disallowed domains. One of the big steps to achieving this is to remove all inline CSS and JavaScript from MediaWiki HTML. Some of the places inline scripting/styling is used:

  • Inline styling in wikitext is translated to inline styling in HTML
  • ResourceLoader is mostly good, but the loader script (at the top and bottom of page) is inline JavaScript
  • Data such as user preferences and ResourceLoader config variables is embedded into the HTML as inline JSON, when it should be in HTML attributes
  • Many extensions use inline styling rather than ResourceLoader modules

Fixing all of these inline scripts and styles is too big a task for a single mentor program. However, working on one or two, and slowly chipping down on the inline JS and CSS can help to move closer toward the final goal. This project obviously requires, at the very least, basic HTML and JavaScript knowledge, but some parts are more difficult than others. For example, bullet points 2 and 3 require only basic MediaWiki knowledge, but bullet point 1 requires altering the Parser class, and thus mandates a deeper understanding of MediaWiki and how it parses wikitext.

Event Timeline

Niharika raised the priority of this task from to Needs Triage.
Niharika updated the task description. (Show Details)
Niharika added subscribers: Niharika, Aklapper.
  • Inline styling in wikitext is translated to inline styling in HTML

See also mw:Requests for comment/Deprecating inline styles and mw:Requests for comment/Allow styling in templates. This doesn't seem as simple as you're making it out to be.

I don't think this is suitable for a GSoC project.

Thank you for the quick feedback. The system works! This proposal was listed at https://www.mediawiki.org/wiki/Outreach_programs/Possible_projects#Removing_inline_CSS.2FJS_from_MediaWiki since... we don't know. Here it hasn't lasted more than a few hours.

I removed it from Possible-Tech-Projects.

I thought about suggesting to implement CSP some time ago and came to the conclusion that it will break lots of userscripts, but don't offer any improved security. The basic problem is: Which sources do you allow for scripts?

There are userscripts that have to call eval, so you have to allow unsafe-eval unless you want to break them.

There are also userscripts that fetch JSONP from various servers (e.g. I have a script that fetches data from viaf.org). In order to work correctly all these domains would have to be whitelisted.

On the other hand the basic idea behind CSP is, that you can trust the resources on your own server, which is just wrong on a publicly writable wiki: I could put the source of an evil script on some subpage in my userspace and call it via action=raw, and CSP just won't be able to prevent that.

So, even the most restrictive policy won't be able to block evil scripts, and even the most liberal policy will probably break perfectly valid userscripts.

The last time I compiled the numbers, over half of the xss vectors we had didn't allow arbitrary html injection, they were limited to adding event handlers. CSP would prevent all of those from being an issue. Along with blocking all of the times that (non-malicious, but naive) admins add facebook like buttons to their wiki.

For userscripts, allowing users to turn off CSP if they feel the benefits aren't worth their scripts breaking can be allowed.

I think CSP can be done in a way that provides benefit to most users and isn't going to cause problems for the rest.

Inline styling in wikitext is translated to inline styling in HTML

I think you need to run that past the various communities and not just the techy-types who watch for RFCs on MediaWiki wiki. I can already hear the cries for anger from editors on en.wp who would no longer be able to make pretty signatures, much less all the more-legitimate concerns provided on the RFC linked by Anomie.

ResourceLoader currently requires an inline script. And the way MediaWiki uses ResourceLoader, also adds embeds one stylesheet (depending on user settings).

The embedded stylesheet is the user.prefs module. This was cleaned up by me in eddbcab0 to no longer be required by default. But depending on user settings, will still be added to the page. It can probably be refactored to be served from a separate request instead, though this is non-trivial since we currently enforce external requests to be unauthenticated, and this module requires some user preferences that we'd have to make public. May need to figure a different way (or deprecate those prefs; or re-implement in a different way, e.g. with class names and an optional extra module that handles all possible values).

The inline scripts are of three types:

  • Load queue: List of module names to load on this page.
  • Page configuration: Map of configuration values specific to this page.
  • User data: Map of (private) user meta-data.
  • User data; Map of (private) authentication tokens.

None of these execute any "real" code. They just enqueue a function into RLQ, since it's triggered asynchronously from the startup script. The load queue could be trivially refactored to be in the DOM instead (e.g. an attribute somewhere). The others are harder to tuck away without bloating HTML and making access more expensive. We could use a non-javascript <script> that contains JSON that we parse on-demand. E.g. <script type="text/x-mw-rlconfig"> or some such.

(Putting back accessibility, since inline/obtrusive CSS/JS IS an accessibility issue.)

For the record:

https://www.w3.org/TR/CSP/#changes-from-level-1 says:

Individual inline scripts and stylesheets may be whitelisted via nonces (as described in §4.2.4 Valid Nonces) and hashes (as described in §4.2.5 Valid Hashes).

Bawolff subscribed.

From my perspective, I don't believe that inline styles should be banned as part of CSP. The security benefits (Which are very low. Mostly around preventing data-exfriltration) are not worth the user inconvienance.

The existing usages of inline styles in RL can be worked around (using the hash method). Other things should not use inline js, but very few other things do, and those could be dealt with once we are closer to having CSP support (which is very far away because nobody is working on it).

With that in mind, I'm going to mark this task declined.

The existing usages of inline styles in RL can be worked around (using the hash method).

TemplateStyles too, since it was decided that inline styles were the way to go there.