Page MenuHomePhabricator

XSS via automatic infusion of OOjs UI widgets and specially prepared wikitext
Closed, ResolvedPublic

Description

Paste the following into a wiki page:

<div data-ooui='{"_":"ButtonWidget", "href":"javascript:alert()", "label":"Click me"}'>

It will create a clickable button that executes the given JavaScript.

Event Timeline

matmarex raised the priority of this task from to Needs Triage.
matmarex updated the task description. (Show Details)
matmarex changed the visibility from "Public (No Login Required)" to "Custom Policy".
matmarex changed the edit policy from "All Users" to "Custom Policy".
matmarex changed Security from None to Software security bug.
matmarex added subscribers: matmarex, Legoktm, cscott.

(mid-air collision)

Right now we should probably just disable the automatic infusion thing. not much relies on it yet, and it's supposed to be progressive enhancement anyway; it's in resources/src/mediawiki.page/mediawiki.page.ready.js.

I came upon this after noticing that 2be338cc99a9e8540f5be9c664fd40cc29654d9d allowed instantiation of arbitrary JS classes from wikitext (which is also bad, but probably not as bad as this, and not in production right now).

Things that we could do:

  • Disallow javascript: URLs in client-side code, like we already do in server-side code
  • Disallow data-ooui attribute in wikitext?
  • Disallow instantiation of arbitrary JS classes, should at least check if it's a OO.ui.Element and possibly use a whitelist

We should *not* automatically infuse all infusable-looking content. Why was that ever a good idea?

Infusion should be opt-in, and not all clients will want to infuse *everything* right away.

That said, there will probably be security issues like this in the future. For example, if I infuse #collection-widget on a Special page for the Collection extension, somebody may figure out how to insert a <div id=collection-widget> into a site notice or message or something. We should probably use a belt and suspenders here, and add data-ooui scrubbing to the Sanitizer.

matmarex renamed this task from XSS via automatic infusion and specially prepared wikitext to XSS via automatic infusion of OOjs UI widgets and specially prepared wikitext.Jul 9 2015, 11:00 PM

Things that we could do:

  • Disallow javascript: URLs in client-side code, like we already do in server-side code
  • Disallow data-ooui attribute in wikitext?
  • Disallow instantiation of arbitrary JS classes, should at least check if it's a OO.ui.Element and possibly use a whitelist

Probably all of these.

looks good to me for preventing data-ooui attributes from wikitext to start.

looks good to me for preventing data-ooui attributes from wikitext to start.

Eventually, I think that should be a configurable blacklist of data- elements to not allow. But for a security fix, I think it's ok as is.

This is the second example of xss via data attributes i can remember (there was a similar issue in tmh). Maybe we should have a new convention that all data usage not meant to be done by user should be data-mw-foo, and then blacklist data-mw- prefix.

looks good to me for preventing data-ooui attributes from wikitext to start.

Now deployed.

Since this problem was introduced very recently and is not in any release version, I think we should just commit the (already deployed) fix to master, and write and review the belt-and-suspenders patches (T105413#1443427) in the open. Not sure if we send any announcements for security bugs that only affected master for a few days?

Since this problem was introduced very recently and is not in any release version, I think we should just commit the (already deployed) fix to master, and write and review the belt-and-suspenders patches (T105413#1443427) in the open. Not sure if we send any announcements for security bugs that only affected master for a few days?

I think that's fine

  1. Disallow javascript: URLs in client-side code, like we already do in server-side code
  2. Disallow data-ooui attribute in wikitext?
  3. Disallow instantiation of arbitrary JS classes, should at least check if it's a OO.ui.Element and possibly use a whitelist

#1 I'm in favor of. The intended design of OOjs means there's no reason to use an href instead of an onclick handler.

#2 also seems reasonable, although @Bawolff's suggestion seems fine too. In Parsoid we do a transformation on content to protect our data-mw, data-parsoid, about and typeof attributes from being stomped on in content. (We prefix the attribute with data-x-, so if an attacker tried to add a data-mw attribute it would show up in rendered content as data-x-data-mw; we then strip than when serializing to wikitext). As a longer-term plan, standardizing on a scheme of that form might allow safe uses of data attributes without playing whack-a-mole.

#3 is currently posted as https://gerrit.wikimedia.org/r/224112 but I don't really see the point of it. It just makes it harder for third parties to use OOjs and write their own widgets without hacking our upstream whitelist. If you wanted to enforce this in the mediawiki-specific widgets library that would be one thing, but it seems out of place in a general purpose widget framework. That said, restricting infusion to subclasses of OO.ui.Element is less intrusive than a full whitelist. I don't have any problem with that version of #3 (and thus with the patch that @Legoktm posted), but I don't think it accomplishes much, either. It doesn't prevent the attack in this task's description.

This is the second example of xss via data attributes i can remember (there was a similar issue in tmh). Maybe we should have a new convention that all data usage not meant to be done by user should be data-mw-foo, and then blacklist data-mw- prefix.

I think we should do the opposite by whitelisting instead of blacklisting. All data for the user should be prefixed with "data-mw-" (or "data-user-") and only that prefix is whitelisted in wikitext. That way when we use upstream libraries (like OOUI!) we're still safe.

Legoktm changed the visibility from "Custom Policy" to "Public (No Login Required)".Jul 10 2015, 8:28 PM
Legoktm changed the edit policy from "Custom Policy" to "All Users".
Legoktm changed Security from Software security bug to None.

Change 224178 had a related patch set uploaded (by Legoktm):
SECURITY: Do not allow data-ooui attributes in wikitext

https://gerrit.wikimedia.org/r/224178

Change 224178 merged by jenkins-bot:
SECURITY: Do not allow data-ooui attributes in wikitext

https://gerrit.wikimedia.org/r/224178

This is the second example of xss via data attributes i can remember (there was a similar issue in tmh). Maybe we should have a new convention that all data usage not meant to be done by user should be data-mw-foo, and then blacklist data-mw- prefix.

I think we should do the opposite by whitelisting instead of blacklisting. All data for the user should be prefixed with "data-mw-" (or "data-user-") and only that prefix is whitelisted in wikitext. That way when we use upstream libraries (like OOUI!) we're still safe.

Parsoid prefixes all user data attributes (and some other attributes) with data-x-. I'm not sure we should whitelist any data attributes; most potential uses can still work fine if the name is tweaked slightly. I'd like it if PHP and Parsoid could come to agreement here.

Parsoid prefixes all user data attributes (and some other attributes) with data-x-. I'm not sure we should whitelist any data attributes; most potential uses can still work fine if the name is tweaked slightly. I'd like it if PHP and Parsoid could come to agreement here.

Ah, so you're suggesting that we just modify the data attribute name?

Parsoid prefixes all user data attributes (and some other attributes) with data-x-. I'm not sure we should whitelist any data attributes; most potential uses can still work fine if the name is tweaked slightly. I'd like it if PHP and Parsoid could come to agreement here.

@cscott, you're saying that when doing wikitext->html, convert the wikitext <div data-foo="bar"> to the html <div data-x-foo="bar">?

@Legoktm Yes. It lets people still use data attributes, just ensures they won't conflict with internal/trusted attribute names.

But I'd be interested in hearing from actual users of data attributes on MW wikis, if there are any.

It's possible if there are third party libraries which rely on data attributes, we could set up a precedent for configuring them with a specific "data attribute prefix", so they could still work on mediawiki content if configured correctly. At least that's the basic idea for a "preserve but escape" strategy.

@csteipp yes, more or less:

$ echo '<div about="foo" data-parsoid="bar" data-x-data-parsoid="bat"></div>' | tests/parse.js
...
<div data-x-about="foo" data-x-data-parsoid="bar" data-x-data-x-data-parsoid="bat" data-parsoid='{"stx":"html","dsr":[0,68,62,6]}'></div>
...

You can round-trip this, which is obviously of interest to Parsoid.

We don't currently escape *all* data attributes, just the ones Parsoid uses, but we could change this (especially if mw core agrees).

We have libraries in core that rely on data- attributes being passed through by the Parser, jquery.tablesorter is a big one (it uses data-sort-value and data-sort-type). Not sure what others, grepping for .data( ' and 'data- yields tons of stuff, but many are used in different context (either output by special pages, or used internally within the scripts only).

@matmarex right, that's why I was saying we could configure out 3rd-party libraries to use a prefix on the data attribute. If it's just tablesorter, this shouldn't be hard. Note that jquery's .data() doesn't actually use the data attribute, so that's irrelevant. And stuff output by special pages is fine, this is just stuff output by untrusted authored content.

Oh, and I should add that currently Parsoid escapes only data-parsoid, data-mw and data-x* (and a bunch of other semantic attributes), see https://github.com/wikimedia/parsoid/blob/master/lib/pegTokenizer.pegjs.txt#L1250-L1255

I believe that we are not escaping *all* data attributes currently precisely because of tablesorter et al.

So I am proposing two options:

  1. Core PHP adopts the "escape instead of remove" blacklist technique of Parsoid. The initial escape/blacklist would be data-x-*, data-parsoid, data-mw, and data-ooui, presumably. Or else....
  2. We change both Parsoid and PHP to be "secure by default" and escape all data attributes except a small whitelist (data-sort-* and other specific use cases, if/as we find them). We then work with upstream on some sort of "data prefix" patch so that we can eventually configure jquery.tablesorter to use a prefix and drive the whitelist size to zero.

Note that jquery's .data() doesn't actually use the data attribute, so that's irrelevant.

It does. It only reads from it, doesn't write to it, though. $( '<div data-foo="bar">' ).data( 'foo' ) === "bar".

We then work with upstream on some sort of "data prefix" patch so that we can eventually configure jquery.tablesorter to use a prefix and drive the whitelist size to zero.

For tablesorter, we are the upstream, actually.

Eh… I'm not sure about "escaping" data- attributes. They are supposed to be safe, it's just a bug in our JS that we don't properly treat them as user-provided content. It's like if we had a script that eval()'d the title attribute, would we disallow titles?

I think the problem is we went a bit overboard with the automatic infusion. Maybe we should limit it to descendants of <form> elements (which can't be generated from wikitext, barring extensions like InputBox)? Or only disallow one specific attribute, say data-mw, since then we'd be able to assume that anything inside a element with this attribute is safe (not user-provided)?

(By the way, this bug is pretty similar to T68608, I think. Having a way to mark elements as trusted/untrusted would have also helped there, if we had it at the time.)

Change 224202 had a related patch set uploaded (by Legoktm):
ButtonWidget: Check protocol of 'href' client-side against a whitelist

https://gerrit.wikimedia.org/r/224202

Or only disallow one specific attribute, say data-mw, since then we'd be able to assume that anything inside a element with this attribute is safe (not user-provided)?

Sounds easier?

csteipp added a project: Vuln-XSS.

Change 224202 merged by jenkins-bot:
Add OO.ui.isSafeUrl() to make sure url targets are safe client-side

https://gerrit.wikimedia.org/r/224202

Things that we could do:

  • Disallow javascript: URLs in client-side code, like we already do in server-side code

rGOJU4d8e37548388: Add OO.ui.isSafeUrl() to make sure url targets are safe client-side

  • Disallow data-ooui attribute in wikitext?

rMWaa9a52da42da: SECURITY: Do not allow data-ooui attributes in wikitext

  • Disallow instantiation of arbitrary JS classes, should at least check if it's a OO.ui.Element and possibly use a whitelist

rGOJU264b44b44fcb: Only allow construction of classes that extend OO.ui.Element in infusion

Anything else we want to do before marking this as resolved?

Anything else we want to do before marking this as resolved?

This all looks good from my perspective. @matmarex, anything else you were planning on doing here?

I was hoping that T107623 would get resolved in a timely manner, but it did not, so meh.

We should probably add a patch similar to rMWaa9a52da42da: SECURITY: Do not allow data-ooui attributes in wikitext to Parsoid, or else there might be a similar attack vector when VisualEditor is used to edit the page.

Change 240784 had a related patch set uploaded (by Cscott):
Do not allow data-ooui attributes in wikitext

https://gerrit.wikimedia.org/r/240784

Change 240784 merged by jenkins-bot:
Do not allow data-ooui attributes in wikitext

https://gerrit.wikimedia.org/r/240784

I think this can be closed and publicized now?

Oh, heh, it is public. Let's just do it then.