Page MenuHomePhabricator

CVE-2021-30458: Parsoid comment fostering allows for inserting mostly arbitrary <meta> tags
Closed, ResolvedPublicSecurity

Description

While treebuilding, Parsoid converts some nodes to html comments with json encoded values to try to prevent them from being fostered. However, other comments on the page, present in the input, may contain valid json (see T279182). If that json happened to have the expected property ("@type": "mw:..."), we might run into some trouble. For a similar reason, the tokenizer already protects data attributes on dom nodes from similar conflicts with data-mw/data-parsoid/etc

See WTUtils::fosterCommentData / WTUtils::reinsertFosterableContent

Introduced in 0d6c4aaf4e3621d576c7a15a84b243af71af793b (all released Parsoid versions).

Event Timeline

Arlolra moved this task from Needs Triage to Bugs & Crashers on the Parsoid board.
Arlolra updated the task description. (Show Details)
Legoktm raised the priority of this task from Low to Needs Triage.Apr 6 2021, 7:28 PM
Legoktm set Security to Software security bug.
Legoktm added projects: Security, Security-Team.
Legoktm changed the visibility from "Public (No Login Required)" to "Custom Policy".
Legoktm changed the subtype of this task from "Task" to "Security Issue".
Legoktm added a project: Vuln-XSS.
Legoktm subscribed.

@Arlolra said this might be exploitable as an XSS since it runs post-sanitization.

So it seems like you can only create <meta> tags that will end up in the body.

First tried: <!--{"@type":"mw:pwnd", "attrs": [["http-equiv", "refresh"],["content","1;url=https://example.org/"]]}-->, but Parsoid drops http-equiv because of the dash.

Then tried setting a referrer policy and ended up with <meta name="referrer" content="unsafe&amp;#x2D;URL" id="mwAg"/> because the dash again gets encoded.

I was able to create <meta property="og:title" content="some vandalism" id="mwAg"/> but haven't checked whether the OpenGraph spec allows those tags in the body.

So you can definitely inject mostly-arbitrary <meta> tags into generated HTML, just need to find a tag that does nefarious things to make it exploitable...

For testing:

curl -X POST "https://en.wikipedia.org/api/rest_v1/transform/wikitext/to/html" -H  "accept: text/html; charset=utf-8; profile="https://www.mediawiki.org/wiki/Specs/HTML/2.1.0"" -H  "Content-Type: multipart/form-data" -F 'wikitext=<!--{"@type":"mw:pwnd", "attrs": [["property", "og:title"],["content","some vandalism"]]}-->'

but Parsoid drops http-equiv because of the dash.

The contents of the comment will be encoded with WTUtils::encodeComment

Maybe worth hiding https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/677297/1#message-b9eae79b4f5b009c0aea6c24a01df33204c41406 and arlo's followup comments? Or does it just draw more attention to this?

I conferred with some other Security-Team members and we believe the disclosure within the gerrit comments is likely low-risk, especially since this bug is now protected.

Just so we track it all in one place, couple of notes. Besides a production deploy, we'll need to do the following as well.

  1. Issue a Parsoid/JS security release (nodejs v10 is EOL April 2021, but Parsoid/JS is technically not EOL till MW 1.31 LTS goes EOL). So, we need to do a security release on top of the 0.11.0 deb package.
  2. Issue a security release for MW 1.35 (parsoid/php).

Thanks! I know this is a WIP start, but instead of throwing an exception, dirty the comment (which will show up as a one-off dirty diff on edits, but it is probably okay), with some prefix (even reuse the data-x prefix if you want). As a later fix, we can try to prevent the dirty-diff. Throwing an exception will probably return a http 500 / 403 / ... right? In this case, we can just bypass the spoofing and render everything else just fine.

dirty the comment (which will show up as a one-off dirty diff on edits, but it is probably okay), with some prefix (even reuse the data-x prefix if you want)

It was my understanding that we can't change the comment length, less we mess up dsr computation (see WTUtils::decodedCommentLength() et al). We could do something like s/@type/@hype/ though, which I thought was meh.

Is there a benefit to throwing an exception and being notified of attempts at abuse? Whether intentional or not.

dirty the comment (which will show up as a one-off dirty diff on edits, but it is probably okay), with some prefix (even reuse the data-x prefix if you want)

It was my understanding that we can't change the comment length, less we mess up dsr computation (see WTUtils::decodedCommentLength() et al). We could do something like s/@type/@hype/ though, which I thought was meh.

True reg. DSR. So, yes, maybe 's/@type/abuse/'? or some such.

Is there a benefit to throwing an exception and being notified of attempts at abuse? Whether intentional or not.

That is a good thought about being notified. You can still log an entry in logstash that will flag it for review without blocking a render.

@cscott's suggestion is that since the comments coming from source get encoded, we can change the key the trick is using to something not found after the encoding so we'd never have a conflict, rather than munging the comment values and creating diffs.

New patch implements the suggestion from T279451#6979972

The switch from s/@type/-type/ is obscured in this slightly larger, related change.

That second patch looks good to me. Confirmed by reading WTUtils::encodeComment that [->&] in wikitext will be entity-encoded in the HTML comment value, so any attempt to write -type in wikitext will appear in HTML as &#x2D;type, which JSON decoding will interpret literally (not entity-decoded) and therefore user-generated content should not be able to masquerade as a fostered element. Note that since we don't encoding the comment containing the fostered content (this isn't changed with your patch, it's a pre-existing bug), it's possible the internal DOM will contain unserializable HTML -- that is, fostered content could contain a --> in such a way that the internal HTML comment in DOM also contains the string --> which would break the comment if the DOM was deserialized and reserialized. I *think* that's okay, but it makes me a little nervous -- HTML content can get embedded in data-mw for captions, etc. We need to be sure that any "fostered content" comments are dealt with before that serialization into data-mw occurs. I'd be a little more comfortable if we always encodeCommented the *value* of the -type key, which would ensure that --> could never appear and break the comment.

this isn't changed with your patch, it's a pre-existing bug

It's by design. The lifetime of these comments is that they're created during treebuilding and then in the first pass on the dom (preparedom) they are converted back to meta tags.

I broke removing the tunnelling option out into a separate patch,
https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/677592

And the patch is now pared down to,

@cscott, are you a +1 on that patch as well? I assume yes based on T279451#6980614 but an explicit nod for the latest version would be helpful.

So, how do we want (a) roll this out into production (b) get this into the security release planned for release this week (c) get this into a parsoid/js security release?

@Reedy, @sbassett your input would be helpful here for (a) and (b) especially.

So, how do we want (a) roll this out into production (b) get this into the security release planned for release this week (c) get this into a parsoid/js security release?

@Reedy, @sbassett your input would be helpful here for (a) and (b) especially.

For (a), I'm not familiar with the services/parsoid production deployment process, but security patch deployments should likely follow our current mediawiki core/ext/skins security patch deployment process. To wit: upload the patch to the relevant deployment server, apply the patch to a staging repo via git am, test on the mwdebugs if possible and then deploy affected files and watch the logs.

For (b), I believe the proper security release (for core and bundled extensions, tracked at T270458) is due out tomorrow, April 8th, 2021. I'd defer to @Reedy on how difficult it might be to include this issue within that release at this point. We could alternatively include this issue with the supplemental announcement I usually put together for non-core, non-production, non-bundled extensions, skins, etc: T270466, likely due to be released early next week. A completely separate announcement could also be made, though I'd personally prefer to avoid that option if possible.

So, how do we want (a) roll this out into production (b) get this into the security release planned for release this week (c) get this into a parsoid/js security release?

@Reedy, @sbassett your input would be helpful here for (a) and (b) especially.

For (a), I'm not familiar with the services/parsoid production deployment process, but security patch deployments should likely follow our current mediawiki core/ext/skins security patch deployment process. To wit: upload the patch to the relevant deployment server, apply the patch to a staging repo via git am, test on the mwdebugs if possible and then deploy affected files and watch the logs.

Specifically parsoid is deployed via mediawiki/vendor, so the patch should be made against that repository.

For (b), I believe the proper security release (for core and bundled extensions, tracked at T270458) is due out tomorrow, April 8th, 2021. I'd defer to @Reedy on how difficult it might be to include this issue within that release at this point.

Parsoid is bundled in the tarball via mediawiki/vendor, so the fix should go in the normal tarball release. It's a little tricky because we also need to push a new composer release/bump the version in composer.json for people installing Parsoid that way. If we're OK with it (given this seems to be a difficult to maliciously exploit XSS), I think we do the Parsoid composer release an hourish before the tarball release, bump mediawiki/vendor publicly for master and REL1_35. Alternatively, we use a patch for mediawiki/vendor, ship that in the tarball and in git, and then do the Parsoid release after all the security stuff is over.

We could alternatively include this issue with the supplemental announcement I usually put together for non-core, non-production, non-bundled extensions, skins, etc: T270466, likely due to be released early next week. A completely separate announcement could also be made, though I'd personally prefer to avoid that option if possible.

This probably makes sense for Parsoid/JS if we're OK with a gap in between the disclosure with Parsoid/PHP and the fix in Parsoid/JS.

Can we add a line to force $decode = false in WTUtils::reinsertFosterableContent ? I know it's not strictly necessary, since it is set to false in the one place that method is called, but it would be a bit of belt-and-suspenders. I'm +1. I assume there was an explicit discussion w/r/t not being subtle about the disclosure? The latest patch draws a big arrow pointed at the vulnerability. On the other hand, the main exploit path for <meta> tags seems to be http-equiv, and we're protected from that route by the same wikitext comment encoding mechanism.

And just for completeness, this is actually the line which protects -type:

Can we add a line to force $decode = false in WTUtils::reinsertFosterableContent ? I know it's not strictly necessary, since it is set to false in the one place that method is called, but it would be a bit of belt-and-suspenders.

This patch that's already been merged removes decoding in that function altogether,
https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/677592/1/src/Utils/WTUtils.php

And just for completeness, this is actually the line which protects -type:

Ok, added to the patch

Ok, sorry, I was a bit behind. So yes, I'm +1 to the latest patch, and my understanding is that we're going to (a) merge it in master, (b) cherry-pick that change to REL1_35 and thereby include it in the upcoming security release, (c) possibly cherry-pick the patch to wmf.37 and SWAT it, and (d) port this patch to Parsoid/JS and do a release bump of the NPM package as well. Yes?

I've added links back to this task in the php comments in this version, in case we don't remember what those warnings are talking about,

We can add the $decode = false; to the backport patch for REL1_35 when that's created. I'm going to work on the Parsoid/JS patch next.

@Arlolra can you work with @Legoktm to get this applied ASAP in production?

@subbu and I are going to work out a schedule with @Reedy for tomorrow's security release. Assuming this is fixed in production before tomorrow, I'll upload the patch to Parsoid master and then cherry pick to Parsoid's REL1_35 branch *no sooner than* an hour or so before the security release and tag v0.12.2 on composer for @Reedy to use. I'll also tag v0.13.0-a32 at the same time and make a mediawiki-vendor patch which we'll merge tomorrow. @subbu's going to rt test that overnight. That will also be our "weekly" release for next week, since Friday and Monday are holidays.

We can add the $decode = false; to the backport patch for REL1_35 when that's created.

Here's a patch against REL1_35 with that extra protection included,

Here's a patch against REL1_35 with that extra protection included,

@Legoktm points out that this will apply cleaner to production right now, since we haven't deployed T279451#6981308 yet, so that's what we're going with

We can add the $decode = false; to the backport patch for REL1_35 when that's created.

Here's a patch against REL1_35 with that extra protection included,

I applied this against mediawiki/vendor wmf/php-1.36.0-wmf.38:

patching file src/Utils/WTUtils.php
Hunk #1 succeeded at 725 (offset 25 lines).
Hunk #2 succeeded at 791 (offset 25 lines).
Hunk #3 succeeded at 814 with fuzz 2 (offset 26 lines).
patching file src/Wt2Html/Grammar.pegphp
Hunk #1 succeeded at 677 (offset 20 lines).
patching file src/Wt2Html/Grammar.php
Hunk #1 succeeded at 562 with fuzz 2 (offset 25 lines).

Please verify this patch looks good (also prefixed patch with [SECURITY]):

Please verify this patch looks good (also prefixed patch with [SECURITY]):

LGTM

Deployed it to wmf.37 and wmf.38 with Arlo's assistance. Before syncing it to the full cluster I pulled it to just wtp1025 and then ran curl tests against it:

$ curl -H 'Host: test.wikipedia.org' "http://wtp1025.eqiad.wmnet/w/rest.php/test.wikipedia.org/v3/transform/wikitext/to/html" -X POST -H  "accept: text/html; charset=utf-8; profile="https://www.mediawiki.org/wiki/Specs/HTML/2.1.0"" -H  "Content-Type: multipart/form-data" -F wikitext='\<!--{"@type":"mw:pwnd", "attrs": [["property", "og:title"],["content","some vandalism"]]}-->'
<!--{"@type":"mw:pwnd", "attrs": [["property", "og:title"],["content","some vandalism"]]}-->
$ curl -H 'Host: test.wikipedia.org' "http://wtp1025.eqiad.wmnet/w/rest.php/test.wikipedia.org/v3/transform/wikitext/to/html" -X POST -H  "accept: text/html; charset=utf-8; profile="https://www.mediawiki.org/wiki/Specs/HTML/2.1.0"" -H  "Content-Type: multipart/form-data" -F wikitext='\<!--{"-type":"mw:pwnd", "attrs": [["property", "og:title"],["content","some vandalism"]]}-->'
<!--{"&#x2D;type":"mw:pwnd", "attrs": [["property", "og:title"],["content","some vandalism"]]}-->

And then tried against wtp1026, which did not yet have the code and could exploit it and generate a <meta> tag. It should be deployed everywhere now.

I'm going to work on the Parsoid/JS patch next.

Here's a Parsoid/JS patch against v0.11.0

Legoktm renamed this task from Protect against comments conflicting with the foster comment strategy to Parsoid comment fostering allows for inserting mostly arbitrary <meta> tags.Apr 7 2021, 11:06 PM
Legoktm triaged this task as High priority.
Legoktm updated the task description. (Show Details)

I'm going to work on the Parsoid/JS patch next.

Here's a Parsoid/JS patch against v0.11.0

+1 .. Are you preparing a deb or do you want me to?

Legoktm renamed this task from Parsoid comment fostering allows for inserting mostly arbitrary <meta> tags to CVE-2021-30458: Parsoid comment fostering allows for inserting mostly arbitrary <meta> tags.Apr 8 2021, 3:07 AM
Reedy changed the visibility from "Custom Policy" to "Public (No Login Required)".

For reference, a 0.11.1 deb was released on releases.wikimedia.org and on npm: https://www.npmjs.com/package/parsoid

I also submitted this for the PHP security-advisories database: https://github.com/FriendsOfPHP/security-advisories/pull/549