Page MenuHomePhabricator

CVE-2023-37304: XSS in DoubleWiki extension (Wikisource)
Closed, ResolvedPublicSecurity

Description

Steps to reproduce:

  1. Install mediawiki/extensions/DoubleWiki
  2. Create a page with the following content:
hello

<div id="align-fr" style="display:none;"><pre>
hello = hello" onclick="alert('XSS')
</pre></div>

[[fr:test]]
  1. After saving, add ?match=fr to the page URL
  2. Click "hello" on the page

This will pop up a JavaScript dialog.

image.png (2×3 px, 318 KB)

Explanation:

DoubleWiki allows displaying pages on different projects links side-by-side. This works on any page with interwiki links on a wiki where the extension is enabled. Example: https://en.wikisource.org/wiki/Bible_(King_James)/Genesis?match=fr

Optionally their content can be aligned using the <div …><pre> markup described above. Example: https://en.wikisource.org/wiki/Eskimo_Life/Authors_Preface?match=no (see the markup in edit mode: https://en.wikisource.org/w/index.php?title=Eskimo_Life/Authors_Preface&action=edit)

This is implemented using some very indiscriminate regexp replacements, and allows inserting pieces of HTML into places they wasn't supposed to go. This bug is caused by the code in DoubleWiki::addMatchingTags() method.

The code implementing alignment has a lot of regexp and string operations on HTML. I have only reviewed a very small part of it (until I found a bug). There are some ~200 lines of code (methods addMatchingTags(), matchColumns(), findParagraphs(), findSlices()) that all look very risky. The side-by-side rendering itself (getMangledTextAndTranslation()) does some regexp replacements as well, but they look a lot less scary.

(It was brought to my attention after I read T323376#8406975 and tried to figure out what that code does.)

Event Timeline

@matmarex that does sound like a reasonable mitigation to me. If possible though, it would be nice to keep the feature and disable the javascript popups. Do you think that would be possible from what you have reviewed?

No doubt it's possible, but ensuring that there are no bugs is much more work than finding just one bug, so I did not try. Maybe we could put some sanitizer on top of the buggy code instead, but I don't know how that will interact with all other parsing code.

@matmarex sanitizer sounds like the best option

@matmarex following up to see if you had a chance to look into the sanitizer. I don't see any other maintainers for this project who could chime in

also adding platform engineering and @Tpt in case they have some insight on this issue

Sorry, I didn't get a chance to work on this.

What a solution to this issue likely involves is a bit of a rewrite of any functions within DoubleWiki.php (the vast bulk of this extension's functionality) that manipulate and/or output html, likely the list @matmarex provided within the task description. There are probably a few ways to do this:

  1. Parser->parse() - likely too expensive and overkill for this
  2. Sanitizer::removeSomeTags() - also a bit expensive, but possibly more appropriate
  3. Use HTMLPurifier or similar - a few extensions already use this as a dep (and there's literally an extension just for using it) but nothing that's been reviewed or pushed to Wikimedia production as far as I can tell
  4. Review and rewrite the problematic regular expressions - this seems like a waste of time and effort given the previous options

Anyhow, whatever option is pursued, it seems like some basic unit tests should be a part of that effort, as there currently are none. And I suppose there is one final option, though I'd imagine we'd like to avoid it: un-deploy the extension from Wikimedia production.

@sbassett thanks for looking into this....it seems like the HTMLPurifier is our best bet, but should probably be reviewed before being used on WMF production if I'm reading that correctly. I definitely think undeploying the extension would harm some translation efforts and other popular uses of this extension, so I would only consider that as a last resort.

Mstyles changed Risk Rating from N/A to High.Mar 14 2023, 6:27 PM
Mstyles moved this task from Inbox to Security Issues to Triage on the Platform Engineering board.
Mstyles moved this task from In Progress to Watching on the Security-Team board.

the security team would like to start the process to get this extension undeployed as there has been no movement on fixing this high priority security issue by community members or WMF staff. I'll submit this extension to the code stewardship process, but that has been paused for the moment, so there will probably be no activity there for a while.

as there has been no movement on fixing this high priority security issue by community members or WMF staff.

Community members cannot fix security issues if they don't know about / cannot access security issues...
Which steps have been performed to find WMF staff to look into this?

@Aklapper the platform engineering team has been added along with any maintainers mentioned. I don't think there's any other staff members who could be tagged here

The PoC here sounds the same as the one on T131199#2342458 so possibly the never applied patch at F4100931 might help. [have not verified myself. Going solely on the PoC looking sort of similar].

That said, this extension is all sorts of sketch. Disabling is not an unreasonable course of action.

Community members cannot fix security issues if they don't know about / cannot access security issues...
Which steps have been performed to find WMF staff to look into this?

Sadly, there is only one listed maintainer for this extension, and they were added to this bug in January of this year. And there simply aren't really any other active maintainers of the DoubleWiki extension. And having a look at the recent master log, most commits are from libraryupgrader or l10n-bot, with a handful of coding standard updates, but no significant development or maintenance.

The PoC here sounds the same as the one on T131199#2342458 so possibly the never applied patch at F4100931 might help. [have not verified myself. Going solely on the PoC looking sort of similar].

I couldn't reproduce those issues at the time, though my testing may have been flawed. I've had a look at the old patch from that bug/paste and it's definitely a non-trivial review to get it up-to-date. I've been working through it a bit this morning, but a lot of maintenance work has been performed on DoubleWiki_body.php since that patch was first written (including moving it to includes/DoubleWiki.php), so it will probably take me a little while to properly walk through the conflicts and merge it. That being said, if we can get a sane, updated patch out of this, then it's probably worthwhile deploying it to production to see if it addresses most or all of the injection issues present.

That said, this extension is all sorts of sketch. Disabling is not an unreasonable course of action.

Agreed, especially since it apparently has no active maintainers and isn't incredibly widely-deployed.

Since we're considering undeploying the extension now, I'd like to again suggest removing just the vulnerable feature (aligning content between languages), which basically no one is using anyway, and keeping the rest. I would work on that if anyone is willing to merge such a patch.

I would work on that if anyone is willing to merge such a patch.

Sure, happy to review.

I would work on that if anyone is willing to merge such a patch.

Sure, happy to review.

This would be a decent temporary solution IMO, but it still doesn't solve the issue of an active maintainer to support future security (and other) bugs for this extension. If that piece cannot be solved within a reasonable timeframe, then I think T332850 is still likely the best path forward.

Security patch to remove the feature, plus three hardening patches to clean up the remaining code (not known to be vulnerable) that was doing regexp replacements on HTML or questionable HTML escaping:




I can submit the latter three for review on Gerrit later, or we may review here and apply them as security patches if you'd like.

@matmarex thank you so much for submitting these patches! is there anyone we can find as a reviewer to ensure we're moving in the right direction?

@matmarex we would prefer to keep these as patches and not review on Gerrit for security purposes

@matmarex thank you so much for submitting these patches! is there anyone we can find as a reviewer to ensure we're moving in the right direction?

I haven't reached out to anyone, other than the folks who subscribed themselves to this task and might be willing. Do you want me to advertise it on the WMF Slack or something?

@matmarex we would prefer to keep these as patches and not review on Gerrit for security purposes

Sure, no problem.

is there anyone we can find as a reviewer to ensure we're moving in the right direction?

Errr, I already volunteered to do the review? (T323651#8813472)

I did a visual review just now, they look fine, will do a bit of testing tomorrow before +2ing.

@Legoktm did you get a chance to do some testing? this could possibly be deployed today during the security deploy window

CR+2 for the first security patch from T323651#8814793. It should substantially reduce the risk of matchColumns(). There's still some sketchy regexp stuff happening within getMangledTextAndTranslation(), but we don't have any reported issues there, yet. I think we can security-deploy this and see how it goes, while maybe creating a corresponding public bug explaining that the match columns functionality for DoubleWiki will be indefinitely disabled. If the security patch seems stable after a day or two, we can likely push @matmarex' remaining code improvement patches through gerrit.

Security patch to remove the feature, plus three hardening patches to clean up the remaining code (not known to be vulnerable) that was doing regexp replacements on HTML or questionable HTML escaping:




I can submit the latter three for review on Gerrit later, or we may review here and apply them as security patches if you'd like.

The Security patch ONLY has been deployed

The Security patch ONLY has been deployed

Thanks! It would be nice if we could test that the reported payload no longer works in production. While testwiki seems to have ext:DoubleWiki enabled, it does not appear to be working there, which isn't really surprising. I have OS rights on the wikisources with my staff rights, apparently, but I hesitate to edit something like Bible_(King_James)/Genesis there, even if I could be rather quick about testing and then suppressing the edit.

Huh, interesting. On some fairly obvious pages (or what I think are fairly obvious) it does not?

https://test.wikipedia.org/wiki/Dog?match=en
https://test.wikipedia.org/wiki/Barack_Obama?match=en

etc.

Anyhow, I tested against Main_page: https://test.wikipedia.org/w/index.php?title=Main_Page&diff=570072&oldid=570071. I think I did that correctly and could not get the XSS payload within the task description to execute when attempting a ?match=en.

It requires the interlanguage link to be present (i just added one to dog so that works now)

Security patch to remove the feature, plus three hardening patches to clean up the remaining code (not known to be vulnerable) that was doing regexp replacements on HTML or questionable HTML escaping:




I can submit the latter three for review on Gerrit later, or we may review here and apply them as security patches if you'd like.

I submitted the second patch to Gerrit, it can be found here: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/DoubleWiki/+/924136
The last two patches did not apply when I tried. Not sure what happened

I submitted the second patch to Gerrit, it can be found here: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/DoubleWiki/+/924136
The last two patches did not apply when I tried. Not sure what happened

The final two patches depend upon the changes made within the security patch. So they'll have to wait to be merged until that patch is publicly released. Which we could do now (since it's already in Wikimedia production: T276237) or wait until the end of the quarter, when it gets mentioned within the next supplemental security release (T333626).

Change 932825 had a related patch set uploaded (by Mstyles; author: Bartosz Dziewoński):

[mediawiki/extensions/DoubleWiki@master] Remove vulnerable column alignment feature

https://gerrit.wikimedia.org/r/932825

Change 933666 had a related patch set uploaded (by Mstyles; author: Bartosz Dziewoński):

[mediawiki/extensions/DoubleWiki@master] Modernize HTML output

https://gerrit.wikimedia.org/r/933666

Change 933667 had a related patch set uploaded (by Mstyles; author: Bartosz Dziewoński):

[mediawiki/extensions/DoubleWiki@master] Use HtmlHelper::modifyElements() instead of regexps to modify links

https://gerrit.wikimedia.org/r/933667

Change 932825 merged by jenkins-bot:

[mediawiki/extensions/DoubleWiki@master] Remove vulnerable column alignment feature

https://gerrit.wikimedia.org/r/932825

Change 933666 merged by jenkins-bot:

[mediawiki/extensions/DoubleWiki@master] Modernize HTML output

https://gerrit.wikimedia.org/r/933666

Change 933667 merged by jenkins-bot:

[mediawiki/extensions/DoubleWiki@master] Use HtmlHelper::modifyElements() instead of regexps to modify links

https://gerrit.wikimedia.org/r/933667

Mstyles renamed this task from XSS in DoubleWiki extension (Wikisource) to CVE-2023-37304: XSS in DoubleWiki extension (Wikisource).Jun 30 2023, 5:47 PM
Mstyles changed the visibility from "Custom Policy" to "Public (No Login Required)".
Mstyles changed the edit policy from "Custom Policy" to "All Users".
matmarex claimed this task.
matmarex removed a project: Patch-For-Review.