Page MenuHomePhabricator

Upload filter insufficient to stop XSS
Closed, ResolvedPublic

Description

Author: abarth-wikimedia

Description:
The XSS vulnerability is a follows:

  1. A malicious user uploads an specially crafted image to Mediawiki.
  2. The detectScript function in SpecialUpload.php incorrectly determines that the file is safe for upload.
  3. The user visits a malicious web site in Internet Explorer 6 or Internet Explorer 7.
  4. The web site navigates the user directly to the image URL.
  5. The browser sniffs the contents of the image as HTML and executes the JavaScript in the image in Wikipedia's security origin.

I can post an example image to this bug if you like, but I wasn't sure if you wanted me to upload a working proof-of-concept exploit to the bugtracker.

The issue is that the detectScript function only checks for a subset of the byte sequences that cause Internet Explorer's to sniff HTML. We are working to create a complete list of these byte sequences and will inform you when we have done so.

Also troubling is the portion of the function following the comment "look for javascript." This code misses any number of places JavaScript might occur in an HTML document. Hopefully this code will not be needed once we make the first part of this function comprehensive.


Version: 1.14.x
Severity: normal

Details

Reference
bz15895

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 21 2014, 10:19 PM
bzimport set Reference to bz15895.
bzimport added a subscriber: Unknown Object (MLST).

abarth-wikimedia wrote:

Patch to extend black list

Mediawiki already has a black list of dangerous file contents. The simplest fix is to extend the black list. I've attached an (untested) patch that extends the black list to all the tags we know to trigger content sniffing in IE and Firefox.

We're still working on understand the content sniffing algorithms in various browsers, and we'll let you know if we discover other pitfalls you should be aware of.

Attached:

CC'ing Tim on this, with his recent work on IEContentAnalyzer.

Adam, can you check against current 1.13.3 / SVN trunk (1.14 dev)? As far as we know we have a pretty complete set against IE currently.

The strings "<hr", "<p", "</p", etc. are in urlmon.dll but they're harmless due to a bug in the heuristic code that's meant to enable them when 2 or more instances occur. All tags that actually work are in the list in IEContentAnalyzer.php.

abarth-wikimedia wrote:

I took a quick look at IEContentAnalyzer. Very impressive. I spotted a few minor inconsistencies with our understanding of IE7's content sniffing algorithm. You might be interested in our work reverse engineering the content sniffing algorithms of Firefox, Safari, and Chrome as well:

http://webblaze.cs.berkeley.edu/2009/content-sniffing/

We would like to make sure we have the same signatures for IE because any inconsistencies are either bugs in IEContentAnalyzer or bugs in our model. Please let us know if you find anything that looks incorrect.

jcaballero wrote:

I took a look at IEContentAnalyzer.php and our own model.
Our models are mostly identical, which is pretty impressive on both sides
given all the quirks in IE's algorithm.
To summarize, I found 4 discrepancies.
We have fixed one signature in our web and here are the other 3 that you might want to fix:

image/bmp

Bytes 8 and 9 should zero, rather than different than zero

image/gif

The signature is case insensitive, rather than case sensitive

application/macbinhex40

The signature is case sensitive, rather than case insensitive

(In reply to comment #7)

I took a look at IEContentAnalyzer.php and our own model.
Our models are mostly identical, which is pretty impressive on both sides
given all the quirks in IE's algorithm.
To summarize, I found 4 discrepancies.
We have fixed one signature in our web and here are the other 3 that you might
want to fix:

image/bmp

Bytes 8 and 9 should zero, rather than different than zero

Confirmed.

.text:78152D71 xor edi, edi
.text:78152D73 inc edi
...
.text:78152D9D cmp [esi+8], ax
.text:78152DA1 jnz short return_zero
.text:78152DA3 jmp short return_edi

image/gif

The signature is case insensitive, rather than case sensitive

Confirmed. Calls StrCmpNICA which is case-insensitive.

application/macbinhex40

The signature is case sensitive, rather than case insensitive

Confirmed. Calls StrCmpNCA which is case-sensitive.

This is really great, to have the two of us do this independently and then be able to compare the results. You couldn't ask for a more thorough treatment, short of IE going open source and factoring out their own code for us to use.

I don't think any of those three changes are exploitable in the typical use case in MediaWiki, so I'll just commit them to the development branch.