Page MenuHomePhabricator

Allow "html" in exif tags
Closed, ResolvedPublic

Description

Every once in a while, photographer put something like this into their exif fields "<a href=example.com>Photo by me</a>"
This won't work for obvious reasons, but still prevents users to upload the file "for security reasons".


Version: unspecified
Severity: enhancement
See Also:

Details

Reference
bz25707

Event Timeline

bzimport raised the priority of this task from to Normal.Nov 21 2014, 11:18 PM
bzimport added a project: MediaWiki-Uploading.
bzimport set Reference to bz25707.
bzimport added a subscriber: Unknown Object (MLST).
DieBuche created this task.Oct 29 2010, 4:26 PM

Bryan.TongMinh wrote:

The security reason is that IE may get fooled into thinking that this is actually an HTML file and try to display it, executing any embedded JS in the process.

Theoretically we could perhaps strip the html tags in exif fields. That would require a general means of editing exif tags. However we'll probably eventually need that anyways if we plan to fix Bug 20326.

Oh, sorry, maybe I wasn't clear enough. I'm aware of the script issue, but would it still be a concern if we only disallowed <script> and <iframe> tags, and let <a> or <img> pass?

That would mean tying an html parser into the filetype detection system.

I guess in theory it could be done with a whitelisting of several html tags, and stripping other tags as well as dangerous css from style=""... That's not a straightforward thing to implement though. Also, the solution would of course be far from HTML, and people would probably be asking for every single HTML tag they think might be useful to them :D

Bryan.TongMinh wrote:

Well, only <script> and style= should be blacklisted right?

how about IEs filter in style= ? And <link> elements of course, inline images, applet, iframe. There are many things in HTML that can potentially be dangerous.

Also if we were filtering html from the file, it'd be kind of weird to filter some html, then of the html we let in, not allow it to be used on the metadata box on the image page (With our current super-weird mix of first doing specialhtmlchars() on (most, not all of) the exif values, and then feeding the result of that into the parser.)

We don't arbitrarily filter some HTML. We have code to predict whether IE will think a file is HTML or not (based on Tim's reengineering of the IE MIME type detection code) and filter based on that.

Ok, so (From my understanding):
*IE only looks at the first 255 bytes of a file
*The EXIF standard allows arbitrary whitespace at the beginning of the exif application segment (right after the tiff header).

Proposed solution:
If we get a jpeg that fails the check, add about 255 bytes of whitespace, change the offsets for all the exif pointers, and see if it still fails the check. This of course would need to be tested to see if image viewers accept the arbitrary white space in practise and so on.

Bryan.TongMinh wrote:

(In reply to comment #9)

Proposed solution:
If we get a jpeg that fails the check, add about 255 bytes of whitespace,
change the offsets for all the exif pointers, and see if it still fails the
check. This of course would need to be tested to see if image viewers accept
the arbitrary white space in practise and so on.

Sounds reasonable to me, but this is a major change from how we have previously handled: previously your file after upload was more or less guaranteed to be exactly equal to that before upload. Now we are essentially losing the original file. I don't think we should care about this, but it is something to take in mind.

cc Tim Starling for security review of the proposed solution

Tgr added a comment.Oct 2 2014, 8:41 AM

Alternatively just disable logins in IE6 (and 7?). As long as the user can't log in, allowing arbitrary script execution on upload.wikimedia.org should be harmless. Disabling logins for old and insecure browsers was discussed in bug 56575.

Jdforrester-WMF moved this task from Untriaged to Backlog on the Multimedia board.Sep 4 2015, 6:33 PM
Restricted Application added subscribers: Steinsplitter, Matanya, Aklapper. · View Herald TranscriptSep 4 2015, 6:33 PM
gpaumier removed a subscriber: gpaumier.Jul 18 2018, 5:57 PM

Is this still a security concern?
Do we have any documented/recommended process on how to deal with this with imports? Will python html.escape be enough to avoid this issue?

Peachey88 updated the task description. (Show Details)Aug 5 2018, 8:50 AM
Peachey88 removed a subscriber: wikibugs-l-list.
Aklapper lowered the priority of this task from Normal to Low.Oct 22 2018, 12:10 PM

According to @matmarex this is an issue in IE5-7. MS dropped support for IE7 entirely in April 2016. IE8 was released in 2009. According to https://en.wikipedia.org/wiki/Internet_Explorer_7, as of May 2012, estimates of IE7's global market share were 1.5-5%. According to netmarketshare.com the market share is now 0.16%.

So as I said in T212693: remove line 1332 ("'<a href',") from UploadBase.php and we're basically done.

TheDJ removed a subscriber: TheDJ.Dec 31 2018, 8:07 PM

The only relevant data is https://www.mediawiki.org/wiki/Compatibility#Browser_support_matrix : We provide basic support for IE6+.

Clpo13 added a subscriber: Clpo13.Jan 25 2019, 8:22 PM
4nn1l2 added a subscriber: 4nn1l2.Jan 25 2019, 11:39 PM

At what point does this IE security issue manifest itself, during the upload process? The file is being uploaded, not being sent to the browser to be displayed.

I suppose I misinterpreted the comment about disabling logins. The idea would be that somebody would craft a JPEG (or other file type, presumably) file with some embedded Javascript, and some other user would come along with some old version of IE that would bizarrely ignore the MIME type when displaying the full-size file, execute the Javascript, and do some stupid thing under their account?

Editing Exif without potentially breaking something is difficult. A lot of extensions have been added to JPEG over the years, and some of them contain pointers to data outside their own segments. Things like maker notes (proprietary and not fully documented) and Multi-Picture Format (which allows putting multiple images in a single JPEG file). They all need to be unpacked, adjusted, and repacked. The only practical method is probably to use ExifTool, and even that isn't perfect.

Jeff_G added a subscriber: Jeff_G.Jan 27 2019, 10:23 AM
AlexisJazz added a comment.EditedJan 28 2019, 4:07 PM

Editing Exif without potentially breaking something is difficult. A lot of extensions have been added to JPEG over the years, and some of them contain pointers to data outside their own segments. Things like maker notes (proprietary and not fully documented) and Multi-Picture Format (which allows putting multiple images in a single JPEG file). They all need to be unpacked, adjusted, and repacked. The only practical method is probably to use ExifTool, and even that isn't perfect.

And even that would break FlickreviewR, so it'll put additional load on our license reviewers. Perhaps it would be possible to disable downloads of full-size images for IE6 and IE7 users and run the silly check during upload on a thumbnail instead of the full image. The thumbnailer already strips most EXIF data and could perhaps be adjusted to escape the COPYRIGHT field which it doesn't strip, presumably for legal reasons.

Are IE6 and IE7 worth such effort? Well.. Dropping support is considerably easier. Microsoft did it, but Wikimedia can't?

Perhaps it would be possible to disable downloads of full-size images for IE6 and IE7 users and run the silly check during upload on a thumbnail instead of the full image.

Indeed, I was going to suggest this as well.

It might also be possible to force original files to be handled as downloads, rather than shown in the browser, depending on the browser.

Are IE6 and IE7 worth such effort? Well.. Dropping support is considerably easier. Microsoft did it, but Wikimedia can't?

In this case dropping support doesn’t just mean “the site will not work”, but “your account might be taken over by an attacker if you use this browser”, which in my opinion makes this a different matter. We did partially drop support for these browsers, e.g. we no longer write our JavaScript code for them.

It's a bit hard to find information about this problem online, since it affects browsers that became obsolete in 2009 when a newer version of IE was released. Perhaps Microsoft even patched the bug in the old versions, if they supported them for several years afterwards?

I found an article http://www.h-online.com/security/features/Risky-MIME-sniffing-in-Internet-Explorer-746229.html which describes the problem and gives some interesting information. It only affects the browser when the user goes directly to the file, as in downloading, like https://upload.wikimedia.org/wikipedia/commons/e/e9/Dehallen.jpg. It doesn't affect images when embedded in HTML pages.

Also, if the file extension, MIME type, and file signature (the first few byes of the file) are all in agreement, the browser won't do it's crazy scanning thing. I don't know what file extensions it accepts. It may be possible to disable the check for the problem is the file extension is good.

This is surely a bug that nobody would bother actually trying to exploit. Why go to the effort, when the chance that one of the users of these old browsers will actually visit the direct link of your file is practically zero? And all you can do is Wiki actions, which are reversible anyway.

The solution suggested by Tgr seems best, far better than making dodgy file changes to avoid something that will probably never happen.

brion added a subscriber: brion.Jan 28 2019, 10:58 PM

The chance that someone would visit using an old IE version was probably 50% when the code was originally added; IE had a very high marketshare in the early 2000s. However at this point we can't even be accessed in IE 6 as far as I know (due to servers dropping old TLS versions for HTTPS). I think it's pretty fair to change the balance of what we check for.

Double-checking:

  • IE 6 and 7 were both available on Windows XP, which includes TLS 1.0 support but not TLS 1.1 or 1.2. However IE 6-8 on XP fail to work anyway due to lack of SNI.
    • This takes IE 6 off the table entirely.
  • IE 7 on Vista should still work correctly with TLS 1.0, 1.1, or 1.2 -- https://blogs.msdn.microsoft.com/kaushal/2011/10/02/support-for-ssltls-protocols-on-windows/
  • IE8 supports X-Content-Type-Options: nosniff header; if we don't already use it consistently, applying this on all file views would resolve all sniffing issues on IE 8

So the intersection of the danger zone, for Wikimedia sites, is:

  • IE 7 running on Windows Vista, user logged in

(This would be a bad experience for the user with no JS and probably no or broken styles, and it's going to be a very small percentage of people that will hit it ever.)

For small internal sites, I assume you're just not going to have IE 6/7 so it shouldn't be a problem. ;)

I think it's reasonably fair to just drop the anti-sniffing checks at this point, though we might want to consider explicitly disallowing logins on IE 7.

The other thing to consider is IE7 on vista is old. AFAICT its been out of support at least since 2016. There are probably other security vulnerabilities its subject to (Some googling is suggestive that CVE-2018-8653 might be an example, but its really unclear)

Does it really matter to care about XSS on an ancient browser that is probably subject to RCE vulnerabilities? I don't think it does.

My 2 cents: I'm supportive of removing the check. I don't think its worth the effort of banning logins from IE7.

It might also be possible to force original files to be handled as downloads, rather than shown in the browser, depending on the browser.

I was thinking here of using Content-Disposition: attachment HTTP header, which seems to be supported by IE 6 and similarly old browsers (http://www.jtricks.com/bits/content_disposition.html). I think adding a few lines of code somewhere to produce this header when responding to old user-agents would be relatively easy, and should resolve any security concerns (for Wikimedia's hypothetical users on IE 7 on Vista, and for any third-party wikis that can be viewed by other old IE versions).

AlexisJazz added a comment.EditedJan 30 2019, 7:52 AM

Double-checking:

  • IE 6 and 7 were both available on Windows XP, which includes TLS 1.0 support but not TLS 1.1 or 1.2. However IE 6-8 on XP fail to work anyway due to lack of SNI.
    • This takes IE 6 off the table entirely.
  • IE 7 on Vista should still work correctly with TLS 1.0, 1.1, or 1.2 -- https://blogs.msdn.microsoft.com/kaushal/2011/10/02/support-for-ssltls-protocols-on-windows/
  • IE8 supports X-Content-Type-Options: nosniff header; if we don't already use it consistently, applying this on all file views would resolve all sniffing issues on IE 8

So the intersection of the danger zone, for Wikimedia sites, is:

  • IE 7 running on Windows Vista, user logged in

(This would be a bad experience for the user with no JS and probably no or broken styles, and it's going to be a very small percentage of people that will hit it ever.)
For small internal sites, I assume you're just not going to have IE 6/7 so it shouldn't be a problem. ;)
I think it's reasonably fair to just drop the anti-sniffing checks at this point, though we might want to consider explicitly disallowing logins on IE 7.

Just for fun (don't take this seriously), to exploit it, this is what we need:

  • Our target must use Windows Vista. Use on Wikimedia: 0.0016327531248557865% or 16 users per million.
  • They must also use IE7. In July 2017, this was 0.0010794218871612216%. In december 2018 it was 0.0004178927795254156% (assuming I'm reading "4.178927795254156E-4" correctly) or 4 users per million.
  • From this 0.0004% pool, our target must be logged in.
  • We must convince our target to visit a direct link to the full image as embedded images won't work.
  • As the file extension is restricted to a whitelist, the MIME type beyond the control of the attacker and invalid file signatures are also rejected, we must.. find a way around that I guess?
  • So we hack Wikimedia and reconfigure their webservers, easy peasy!
  • We also need to fool the virus scanner, but we might as well disable the virus scanner during the Wikimedia hack, so that's no problem, right?
  • We will overwhelm the community with trolls so our file can slip under their radar and won't be deleted by an admin.
  • Having completed this simple plan, we will insert the word "poop" in a Wikipedia article using someone else's account, only to be instantly reverted by ClueBot NG. Damnit!

It's nothing short of a Zany Scheme! :-D

As the file extension is restricted to a whitelist, the MIME type beyond the control of the attacker and invalid file signatures are also rejected, we must.. find a way around that I guess?

Usually in this scenario, its assumed file ext is .jpg, and the normal unix mime detection stuff would detect it as JPEG. But IE7 would in theory still treat as html (I think).

As the file extension is restricted to a whitelist, the MIME type beyond the control of the attacker and invalid file signatures are also rejected, we must.. find a way around that I guess?

Usually in this scenario, its assumed file ext is .jpg, and the normal unix mime detection stuff would detect it as JPEG. But IE7 would in theory still treat as html (I think).

Actually no, the article Ghouston linked explains:

With the common GIF, JPEG and PNG formats, the browser ignores the result of MIME sniffing, as long as the filename extension, Content-Type and signature, all indicate the same type. Only if the results are inconsistent will Internet Explorer handle the file as the type identified by MIME sniffing.

So we really need to follow the Zany Scheme to exploit this. The scheme is a joke, but not inaccurate.

As the file extension is restricted to a whitelist, the MIME type beyond the control of the attacker and invalid file signatures are also rejected, we must.. find a way around that I guess?

Usually in this scenario, its assumed file ext is .jpg, and the normal unix mime detection stuff would detect it as JPEG. But IE7 would in theory still treat as html (I think).

Actually no, the article Ghouston linked explains:

With the common GIF, JPEG and PNG formats, the browser ignores the result of MIME sniffing, as long as the filename extension, Content-Type and signature, all indicate the same type. Only if the results are inconsistent will Internet Explorer handle the file as the type identified by MIME sniffing.

So we really need to follow the Zany Scheme to exploit this. The scheme is a joke, but not inaccurate.

I think the microsoft docs say something different then Ghouston does:

If the server-provided MIME type is either known or ambiguous, the buffer is scanned in an attempt to verify or obtain a MIME type from the actual content. If a positive match is found (one of the hard-coded tests succeeded), this MIME type is immediately returned as the final determination, overriding the server-provided MIME type (this type of behavior is necessary to identify a .gif file being sent as text/html). During scanning, it is determined if the buffer is predominantly text or binary. Note In Microsoft Internet Explorer 6 for Windows XP Service Pack 2 (SP2), the MIME type "text/plain" is not ambiguous, and is never rendered as HTML in the restricted zone, even if the content suggests that this is the correct format.

From https://docs.microsoft.com/en-us/previous-versions/windows/internet-explorer/ie-developer/platform-apis/ms775147(v=vs.85)

Otherwise, this would basically be a non-issue even at the time [Additional cause for concern is I think IE6 can sometimes take extensions from query parameter part of the url]

AlexisJazz added a comment.EditedJan 30 2019, 9:45 AM

As the file extension is restricted to a whitelist, the MIME type beyond the control of the attacker and invalid file signatures are also rejected, we must.. find a way around that I guess?

Usually in this scenario, its assumed file ext is .jpg, and the normal unix mime detection stuff would detect it as JPEG. But IE7 would in theory still treat as html (I think).

Actually no, the article Ghouston linked explains:

With the common GIF, JPEG and PNG formats, the browser ignores the result of MIME sniffing, as long as the filename extension, Content-Type and signature, all indicate the same type. Only if the results are inconsistent will Internet Explorer handle the file as the type identified by MIME sniffing.

So we really need to follow the Zany Scheme to exploit this. The scheme is a joke, but not inaccurate.

I think the microsoft docs say something different then Ghouston does:

If the server-provided MIME type is either known or ambiguous, the buffer is scanned in an attempt to verify or obtain a MIME type from the actual content. If a positive match is found (one of the hard-coded tests succeeded), this MIME type is immediately returned as the final determination, overriding the server-provided MIME type (this type of behavior is necessary to identify a .gif file being sent as text/html). During scanning, it is determined if the buffer is predominantly text or binary. Note In Microsoft Internet Explorer 6 for Windows XP Service Pack 2 (SP2), the MIME type "text/plain" is not ambiguous, and is never rendered as HTML in the restricted zone, even if the content suggests that this is the correct format.

From https://docs.microsoft.com/en-us/previous-versions/windows/internet-explorer/ie-developer/platform-apis/ms775147(v=vs.85)
Otherwise, this would basically be a non-issue even at the time [Additional cause for concern is I think IE6 can sometimes take extensions from query parameter part of the url]

It's actually not "what Ghouston says" (unless Ghouston owns h-online.com).

I just actually tested it. I'm crazy. ietest-htmlsig.jpg starts with HTML. If this file is served as image/jpeg to IE6 on Windows XP (that's what I happened to have available..), IE6 does the dumb thing and render the HTML. But this file couldn't be uploaded to Wikimedia anyway. For that (assuming the filter we are discussing is removed), you need ietest64-jpgsig.jpg which starts with 64 bytes of https://commons.wikimedia.org/wiki/File:Example.jpg followed by 327 bytes of HTML crap, followed by the rest of Example.jpg. When this file is served to IE6 as image/gif with a .gif extension, the HTML is rendered. But guess what happens when you serve this image to IE6 as image/jpeg and .jpg file extension? IE6 renders the image. Exploit failed.

This comment was removed by AlexisJazz.
AlexisJazz added a comment.EditedJan 30 2019, 2:19 PM

The chance that someone would visit using an old IE version was probably 50% when the code was originally added; IE had a very high marketshare in the early 2000s. However at this point we can't even be accessed in IE 6 as far as I know (due to servers dropping old TLS versions for HTTPS). I think it's pretty fair to change the balance of what we check for.

I can confirm Wikipedia doesn't work (at all) on Windows XP with IE6. Page cannot be displayed.

I found the code addition: https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/aefc40d749407714181548f401abde50009c3d0f%5E%21/includes/upload/UploadBase.php in 2009.

The two tags for Safari (<html and <script) are also only exploitable if the webserver is not configured correctly, so not exploitable on Wikimedia projects anyway. Unknown at this point which Safari versions are affected by that, but it's a theoretical risk anyway.

Tgr added a comment.Jan 30 2019, 6:01 PM
  • This takes IE 6 off the table entirely.

IIRC we still get a bunch of IE6/7 visits from places which man-in-the-middle SSL connections.

Does it really matter to care about XSS on an ancient browser that is probably subject to RCE vulnerabilities?

Disabling logins would also prevent attacks using other kinds of vulnerabilities. (Well, attacks targeting the account, anyway. There is not much point in worrying about attacks targeting the user's machine, since that can be done without involving Wikimedia web properties at all.)

In T27707#4918648, @Tgr wrote:
  • This takes IE 6 off the table entirely.

IIRC we still get a bunch of IE6/7 visits from places which man-in-the-middle SSL connections.

Logins should definitely be disabled for those. That man (the one in the middle) can collect all their passwords. And to simplify things, those 4 users per million using IE7 on Vista should also have logins disabled. So logins can be disabled for all IE6 and IE7 users.

Though this is not needed to remove this MIME sniffing check (or at the very least, make it optional), as I've shown it can't be exploited unless the webserver is misconfigured. So it doesn't apply to Wikimedia projects.

Logins should definitely be disabled for those.

This feels off-topic for this task. See https://wikitech.wikimedia.org/wiki/HTTPS/Browser_Recommendations#For_Users_of_Microsoft_Windows etc.

AlexisJazz added a comment.EditedFeb 1 2019, 1:53 PM

Logins should definitely be disabled for those.

This feels off-topic for this task. See https://wikitech.wikimedia.org/wiki/HTTPS/Browser_Recommendations#For_Users_of_Microsoft_Windows etc.

That's probably true. But here's the question: should this odd MIME sniffing check be made optional so sites with properly configured webservers (or those who don't care about obsolete browsers) can disable it, or should it be removed altogether?

brion added a comment.Feb 1 2019, 5:18 PM

@Aklapper I believe it's very on-topic to discuss the security implications of a suggested feature change. Where would you suggest we discuss this if not here, on this task?

brion added a comment.Feb 1 2019, 5:41 PM

Though this is not needed to remove this MIME sniffing check (or at the very least, make it optional), as I've shown it can't be exploited unless the webserver is misconfigured. So it doesn't apply to Wikimedia projects.

Can you clarify what you mean by "misconfigured"? No special configuration was required with exploit PNGs when I first tested this in 2005. ISTR that JPEG and GIF were not vulnerable when served with their native Content-Types because they have a different priority than PNG in the type checking.

The original 'naughty.png' sample from my 2005 testing is now available at https://brionv.com/misc/naughty.png -- note the way I embedded the HTML using a custom ancillary chunk might no longer be considered legit as some apps reject it now, but it could likely be retooled using a standard comment chunk. Breakdown of it is like this:

IE 8 on Windows 7 doesn't seem to be vulnerable and displays the image without running the script, but I still don't have an IE 6 or 7 test environment handy.

brion added a comment.Feb 1 2019, 6:39 PM

Ok, looking at the actual current code now... UploadBase::detectScript does the check, and combines several things:

  • looks at first 1024 bytes (more than IE checks) if binary, or all if text
  • does some text encoding checks (seems to be SVG-specific?)
  • looks for IE's trigger tags for type detection
  • looks for some Safari (old Safari?) trigger works for plain-specific type detection
  • decodes XML-style char references (seems to be SVG-specific)
  • looks for some script-like things CSS-like URLs in body content that seem to be SVG-specific

We could clean that up by splitting out the SVG checks, or applying the full IE-security-check only for types that won't get overridden back to their native type (JPEG might not actually be exploitable when served correctly, as noted, while PNG may still be )

But I think the most important change to make is:

  • remove <a href, <pre, <img, <title, and <table from the detection list

This would allow the most common cases where JPEG or PNG metadata contains an <a href...> link or such even if it's within the area scanned.

Possible security effects:

  • IE7-on-Vista or IE6/IE7-with-SSL-MITM-gateway might be able to misparse a PNG file with one of those trigger strings and then interpret scripts that appear later in the file, when served directly.

Recommended mitigation for production sites (wise even if we change nothing):

  • Serve all files from upload.wikimedia.org with X-Content-Options: nosniff header
  • Serve all uploaded files from an actually separate domain, not just a subdomain, to minimize cookie attack impact

I'll whip up a quick gerrit patch for testing.

Change 487527 had a related patch set uploaded (by Brion VIBBER; owner: Brion VIBBER):
[mediawiki/core@master] Relax HTML sniffing checks on image upload

https://gerrit.wikimedia.org/r/487527

brion added a comment.Feb 1 2019, 7:56 PM

The above patch https://gerrit.wikimedia.org/r/487527 removes some of the non-scripting tags from the checks in UploadBase::detectScript, and makes the conservative/exact IE heuristics called from UploadBase::verifyMimeType optional (but still on by default). This also deprecates $wgAllowTitleInSVG since <title is no longer looked for.

I think this should get JPEGs and such with links in their EXIF metadata working; are there some samples of files that were tripping up before that we can use to add in test cases? A test file I made locally with Gimp works but also already worked with the existing code because the EXIF data is after the area that got searched.

AlexisJazz added a comment.EditedFeb 1 2019, 10:18 PM

Though this is not needed to remove this MIME sniffing check (or at the very least, make it optional), as I've shown it can't be exploited unless the webserver is misconfigured. So it doesn't apply to Wikimedia projects.

Can you clarify what you mean by "misconfigured"? No special configuration was required with exploit PNGs when I first tested this in 2005. ISTR that JPEG and GIF were not vulnerable when served with their native Content-Types because they have a different priority than PNG in the type checking.
The original 'naughty.png' sample from my 2005 testing is now available at https://brionv.com/misc/naughty.png -- note the way I embedded the HTML using a custom ancillary chunk might no longer be considered legit as some apps reject it now, but it could likely be retooled using a standard comment chunk. Breakdown of it is like this:


IE 8 on Windows 7 doesn't seem to be vulnerable and displays the image without running the script, but I still don't have an IE 6 or 7 test environment handy.

I loaded the "naughty.png" on IE6 served as image/png. I see a fox. Perhaps your webserver was misconfigured in 2005? Or this was adjusted in a service pack or something, my test thingie has Windows XP SP3. Maybe that's too new! If I rename the file to naughty.jpg (or anything else that's wrong), I get a popup and it tries to show me my cookies! (but I have none)

I also installed Safari 3.0 beta. Now I see a tiny blue box with a question mark. Nothing else appears to happen. ietest64-jpgsig.jpg shows the example image. ietest-htmlsig.jpg shows the tiny blue box again. Safari 4.0.5, same results. More bloatware with this version. Apple software update? I don't want that crap. Bonjour? Must be from Dutch bonjouren. (to kick out) Safari 5.1.7, the last Windows version. Even more crap. Naughty.png shows..nothing. White page. ietest64-jpgsig.jpg is the same, shows the example image. ietest-htmlsig.jpg shows nothing. White page. As expected really, this wasn't expected to work in Safari without a misconfigured webserver.

AlexisJazz added a comment.EditedFeb 1 2019, 10:37 PM

The above patch https://gerrit.wikimedia.org/r/487527 removes some of the non-scripting tags from the checks in UploadBase::detectScript, and makes the conservative/exact IE heuristics called from UploadBase::verifyMimeType optional (but still on by default). This also deprecates $wgAllowTitleInSVG since <title is no longer looked for.
I think this should get JPEGs and such with links in their EXIF metadata working; are there some samples of files that were tripping up before that we can use to add in test cases? A test file I made locally with Gimp works but also already worked with the existing code because the EXIF data is after the area that got searched.

https://www.flickr.com/photos/tinto/30943950124/ and other photos from this Flickr user. (They can also be uploaded to Commons. We want them.)

brion added a comment.Feb 1 2019, 10:52 PM

https://www.flickr.com/photos/tinto/30943950124/ and other photos from this Flickr user.

Perfect! Confirmed that the original file there fails to upload on test.wikipedia.org but uploads intact on my local instance with the patch. :D

I loaded the "naughty.png" on IE6 served as image/png. I see a fox. Perhaps your webserver was misconfigured in 2005?

I assure you I tested quite thoroughly at the time, and several people have independently discovered and reported the same vulnerability over the years. Here's another description, which includes an independently written sample PNG-containing-HTML exploit as well: https://forums.hak5.org/topic/6565-xss-exploit-in-ie-by-design-says-microsoft/

In any case, we can put this part on the back burner since keeping the exact reverse-engineered IE security checks doesn't seem to block JPEG files with exif data the same way the more expansive checks did... I'll do a more thorough investigation on removing them later on once I've got local XP & Vista instances to test with. Sounds like there may be more inconsistencies in terms of later changes in service packs that makes it hard to confirm in testing!

AlexisJazz added a comment.EditedFeb 7 2019, 11:06 PM

https://www.flickr.com/photos/tinto/30943950124/ and other photos from this Flickr user.

Perfect! Confirmed that the original file there fails to upload on test.wikipedia.org but uploads intact on my local instance with the patch. :D

I loaded the "naughty.png" on IE6 served as image/png. I see a fox. Perhaps your webserver was misconfigured in 2005?

I assure you I tested quite thoroughly at the time, and several people have independently discovered and reported the same vulnerability over the years. Here's another description, which includes an independently written sample PNG-containing-HTML exploit as well: https://forums.hak5.org/topic/6565-xss-exploit-in-ie-by-design-says-microsoft/
In any case, we can put this part on the back burner since keeping the exact reverse-engineered IE security checks doesn't seem to block JPEG files with exif data the same way the more expansive checks did... I'll do a more thorough investigation on removing them later on once I've got local XP & Vista instances to test with. Sounds like there may be more inconsistencies in terms of later changes in service packs that makes it hard to confirm in testing!

If you need more test files: https://www.flickr.com/photos/usforestservice/43004618021/.

Users currently upload annoying tiny thumbnails to evade the filter.

https://www.flickr.com/photos/usforestservice/28178973997/ is beautiful, I hope we'll be able to upload it to Commons soon.

brion added a comment.Jun 4 2019, 5:25 PM

Looks like the patch never got reviewed. I'll poke some people and see if we can get it moving.

Change 487527 had a related patch set uploaded (by Brion VIBBER; owner: Brion VIBBER):
[mediawiki/core@master] Relax HTML sniffing checks on image upload

https://gerrit.wikimedia.org/r/487527

I'll do a more thorough investigation on removing them later on once I've got local XP & Vista instances to test with. Sounds like there may be more inconsistencies in terms of later changes in service packs that makes it hard to confirm in testing!

The Naughty.png exploit works on Windows 2000 SP4 with IE5. (I can't be arsed to install IE6 on it) So does ietest-htmlsig.jpg. ietest64-jpgsig.jpg does not work however, so indeed there may be a difference between jpg and png. This confirms that even Windows XP isn't vulnerable, provided you have SP3. (earlier service packs not tested)

Change 487527 merged by jenkins-bot:
[mediawiki/core@master] Relax HTML sniffing checks on image upload

https://gerrit.wikimedia.org/r/487527

Jdforrester-WMF closed this task as Resolved.Jun 7 2019, 9:56 PM
Jdforrester-WMF removed a project: Patch-For-Review.