Page MenuHomePhabricator

Encoded CSS comment start token allows bypassing SVG security checks
Closed, ResolvedPublicSecurity

Description

Sanitizer::normalizeCss() removes CSS comments from its input. This happens after certain escape sequences are decoded and after certain characters are replaced with others, making it possible to hide CSS code from UploadBase::checkCssFragment().

Some examples:

<svg xmlns="http://www.w3.org/2000/svg"> <rect width="100" height="100" style="&amp;#47;*;background-image:url(https://www.google.com/images/srpr/logo11w.png)"/> </svg>
<svg xmlns="http://www.w3.org/2000/svg"> <rect width="100" height="100" style="\/*;background-image:url(https://www.google.com/images/srpr/logo11w.png)"/> </svg>
<svg xmlns="http://www.w3.org/2000/svg"> <rect width="100" height="100" style="\2f *;background-image:url(https://www.google.com/images/srpr/logo11w.png)"/> </svg>
<svg xmlns="http://www.w3.org/2000/svg"> <rect width="100" height="100" style="/*;background-image:url(https://www.google.com/images/srpr/logo11w.png)"/> </svg>

Details

Risk Rating
Medium
Author Affiliation
Wikimedia Communities
Related Changes in Gerrit:

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
PleaseStand changed Security from none to Software security bug.Dec 21 2014, 5:22 AM
Restricted Application changed the visibility from "Public (No Login Required)" to "Custom Policy". · View Herald TranscriptDec 21 2014, 5:22 AM
Restricted Application changed the edit policy from "All Users" to "Custom Policy". · View Herald Transcript

Here's my analysis of what's going on here. In general, Sanitizer::normalizeCss() isn't necessarily telling you that the input CSS is safe, it's returning new CSS that more-or-less resembles the input CSS and is known to be safe.

UploadBase using Sanitizer::normalizeCss() like this is probably the wrong thing to do unless it's going to mangle the uploaded data. The thing to do instead might well be to fail the check if Sanitizer::normalizeCss( $value ) is different from $value passed through just the comment-stripping bit.

<svg xmlns="http://www.w3.org/2000/svg"> <rect width="100" height="100" style="&amp;#47;*;background-image:url(https://www.google.com/images/srpr/logo11w.png)"/> </svg>

The problem here seems to be that Sanitizer::normalizeCss() expects that something is running a regex like /style="([^"]*)"/ and then feeding it $1, while the callers are feeding it attribute-values gotten from an actual parser. We could probably just kill the unescaping in normalizeCss(), but I may be overlooking something.

<svg xmlns="http://www.w3.org/2000/svg"> <rect width="100" height="100" style="\/*;background-image:url(https://www.google.com/images/srpr/logo11w.png)"/> </svg>
<svg xmlns="http://www.w3.org/2000/svg"> <rect width="100" height="100" style="\2f *;background-image:url(https://www.google.com/images/srpr/logo11w.png)"/> </svg>

According to the relevant specifications, both of these have an ident-token (consisting of the "/" character) followed by a delim-token. But it may be that IE or some other broken client is interpreting them as comment-start and thus leading to T25687 or T30450, so just fixing normalizeCss to interpret things per the specifications may not be the right thing to do.

<svg xmlns="http://www.w3.org/2000/svg"> <rect width="100" height="100" style="/*;background-image:url(https://www.google.com/images/srpr/logo11w.png)"/> </svg>

This is entirely due to IE6 stupidity, treating fullwidth slash like a normal slash.

Restricted Application changed the visibility from "Custom Policy" to "Custom Policy". · View Herald TranscriptDec 22 2014, 8:32 PM
Restricted Application changed the edit policy from "Custom Policy" to "Custom Policy". · View Herald Transcript
csteipp triaged this task as Medium priority.Jan 9 2015, 1:34 AM
csteipp added subscribers: tstarling, csteipp.

Another awesome find by PleaseStand, thanks!

We need the normalization for the rest of the regexes we do, but when it's normalizing different from the actual parsing like this, that's definitely going to cause problems. I'm tempted to write a css parser that would parse and normalize the css, like I was working on for bug 71394, but the css spec looks pretty bad. I'll reread it and see if I can come up with a better normalization.

@tstarling was looking into css parser/sanitizers in T989. Might be able to leverage that here.

We've also talked about actually saving a normalized/sanitized version of the SVG after warning the user as part of the upload process that we would be modifying their work to keep our security requirements. In that case we could save the normalized css. But the legal implications of that make me a little nervous.

Restricted Application changed the visibility from "Custom Policy" to "Custom Policy". · View Herald TranscriptJan 9 2015, 1:34 AM
Restricted Application changed the edit policy from "Custom Policy" to "Custom Policy". · View Herald Transcript

Here's my analysis of what's going on here. In general, Sanitizer::normalizeCss() isn't necessarily telling you that the input CSS is safe, it's returning new CSS that more-or-less resembles the input CSS and is known to be safe.

Yes, that is basically correct. It is a workaround for broken clients -- we can't really tell what broken clients are going to do with technically correct but weird CSS, not without a lot of disassembly anyway. So we have the choice between rejecting such suspicious CSS, or converting it to a subset of CSS which is equivalent to the input but which is unlikely to be misparsed. We took the latter approach with Sanitizer::checkCss().

I recall weighing these two options for fe5b402a38e1b9611ea30718db7f52bf95adde78, which addressed IE's misparsing of a backslash escape sequence followed by a unicode zero-width space. You could get a list of every character that IE will regard as a space, and check for that list, but the easier option was to re-encode the escape sequence so as to avoid IE's space matching code entirely.

I probably should have spotted this bug when I reviewed the patch on T71008, although note that before T71008 we had no CSS validation at all in CSS, which was a bit ridiculous.

We don't embed user-supplied SVGs anywhere, do we?

I don't think we embed them anywhere, at least not from core. But we do serve them directly from the links on the File pages in a way that browsers will render them. And there's T5593 and Gerrit change 182332 asking to enable it.

We don't embed user-supplied SVGs anywhere, do we?

Not uploaded svgs, but as Brad mentioned there's been suggestions that we implement it. We are embedding ones rendered by the math extension. Also, third party users almost always serve off the primary domain.

So I'd like to get our filtering to the point where we could serve it off a content domain, although I'd strongly oppose the WMF ever doing that.

Well not a fix for this issue, maybe as a defense-in-depth measure for if the blacklist fails [like in this bug], we should send a Content-Security-Policy header for all content on upload.wikimedia.org

Something like:

Content-Security-Policy: default-src 'none'; style-src 'unsafe-inline' data:; font-src data:; img-src data:; media-src data:; sandbox

This would block loading external images/resources, and running scripts (for browsers which support it), which I believe are our main security goals

I thought we had a bug open for adding CSP to upload, but maybe not. I think it's something we should do.

So now that the css-sanitizer library exists ( https://www.mediawiki.org/wiki/Css-sanitizer ) , we want to replace our custom css validation with that

As an aside, this also applies to <style> tags as UploadBase does self::checkCssFragment( Sanitizer::normalizeCss( $data )

Its not just removing the CSS comments, this can be tricked in a lot of ways: Some other PoCs:

Using a comment inside a string:

<?xml version="1.0" encoding="UTF-8"?>
<svg viewbox="0 0 220 120" xmlns="http://www.w3.org/2000/svg">
<style>
@charset "/*";
@import "https://example.com/evil.css";
</style>
<text x="0" y="15">Evil Test file</text>
</svg>

Unclosed URLs aren't parsed properly:

<?xml version="1.0" encoding="UTF-8"?>
<svg viewbox="0 0 220 120" xmlns="http://www.w3.org/2000/svg">
<style>
text {
	mask-image: url( '#foo
;
	mask-image: url( 'https://example.com/evilmask.png' )
}
</style>
<text x="0" y="15">Evil Test file</text>
</svg>

Both of these can be uploaded, and both load external resources in firefox.

Bawolff added a subscriber: sbassett.

I plan to work on this.

@sbassett do you want patches on the task or should they go through gerrit? (Risk is somewhat on the medium-low side as it only affects direct views of original svg files, which is an unusual action)

In MediaViewer, if you click on the image again after the lightbox has opened, it will navigate to the file directly. It's easy to do by accident, or to get a zoomable version (MediaViewer breaks normal zoom). So I don't think it's that unusual.

I plan to work on this.

Awesome.

@sbassett do you want patches on the task or should they go through gerrit? (Risk is somewhat on the medium-low side as it only affects direct views of original svg files, which is an unusual action)

Can you post them here to start out? Or at least some general WIP patches? From there we could maybe do a well-timed, public merge in gerrit.

Another bypass

<svg xmlns="http://www.w3.org/2000/svg">
 <text x="0" y="20" cursor="url(http://example.com/foo.png), pointer">here</text>
</svg>

F66084702 [Superceeded by v2]
Patch ready for review

It would probably make sense to download the last 500 or so SVG files, to check if any would be blocked by the new version in case its blocking something it shouldn't. I haven't done that yet, but I'll look into doing so.

So i tried to test this out on current files on commons. I took 20000 most used SVG files (from https://bawolff.toolforge.org/mostpopularsvg.tsv ) and 10000 recently uploaded SVGs (from https://bawolff.toolforge.org/recentsvg.tsv Uploads between Aug 22 to sept 2 ).

I used the following hacky script piped into php maintenance/eval.php

ini_set( 'user_agent', 'SVGCheck (https://mediawiki.org/wiki/User_talk:Bawolff) T85085' );

function processFile( $name ) { $url = 'https://upload.wikimedia.org/wikipedia/commons/';$md5 = md5( $name );	$url .= $md5[0] . '/' . $md5[0] . $md5[1] . '/' . rawurlencode( $name );	file_put_contents( '/run/user/1000/tmp.svg', file_get_contents( $url ) );	$uv = MediaWiki\MediaWikiServices::getInstance()->getUploadVerification(); $r = $uv->verifyFile( '/run/user/1000/tmp.svg', 'svg', [ 'mime' => 'image/svg+xml', 'file-mime' => 'image/svg+xml' ] ); if ( $r !== true ) var_dump( $name, $r ); }

processFile( 'Commons-logod.svg' );
$fp = fopen( 'mostpopularsvg.tsv', 'r' );
$line = fgets( $fp );
$count = 0;
while (!feof( $fp ) && $line = fgets( $fp ) ) { processFile( substr( $line, 0, -1 ) ); $count++; if ( $count % 50 === 0 ) echo '*';  }

echo "\nprocessed $count";

Results

The following files failed under the new patch but were ok before:

  • Style attribute that is invalid CSS triggering a parse error (These seem largely like weird typos in style attribute)
    • Libertarianism-groups-diagram.svg Coat_of_arms_of_Udmurtia.svg BSicon_num1r.svg BSicon_num2l.svg BSicon_num2r.svg BSicon_num1l.svg BSicon_num3r.svg BSicon_num3l.svg CAAC_Northeast_China.svg CAAC_Northwest_China.svg CAAC_Xinjiang.svg BSicon_num4l.svg BSicon_num4r.svg Map_Congo_crisis_1961.svg Map_Congo_crisis_1964.svg Map_Congo_crisis_1961.svg Map_Congo_crisis_1964.svg Тяжёлая_танковая_бригада_РККА,_1945.svg
  • style tag invalid. (The NIH_BioArt images all have a CSS rule like font-variation-settings: '; where there is an unclosed string
    • NIH_BioArt_2d-ftu-lung-bronchial-submucosal-gland.svg NIH_BioArt_2d-ftu-kidney-descending-thin-loop-of-henle.svg NIH_BioArt_2d-ftu-pancreas-intercalated-duct.svg NIH_BioArt_2d-ftu-mouth-submandibular-gland.svg NIH_BioArt_2d-ftu-kidney-nephron.svg NIH_BioArt_2d-ftu-mouth-parotid-gland.svg NIH_BioArt_2d-ftu-prostate-prostate-glandular-acinus.svg NIH_BioArt_2d-ftu-spleen-red-pulp.svg NIH_BioArt_2d-ftu-kidney-renal-corpuscle.svg NIH_BioArt_2d-ftu-spleen-white-pulp.svg

Additionally the following older files failed but as far as i can tell have been banned from upload for about a decade at this point:

  • Bad DTD (typically this is non-standard locations for the SVG dtd url)
    • Pictogram_voting_info.svg Pictogram_voting_comment.svg Wikiversity-logo.svg Pictogram_voting_keep-light-green.svg Constellations_ecliptic_equirectangular_plot_no_name-fr.svg
  • Contains XMP data with a non-whitelisted namespace (T405154). Typically http://iptc.org/std/iptc4xmpcore/1.0/xmlns/
    • India_Gujarat_locator_map.svg India_Uttar_Pradesh_locator_map.svg India_Punjab_locator_map.svg India_Bihar_locator_map.svg Bihar_locator_map.svg India_Haryana_locator_map.svg West_Bengal_outline_map.svg India_Rajasthan_locator_map.svg Kerala_locator_map.svg India_Kerala_locator_map.svg India_Orissa_locator_map.svg India_Madhya_Pradesh_locator_map.svg India_West_Bengal_locator_map.svg India_Maharashtra_locator_map.svg India_Jharkhand_locator_map.svg Himachal_Pradesh_locator_map.svg
  • Contains a colour profile linked to windows system colour profile (e.g. c:\windows\system32\spool\drivers\color\srgb color space profile.icm ). I guess really old inkscape did this
    • Chad_location_map.svg Pays_de_la_Loire_region_location_map.svg Pas-de-Calais_department_location_map.svg Mauritania_location_map.svg Africa_location_map.svg Polynésie_française_collectivity_location_map.svg Seine-Maritime_department_location_map.svg Andorra_location_map.svg Orne_department_location_map.svg Eure_department_location_map.svg Haute-Normandie_region_location_map.svg Sarthe_department_location_map.svg Somalia_location_map.svg Mayenne_department_location_map.svg Vendée_department_location_map.svg Tonga_location_map.svg Haute-Corse_department_location_map.svg Canada_Quebec_location_map-conic_proj.svg Scotland_land_cover_map-sco.svg
  • Contains an external raster image file (Often this is a file: url or otherwise not network accessible file)
    • Outline_Map_of_Orlovskaya_Oblast.svg Gnome-emblem-web.svg Library-logo.svg Gnome-timezone.svg Billard_Picto_2-black.svg Logo_site_naturel_negatif.svg Cetus_constellation_map-bs.svg
  • Contains an embedded raster image where the data url doesn't have a mime type
    • Caoandanielinbluedoyouknow.svg Bandera_antigua_de_la_provincia_de_Toledo.svg
  • XML parse error
    • Obcine_Slovenija_2006_Krsko.svg (not sure what is going on, file seems valid)

Conclusion:

I think for the invalid <style> tag, the NIH images are an uncommon one-off typo, where it would be fine to block them and just expect people to fix their SVGs. The images with an invalid style attribute seem a bit more common. I think for those, it might make sense to use the older style attribute sanitizer, and only fail it if the sanitizer tries to change the normalization. Most of the examples I saw that were failing for a style attribute did not use url() or comments in the attribute.

Other than that, I think this analysis shows that the change has minimal false positives. If I make the change I'm proposing above with regards to the style attribute, then the only files (out of 30,000 tested) that would be newly blocked would be the NIH ones that were all from the same source. I think that's a reasonable trade off

New version of patch:

[I subscribed some people to this task who have previously worked on multimedia related things]

After change, the only new files it rejects out of the 30,000 tested.

  • The NIH files mentioned above with the invalid <style> tag
  • Тяжёлая_танковая_бригада_РККА,_1945.svg - has invalid fill attribute fill="rgb(0,0,0,0.3" which confuses it. If this is a big issue we could relax the checks on presentational attributes further similar to what we did with style, but i think this is a rare typo that is reasonable to reject.
  • Libertarianism-groups-diagram.svg - has a style like fill:url(#gneolib);stroke:#a7be62;stroke-dasharray:2.99999, 2.99999;g; which is both invalid due to the ;g; and also has a url() in it.
Bawolff changed the edit policy from "Custom Policy" to "Custom Policy".

@Bawolff impact seems acceptable to me. no concerns from my end.

Data: urls are also banned except in @font-face. The rationale behind this is unclear.

TIL... I'm guessing it is to make use of fonts and preserve rendering, which isn't otherwise possible with text, but as we do not allow non-free fonts, the usage seems pretty limited. It's also not documented anywhere else as far as I could find.

@Bawolff impact seems acceptable to me. no concerns from my end.

Data: urls are also banned except in @font-face. The rationale behind this is unclear.

TIL... I'm guessing it is to make use of fonts and preserve rendering, which isn't otherwise possible with text, but as we do not allow non-free fonts, the usage seems pretty limited. It's also not documented anywhere else as far as I could find.

After i wrote my original comment, I did more digging and found T71008#717580 so i guess its about a specific tool. Note how the data: url is missing mime type, so in addition to being undocumented you also have to use mildly incorrect syntax. I suppose the mime type is just a suggestion.

That said, I personally think its reasonable to allow embedded fonts.

@sbassett What are next steps here?

Sorry, have been busy with FY2025-26 WE4.6.2 Multiple Authenticators this quarter. And we're not doing the AppSec clinic every Monday anymore - every third week will be a sort of async scrum, which was this week :) Anyhow, your patch from T85085#11199941 has enough going on that I'd like to get some more CR from parser/css-sanitizer folks before we security-deploy it. Tagging some folks and will let @SLopes-WMF know.

sbassett changed the subtype of this task from "Task" to "Security Issue".Oct 2 2025, 3:25 PM
sbassett added a subscriber: SLopes-WMF.
sbassett changed the edit policy from "Custom Policy" to "Custom Policy".
sbassett changed Risk Rating from N/A to Medium.
sbassett added a subscriber: ihurbain.

The patch in T85085 looks reasonable to me. I'd recommend applying it, getting the security release done, and then we can do any further fine tuning in public. The testing @Bawolff has done seems pretty solid, thanks a lot for that.

So I'm thinking that since this is introducing an entirely new SVGCSSChecker class that we might want to push this through gerrit for more public review, increased visibility and test coverage via CI. Since the patch is labeled as a code-hardening effort, I feel like the risk of someone being able to actively exploit any implied issues here is likely low. Thoughts? Objections?

Today would be a good day to put it into gerrit, since if we do it in the next few hours we could get the patch merged to master and on this week's train and also backported to the current active branch. That would reduce the window between publication and closing the hole to a few hours, which seems reasonable for a low-severity issue.

Today would be a good day to put it into gerrit, since if we do it in the next few hours we could get the patch merged to master and on this week's train and also backported to the current active branch. That would reduce the window between publication and closing the hole to a few hours, which seems reasonable for a low-severity issue.

I think I'm still fine with that. Do you want me to push up the patch?

Change #1193927 had a related patch set uploaded (by SBassett; author: Brian Wolff):

[mediawiki/core@master] Improve css checking in SVG filter.

https://gerrit.wikimedia.org/r/1193927

Change #1193927 merged by jenkins-bot:

[mediawiki/core@master] Improve css checking in SVG filter.

https://gerrit.wikimedia.org/r/1193927

Calling this done for now with the merger of @Bawolff's patch above.

Should i mention this to commons users to be on the look out for issues (keeping details vauge) or do we want to keep this on the down-low?

Should i mention this to commons users to be on the look out for issues (keeping details vauge) or do we want to keep this on the down-low?

The patch was public and it should go out with this week's train, so I think it's fine to talk about it. Especially in the context of potential bug-reporting or unexpected behavior.

Should i mention this to commons users to be on the look out for issues (keeping details vauge) or do we want to keep this on the down-low?

I think it's fine to generally discuss it once it rolls out to 1.45.0-wmf.23.

Change #1196688 had a related patch set uploaded (by Reedy; author: Brian Wolff):

[mediawiki/core@REL1_44] Improve css checking in SVG filter.

https://gerrit.wikimedia.org/r/1196688

Change #1196688 abandoned by Reedy:

[mediawiki/core@REL1_44] Improve css checking in SVG filter.

https://gerrit.wikimedia.org/r/1196688

Should i mention this to commons users to be on the look out for issues (keeping details vauge) or do we want to keep this on the down-low?

I think it's fine to generally discuss it once it rolls out to 1.45.0-wmf.23.

https://commons.wikimedia.org/wiki/Commons:Village_pump/Technical#c-Bawolff-20251016174000-Update_to_the_SVG_filter

Change #1196688 restored by Brian Wolff:

[mediawiki/core@REL1_44] Improve css checking in SVG filter.

https://gerrit.wikimedia.org/r/1196688

I made backports of this patch to 1.44 and 1.43. The backports are non trivial enough I'm not sure if other people want to review them, instead of the usual self-review +2 for backports.

https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1196741
https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1196688

Change #1196741 had a related patch set uploaded (by Brian Wolff; author: Brian Wolff):

[mediawiki/core@REL1_43] Improve CSS checking in SVG filter

https://gerrit.wikimedia.org/r/1196741

Change #1196688 had a related patch set uploaded (by Brian Wolff; author: Brian Wolff):

[mediawiki/core@REL1_44] Improve CSS checking in SVG filter

https://gerrit.wikimedia.org/r/1196688

Change #1196688 merged by jenkins-bot:

[mediawiki/core@REL1_44] Improve CSS checking in SVG filter

https://gerrit.wikimedia.org/r/1196688

Change #1196741 merged by jenkins-bot:

[mediawiki/core@REL1_43] Improve CSS checking in SVG filter

https://gerrit.wikimedia.org/r/1196741

Do we want to make the ticket public now, considering the change did ride the train in public ?

sbassett changed Author Affiliation from N/A to Wikimedia Communities.Oct 29 2025, 4:57 PM
sbassett changed the visibility from "Custom Policy" to "Public (No Login Required)".
sbassett changed the edit policy from "Custom Policy" to "All Users".
sbassett removed a project: Patch-For-Review.
sbassett moved this task from Watching to Our Part Is Done on the Security-Team board.

Do we want to make the ticket public now, considering the change did ride the train in public ?

Yes, done.