
Optimize SVGs in data URIs
Closed, Resolved, Public

Description

We currently encode SVGs in data: URIs as data:image/svg+xml,… (embedding them in CSS with CSSMin), as this is more efficient for text.

An article on optimizing SVGs in data URIs for smaller output describes several methods that we haven't taken advantage of so far:

  • Mangling whitespace where doing so doesn't break URI encoding
  • Removing the XML declaration, which isn't needed for a CSS background-image
  • Replacing double " attribute quotes with single ' ones.

Exemplified on the close icon:
Original file:

<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 20 20">
	<path d="M3.636 2.222l14.142 14.142-1.414 1.414L2.222 3.636z"/>
	<path d="M17.778 3.636L3.636 17.778l-1.414-1.414L16.364 2.222z"/>
</svg>

Old method with normal URI encoding currently in use:

background-image: url("data:image/svg+xml,%3C%3Fxml%20version%3D%221.0%22%20encoding%3D%22utf-8%22%3F%3E%0D%0A%3Csvg%20xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%22%20width%3D%2220%22%20height%3D%2220%22%20viewBox%3D%220%200%2020%2020%22%3E%0D%0A%09%3Cpath%20d%3D%22M3.636%202.222l14.142%2014.142-1.414%201.414L2.222%203.636z%22%2F%3E%0D%0A%09%3Cpath%20d%3D%22M17.778%203.636L3.636%2017.778l-1.414-1.414L16.364%202.222z%22%2F%3E%0D%0A%3C%2Fsvg%3E%0A");

Compression ratio after gzipping: 168 %; original size: 455 bytes; result size: 271 bytes

Alternative optimized method:

background-image: url("data:image/svg+xml,%3C?xml version='1.0' encoding='utf-8'?%3E %3Csvg xmlns='http://www.w3.org/2000/svg' width='20' height='20' viewBox='0 0 20 20'%3E %3Cpath d='M3.636 2.222l14.142 14.142-1.414 1.414L2.222 3.636z'/%3E %3Cpath d='M17.778 3.636L3.636 17.778l-1.414-1.414L16.364 2.222z'/%3E %3C/svg%3E");

Compression ratio: 137 %; original size: 325 bytes; result size: 237 bytes

Further optimized (stripping XML declaration not needed for background-images):

background-image: url("data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' width='20' height='20' viewBox='0 0 20 20'%3E %3Cpath d='M3.636 2.222l14.142 14.142-1.414 1.414L2.222 3.636z'/%3E %3Cpath d='M17.778 3.636L3.636 17.778l-1.414-1.414L16.364 2.222z'/%3E %3C/svg%3E");

Compression ratio: 137 %; original size: 281 bytes; result size: 205 bytes
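
The size comparison above can be approximated with a short Node.js script. This is only a sketch: the helper names (toDeclaration, encodeReadable, gzippedSize) are made up for illustration, the tool used for the figures in this task is not known, and exact byte counts depend on line endings and gzip settings, so its output should be treated as indicative rather than as a reproduction of the numbers above.

```javascript
const zlib = require('zlib');

// The close icon from the task description (tab-indented, LF line endings assumed).
const svg = [
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 20 20">',
  '\t<path d="M3.636 2.222l14.142 14.142-1.414 1.414L2.222 3.636z"/>',
  '\t<path d="M17.778 3.636L3.636 17.778l-1.414-1.414L16.364 2.222z"/>',
  '</svg>'
].join('\n');

// Hypothetical helper: wrap a data URI payload in a CSS declaration.
function toDeclaration(payload) {
  return 'background-image: url("data:image/svg+xml,' + payload + '");';
}

// Hypothetical helper: collapse whitespace, then undo the escapes that the
// "alternative optimized method" leaves readable, and swap " for '.
function encodeReadable(source) {
  return encodeURIComponent(source.replace(/\s+/g, ' '))
    .replace(/%20/g, ' ')
    .replace(/%3D/g, '=')
    .replace(/%3A/g, ':')
    .replace(/%2F/g, '/')
    .replace(/%3F/g, '?')
    .replace(/%22/g, "'");
}

// Hypothetical helper: size in bytes after gzip with default settings.
function gzippedSize(text) {
  return zlib.gzipSync(Buffer.from(text, 'utf8')).length;
}

// Variant 1: plain percent-encoding.
const plain = toDeclaration(encodeURIComponent(svg));
// Variant 2: readable encoding with single quotes.
const readable = toDeclaration(encodeReadable(svg));
// Variant 3: readable encoding with the XML declaration removed first.
const stripped = toDeclaration(encodeReadable(svg.replace(/^<\?xml[^>]*\?>\s*/, '')));

for (const [label, css] of [['plain', plain], ['readable', readable], ['stripped', stripped]]) {
  console.log(label + ': ' + css.length + ' bytes raw, ' + gzippedSize(css) + ' bytes gzipped');
}
```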

From a data-over-the-wire perspective, this option is well worth exploring. A nice side effect would be more readable SVG data URIs.
Luckily, RL's CSSMin.php gives us a central place that all SVGs embedded via @embed pass through.

Event Timeline


This proposed optimization is similar to something I implemented in Parsoid's HTML5 serializer. In that case, we switch between single & double quotes for HTML attributes depending on whether the attribute value contains more single quotes or double quotes. This had a very significant impact on Parsoid HTML size, mainly because it has many JSON values embedded in attributes.
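
The quote-switching idea described here can be sketched roughly as follows. This is not Parsoid's actual code: serializeAttribute is a hypothetical helper, and the tie-breaking and escaping choices are assumptions made for the example.

```javascript
// Hypothetical helper: serialize one HTML attribute, picking whichever quote
// character occurs less often in the value so that fewer escapes are needed.
function serializeAttribute(name, value) {
  const singles = (value.match(/'/g) || []).length;
  const doubles = (value.match(/"/g) || []).length;
  // Assumption: prefer double quotes on a tie.
  const useSingle = doubles > singles;
  const quote = useSingle ? "'" : '"';
  const escaped = value
    .replace(/&/g, '&amp;')
    // Only the chosen delimiter has to be escaped inside the value.
    .replace(useSingle ? /'/g : /"/g, useSingle ? '&#39;' : '&quot;');
  return name + '=' + quote + escaped + quote;
}

// A JSON value full of double quotes is wrapped in single quotes and needs no escapes.
console.log(serializeAttribute('data-mw', '{"name":"ref"}'));
console.log(serializeAttribute('title', "it's"));
```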

Speaking of SVG optimization in general, we previously looked into this for math SVGs: T74547. Not sure if those more path focused optimizations are already applied across the board.

@GWicke We're already optimizing SVGs with SVGO in OOUI. MobileFrontend goes even a step further requiring SVGO optimization per script before Jenkins accepts it. But that's IMHO a different conversation.

Regarding the implementation, the article above also features the following JS function:

function encodeOptimizedSVGDataUri(svgString) {
  var uriPayload = encodeURIComponent(svgString) // encode URL-unsafe characters
    .replace(/%0A/g, '') // remove newlines
    .replace(/%20/g, ' ') // put spaces back in
    .replace(/%3D/g, '=') // ditto equals signs
    .replace(/%3A/g, ':') // ditto colons
    .replace(/%2F/g, '/') // ditto slashes
    .replace(/%22/g, "'"); // replace quotes with apostrophes (may break certain SVGs)

  return 'data:image/svg+xml,' + uriPayload;
}

// Possible improvements:
//   * Lowercase the hex-escapes for better gzipping
//   * Replace stuff like `fill="%23000"` with `fill="black"`
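
The first of these possible improvements could look roughly like the sketch below. lowercaseHexEscapes is a hypothetical name, and whether lowercasing actually helps gzip for a given icon is the article's suggestion, not something measured here. It relies on percent-encoded hex digits being case-insensitive (RFC 3986, section 2.1) and is meant to run on the output of encodeURIComponent, where every % starts an escape.

```javascript
// Hypothetical helper: lowercase the hex digits of percent-escapes only,
// leaving the rest of the payload (for example path data) untouched.
function lowercaseHexEscapes(payload) {
  return payload.replace(/%[0-9A-F]{2}/g, function (escape) {
    return escape.toLowerCase();
  });
}

console.log(lowercaseHexEscapes("%3Csvg viewBox='0 0 20 20'%3E%3C/svg%3E"));
```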

There's even a SASS alternative, which could probably be ported to LESS. My initial thought is MediaWiki-ResourceLoader as the place for such optimization though.
Given that even on a small, simple icon like the close icon above we reduce the gzipped size by ~12%, I think the savings could be significant enough to proceed.

Krinkle moved this task from Inbox to Backlog on the MediaWiki-ResourceLoader board.

I'm not sure why we want to replace the quotes with apostrophes inside the SVG file (which, as noted, "may break certain SVGs") when we could change the wrapper from quote to an apostrophe? Like this:

background-image: url('data:image/svg+xml,%3C?xml version="1.0" encoding="utf-8"?%3E %3Csvg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 20 20"%3E %3Cpath d="M3.636 2.222l14.142 14.142-1.414 1.414L2.222 3.636z"/%3E %3Cpath d="M17.778 3.636L3.636 17.778l-1.414-1.414L16.364 2.222z"/%3E %3C/svg%3E ');
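
A rough sketch of this wrapper-swapping variant, adapted from the article's function quoted earlier in this task. encodeForSingleQuotedUrl is a hypothetical name, and escaping any literal apostrophe as %27 is an assumption added so that the single-quoted wrapper cannot be terminated early.

```javascript
// Hypothetical helper: keep the SVG's own double quotes unencoded and wrap
// the whole data URI in single quotes instead.
function encodeForSingleQuotedUrl(svgString) {
  const payload = encodeURIComponent(svgString.replace(/\s+/g, ' ').trim())
    .replace(/%20/g, ' ') // put spaces back in
    .replace(/%3D/g, '=') // ditto equals signs
    .replace(/%3A/g, ':') // ditto colons
    .replace(/%2F/g, '/') // ditto slashes
    .replace(/%22/g, '"') // restore double quotes instead of swapping them
    .replace(/'/g, '%27'); // encodeURIComponent leaves ' alone, so escape it here
  return "url('data:image/svg+xml," + payload + "')";
}

console.log(encodeForSingleQuotedUrl('<svg aria-label="it\'s"/>'));
```

Later comments in this task report that the merged quote-encoding change was reverted because icons went missing on Edge and IE 11, so any variant of this would need careful browser testing.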

@matmarex As the article, and furthermore RFC 3986 in Appendix D.2, state:

but only ' is allowed in a URL.

Change 377820 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/core@master] Improve encoding of embedded SVGs

https://gerrit.wikimedia.org/r/377820

Change 377828 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/core@master] Improve encoding of quotes in embedded SVGs

https://gerrit.wikimedia.org/r/377828

Hello, I was asked by @Volker_E on Twitter for examples of SVGs that break when " is converted into '. They’re minor edge-cases so I didn't dilute the article with them, but it’s a fair concern. The good news: there’s only two.

  1. Attributes that contain ' in their values. In SVG, the only attributes I know that can are ones that can take URLs, and aria-label. The former is unlikely (I was surprised to learn they’re even legal in URLs!), and the latter shouldn’t happen inside CSS-only non-content SVGs.
  2. The SVG uses <text> containing an apostrophe somewhere. Also unlikely in CSS decorative SVG, and curly quotes are better typography-wise anyway.
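
The two edge cases above could be screened for before swapping quotes. The following is a rough, deliberately conservative sketch that assumes regex scanning is good enough for small decorative icons; quoteSwapIsRisky is a hypothetical name, and a real implementation would more likely use an XML parser.

```javascript
// Hypothetical helper: returns true when converting " to ' might break the SVG.
// It may report false positives (for example when an attribute value contains ">").
function quoteSwapIsRisky(svgString) {
  // Case 1: a double-quoted attribute value that contains an apostrophe.
  const attributeWithApostrophe = /=\s*"[^"]*'[^"]*"/.test(svgString);
  // Case 2: an apostrophe in text content, approximated as "outside any tag".
  const textOnly = svgString.replace(/<[^>]*>/g, '');
  const textWithApostrophe = textOnly.includes("'");
  return attributeWithApostrophe || textWithApostrophe;
}

console.log(quoteSwapIsRisky('<svg xmlns="http://www.w3.org/2000/svg"><path d="M1 2z"/></svg>'));
console.log(quoteSwapIsRisky('<svg aria-label="it\'s closed"/>'));
```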

Hope that helps!

@matmarex As the article, and furthermore RFC 3986 in Appendix D.2, state:

but only ' is allowed in a URL.

If you want to follow the RFC closely, then spaces must also be URL-encoded. But we want to unencode them so clearly we're disregarding the standards and doing what works ;) I don't see why spaces would be okay to unencode but quotes not.

The patch I submitted does what you (@matmarex) suggested to start with: unencode double quotes, wrap the URL in single quotes.

Unencoded " broke older versions of Android in my testing, mostly. Maybe they’re no longer common enough to worry about.

Here's a heavy-load test case that is nowhere near reality: we currently don't embed anything bigger than a few KB as data: URIs (a short search turned up nothing above 2 KB):
https://codepen.io/Volker_E/full/BwyQwo/

All test images (among others, the English Wikipedia logo at over 140 KB) load in IE 9/10 on Windows 7, IE 11 on Windows 8.1, Safari 5.1 on Windows 7, and a Nexus 7 on Android 4.4, both before and after:

image.png (738×1 px, 333 KB)

Example in IE 9

Change 378075 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/core@master] Mangle whitespace in embedded SVGs

https://gerrit.wikimedia.org/r/378075

Unencoded " broke older versions of Android in my testing, mostly. Maybe they’re no longer common enough to worry about.

We also have non-embedded fallbacks for browsers that don't support data URIs, so if our encoding (or lack thereof) breaks in some older or obscure browsers, that's not a big problem: they'll just fall back to loading the image over an https URL. (The fallback image is also a PNG instead of an SVG in most cases.) So they get a worse and less performant image, but it won't break.

Change 377820 merged by jenkins-bot:
[mediawiki/core@master] Improve encoding of embedded SVGs

https://gerrit.wikimedia.org/r/377820

Change 377828 merged by jenkins-bot:
[mediawiki/core@master] CSSMin: Improve encoding of quotes in embedded SVGs

https://gerrit.wikimedia.org/r/377828

Change 378075 merged by jenkins-bot:
[mediawiki/core@master] CSSMin: Mangle whitespace in embedded SVGs

https://gerrit.wikimedia.org/r/378075

Volker_E assigned this task to Catrope.
Volker_E removed a project: Patch-For-Review.
Volker_E removed a subscriber: gerritbot.

Change 377828 merged by jenkins-bot:
[mediawiki/core@master] CSSMin: Improve encoding of quotes in embedded SVGs

https://gerrit.wikimedia.org/r/377828

This was reverted in https://gerrit.wikimedia.org/r/381430 (bd370e658f87c9e3b574b44252ce7d13f1ea3633) due to T176884: Icons missing throughout UI on Edge, IE 11.

Change 378075 merged by jenkins-bot:
[mediawiki/core@master] CSSMin: Mangle whitespace in embedded SVGs

https://gerrit.wikimedia.org/r/378075

This was also reverted, but not because it was broken (as far as we know), only because it needed to be reverted to make the revert of the quotes patch apply cleanly. We can try reapplying the whitespace patch, but we should do that after the cut and test carefully in IE/Edge.

Change 403092 had a related patch set uploaded (by VolkerE; owner: VolkerE):
[mediawiki/core@master] Re-introduce CSSMin: Mangle whitespace in embedded SVGs

https://gerrit.wikimedia.org/r/403092

Change 403093 had a related patch set uploaded (by VolkerE; owner: VolkerE):
[mediawiki/core@master] CSSMin: Remove XML declaration from SVGs

https://gerrit.wikimedia.org/r/403093

Change 403092 merged by jenkins-bot:
[mediawiki/core@master] CSSMin: Re-introduce whitespace mangling in embedded SVGs

https://gerrit.wikimedia.org/r/403092

Change 403577 had a related patch set uploaded (by VolkerE; owner: VolkerE):
[mediawiki/core@master] Re-introduce CSSMin: Improve encoding of quotes in embedded SVGs

https://gerrit.wikimedia.org/r/403577

In my testing of 403577 everything works fine with IE 10-11 & Edge, but given its UI significance it would be preferable to have QA support by @Etonkovidova this time.

T175318 SVG CSSMinifier IE11 2018-01-10.png (541×824 px, 205 KB)

@Etonkovidova Thanks, but it's too early; this was meant as a heads-up. We are still going back and forth on the patches and will keep you posted.

Change 403093 merged by jenkins-bot:
[mediawiki/core@master] CSSMin: Remove XML declaration from SVGs

https://gerrit.wikimedia.org/r/403093

Going to consider this solved. We've implemented and deployed all but one of the planned optimisations. The last one is proving difficult and somewhat incompatible with free-form SVG syntax. Worth exploring later in a separate task perhaps.

Change 403577 abandoned by VolkerE:
CSSMin: Re-introduce improved quote encoding in embedded SVGs

https://gerrit.wikimedia.org/r/403577