
Optimize SVGs in data URIs
Closed, Resolved · Public

Description

Encoding SVG in data: URI as data:image/svg+xml,… (and embedding it in CSS with CSSMin) is what we're currently doing as it's more efficient for text.

An article on optimizing SVGs in data: URIs for smaller output describes several methods that we haven't taken advantage of so far:

  • Mangling whitespace where it doesn't break URI encoding
  • Removing the XML declaration, which isn't needed for a CSS background-image
  • Replacing double attribute quotes " with single ones '.

Exemplified on close icon:
Original file:

<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 20 20">
	<path d="M3.636 2.222l14.142 14.142-1.414 1.414L2.222 3.636z"/>
	<path d="M17.778 3.636L3.636 17.778l-1.414-1.414L16.364 2.222z"/>
</svg>

Old method with normal URI encoding currently in use:

background-image: url("data:image/svg+xml,%3C%3Fxml%20version%3D%221.0%22%20encoding%3D%22utf-8%22%3F%3E%0D%0A%3Csvg%20xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%22%20width%3D%2220%22%20height%3D%2220%22%20viewBox%3D%220%200%2020%2020%22%3E%0D%0A%09%3Cpath%20d%3D%22M3.636%202.222l14.142%2014.142-1.414%201.414L2.222%203.636z%22%2F%3E%0D%0A%09%3Cpath%20d%3D%22M17.778%203.636L3.636%2017.778l-1.414-1.414L16.364%202.222z%22%2F%3E%0D%0A%3C%2Fsvg%3E%0A");

Compression ratio after gzipping: 168 % Original size: 455 bytes Result size: 271 bytes

Alternative optimized method:

background-image: url("data:image/svg+xml,%3C?xml version='1.0' encoding='utf-8'?%3E %3Csvg xmlns='http://www.w3.org/2000/svg' width='20' height='20' viewBox='0 0 20 20'%3E %3Cpath d='M3.636 2.222l14.142 14.142-1.414 1.414L2.222 3.636z'/%3E %3Cpath d='M17.778 3.636L3.636 17.778l-1.414-1.414L16.364 2.222z'/%3E %3C/svg%3E");

Compression ratio: 137 % Original size: 325 bytes Result size: 237 bytes

Further optimized (stripping XML declaration not needed for background-images):

background-image: url("data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' width='20' height='20' viewBox='0 0 20 20'%3E %3Cpath d='M3.636 2.222l14.142 14.142-1.414 1.414L2.222 3.636z'/%3E %3Cpath d='M17.778 3.636L3.636 17.778l-1.414-1.414L16.364 2.222z'/%3E %3C/svg%3E");

Compression ratio: 137 % Original size: 281 bytes Result size: 205 bytes
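
The reported figures are consistent with the ratio being the uncompressed size divided by the gzipped size. A minimal sketch of that arithmetic, assuming this reading of "Compression ratio" (both function names are made up for illustration):

```javascript
// Ratio in percent: uncompressed size relative to gzipped size
// (assumed reading of "Compression ratio" in this task).
function compressionRatioPercent(originalBytes, gzippedBytes) {
  return Math.round((originalBytes / gzippedBytes) * 100);
}

// Relative saving between two gzipped sizes, in percent with one decimal.
function gzippedSavingPercent(beforeBytes, afterBytes) {
  return Number(((1 - afterBytes / beforeBytes) * 100).toFixed(1));
}

// Figures quoted in the task description.
console.log(compressionRatioPercent(455, 271)); // 168
console.log(compressionRatioPercent(325, 237)); // 137
console.log(compressionRatioPercent(281, 205)); // 137
console.log(gzippedSavingPercent(271, 237));    // 12.5
console.log(gzippedSavingPercent(271, 205));    // 24.4
```

Going by these numbers, the gzipped result shrinks by roughly an eighth with the alternative encoding alone, and by roughly a quarter once the XML declaration is dropped as well.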

From a data-over-the-wire perspective, this option is well worth exploring; a nice side effect would be more readable SVG data URIs.
Luckily, RL's CSSMin.php gives us a central place that all SVGs embedded via @embed pass through.
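
As a rough illustration of what such a central step could do before URI-encoding, here is a sketch in JavaScript. CSSMin itself is PHP, and `preprocessSvg` is a hypothetical name, not part of CSSMin. The sketch assumes whitespace is not significant anywhere in the SVG, which may not hold for `<text>` content:

```javascript
// Hypothetical pre-processing step, not CSSMin's actual code.
function preprocessSvg(svgString) {
  return svgString
    // Drop a leading XML declaration such as <?xml version="1.0" encoding="UTF-8"?>
    .replace(/^\s*<\?xml[^>]*\?>\s*/, '')
    // Collapse every run of whitespace (newlines, tabs, spaces) into one space
    .replace(/\s+/g, ' ')
    .trim();
}

const closeIconLike =
  '<?xml version="1.0" encoding="UTF-8"?>\n' +
  '<svg xmlns="http://www.w3.org/2000/svg">\n' +
  '\t<path d="M0 0l1 1"/>\n' +
  '</svg>\n';

console.log(preprocessSvg(closeIconLike));
// <svg xmlns="http://www.w3.org/2000/svg"> <path d="M0 0l1 1"/> </svg>
```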

Event Timeline

Restricted Application added a subscriber: Aklapper. · Sep 7 2017, 11:46 PM
Suhadakashter closed this task as a duplicate of T175367: Page wikipedia.
Reedy reopened this task as Open. · Sep 8 2017, 2:09 PM

This proposed optimization is similar to something I implemented in Parsoid's HTML5 serializer. In that case, we switch between single & double quotes for HTML attributes depending on whether the attribute value contains more single quotes or double quotes. This had a very significant impact on Parsoid HTML size, mainly because it has many JSON values embedded in attributes.
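
The quote-switching idea can be sketched as follows. This is not Parsoid's actual code: `serializeAttribute` is a hypothetical helper, and the escaping is deliberately minimal.

```javascript
// Pick the attribute delimiter that needs fewer escapes in the value.
function serializeAttribute(name, value) {
  const doubles = (value.match(/"/g) || []).length;
  const singles = (value.match(/'/g) || []).length;
  // Ampersands must be escaped regardless of the delimiter chosen.
  const escaped = value.replace(/&/g, '&amp;');
  if (doubles > singles) {
    // More double quotes than single quotes: wrap in ' and escape only '.
    return name + "='" + escaped.replace(/'/g, '&#39;') + "'";
  }
  // Otherwise wrap in " and escape only ".
  return name + '="' + escaped.replace(/"/g, '&quot;') + '"';
}

console.log(serializeAttribute('data-mw', '{"a":"b"}')); // data-mw='{"a":"b"}'
console.log(serializeAttribute('title', "it's"));        // title="it's"
```

A JSON-heavy value keeps its double quotes verbatim instead of turning each one into `&quot;`, which is where the size saving would come from.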

Speaking of SVG optimization in general, we previously looked into this for math SVGs: T74547. Not sure if those more path focused optimizations are already applied across the board.

@GWicke We're already optimizing SVGs with SVGO in OOUI. MobileFrontend goes a step further and enforces SVGO optimization via a script before Jenkins accepts a change. But that's IMHO a different conversation.

Regarding the implementation, the article above also features the following JS function:

function encodeOptimizedSVGDataUri(svgString) {
  var uriPayload = encodeURIComponent(svgString) // encode URL-unsafe characters
    .replace(/%0A/g, '') // remove newlines
    .replace(/%20/g, ' ') // put spaces back in
    .replace(/%3D/g, '=') // ditto equals signs
    .replace(/%3A/g, ':') // ditto colons
    .replace(/%2F/g, '/') // ditto slashes
    .replace(/%22/g, "'"); // replace quotes with apostrophes (may break certain SVGs)

  return 'data:image/svg+xml,' + uriPayload;
}

// Possible improvements:
//   * Lowercase the hex-escapes for better gzipping
//   * Replace stuff like `fill="%23000"` with `fill="black"`
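
The first of the possible improvements is small enough to sketch. Percent-escapes are case-insensitive, so lowercasing them does not change what the URI decodes to; whether it actually helps gzip is the article's claim and is not measured here. `lowercaseHexEscapes` is a hypothetical helper name:

```javascript
// Lowercase the two hex digits of every percent-escape, e.g. %3C -> %3c.
function lowercaseHexEscapes(uriPayload) {
  return uriPayload.replace(/%[0-9A-Fa-f]{2}/g, function (escape) {
    return escape.toLowerCase();
  });
}

console.log(lowercaseHexEscapes("%3Csvg width='20'%3E")); // %3csvg width='20'%3e
```

It would be applied to `uriPayload` after the chain of `replace` calls, since those patterns match the uppercase escapes that `encodeURIComponent` produces.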

There's even a SASS alternative, which could probably be ported to LESS. My initial thought is MediaWiki-ResourceLoader as the place for such optimization though.
Given that even on a small and simple icon like the close icon above we save ~12% in size after gzipping, I think the savings could be significant enough to proceed.

Krinkle triaged this task as Low priority. · Sep 12 2017, 9:23 AM
Krinkle moved this task from Inbox to Backlog on the MediaWiki-ResourceLoader board.
matmarex added a comment. · Edited · Sep 13 2017, 6:23 PM

I'm not sure why we want to replace the quotes with apostrophes inside the SVG file (which, as noted, "may break certain SVGs") when we could change the wrapper from quote to an apostrophe? Like this:

background-image: url('data:image/svg+xml,%3C?xml version="1.0" encoding="utf-8"?%3E %3Csvg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 20 20"%3E %3Cpath d="M3.636 2.222l14.142 14.142-1.414 1.414L2.222 3.636z"/%3E %3Cpath d="M17.778 3.636L3.636 17.778l-1.414-1.414L16.364 2.222z"/%3E %3C/svg%3E ');
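
A sketch of this variant, adapted from the article's function quoted earlier in this task: double quotes stay unencoded and the CSS string is wrapped in apostrophes. `toSingleQuotedCssUrl` is a hypothetical name. Note that a change along these lines was later reverted in this task because icons went missing in Edge and IE 11, so treat it as an illustration of the idea only.

```javascript
function toSingleQuotedCssUrl(svgString) {
  const payload = encodeURIComponent(svgString)
    .replace(/%20/g, ' ')  // put spaces back in
    .replace(/%3D/g, '=')  // ditto equals signs
    .replace(/%3A/g, ':')  // ditto colons
    .replace(/%2F/g, '/')  // ditto slashes
    .replace(/%22/g, '"')  // leave double quotes unencoded
    // encodeURIComponent does not escape apostrophes; escape them here
    // so they cannot terminate the single-quoted CSS string.
    .replace(/'/g, '%27');
  return "url('data:image/svg+xml," + payload + "')";
}

console.log(toSingleQuotedCssUrl('<svg width="20"/>'));
// url('data:image/svg+xml,%3Csvg width="20"/%3E')
```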
Volker_E added a comment. · Edited · Sep 13 2017, 6:44 PM

@matmarex As the article, and furthermore RFC 3986 in Appendix D.2, states:

but only ' is allowed in a URL.

Change 377820 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/core@master] Improve encoding of embedded SVGs

https://gerrit.wikimedia.org/r/377820

Change 377828 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/core@master] Improve encoding of quotes in embedded SVGs

https://gerrit.wikimedia.org/r/377828

Volker_E updated the task description. · Sep 13 2017, 7:32 PM
Tigt added a subscriber: Tigt. · Sep 13 2017, 7:37 PM

Hello, I was asked by @Volker_E on Twitter for examples of SVGs that break when " is converted into '. They’re minor edge-cases so I didn't dilute the article with them, but it’s a fair concern. The good news: there’s only two.

  1. Attributes that contain ' in their values. In SVG, the only attributes I know that can are ones that can take URLs, and aria-label. The former is unlikely (I was surprised to learn they’re even legal in URLs!), and the latter shouldn’t happen inside CSS-only non-content SVGs.
  2. The SVG uses <text> containing an apostrophe somewhere. Also unlikely in CSS decorative SVG, and curly quotes are better typography-wise anyway.

Hope that helps!
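
Both edge cases above come down to an apostrophe already being present somewhere in the SVG source. A conservative guard could therefore look like the sketch below; the function names are hypothetical, and the check is stricter than necessary because it does not distinguish attribute values from text content.

```javascript
// True when the SVG source contains no apostrophe at all, so turning every
// " into ' cannot collide with an existing one.
function isSafeToSwapQuotes(svgString) {
  return !svgString.includes("'");
}

// Swap quotes only when the guard passes; otherwise return the input unchanged.
function swapQuotesIfSafe(svgString) {
  return isSafeToSwapQuotes(svgString)
    ? svgString.replace(/"/g, "'")
    : svgString;
}

console.log(swapQuotesIfSafe('<svg width="20"/>'));        // <svg width='20'/>
console.log(swapQuotesIfSafe('<text x="0">it\'s</text>')); // unchanged
```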

@matmarex As the article, and furthermore RFC 3986 in Appendix D.2, states:

but only ' is allowed in a URL.

If you want to follow the RFC closely, then spaces must also be URL-encoded. But we want to unencode them so clearly we're disregarding the standards and doing what works ;) I don't see why spaces would be okay to unencode but quotes not.

The patch I submitted does what you (@matmarex) suggested to start with: unencode double quotes, wrap the URL in single quotes.

Tigt added a comment. · Sep 13 2017, 8:21 PM

Unencoded " broke older versions of Android in my testing, mostly. Maybe they’re no longer common enough to worry about.

Volker_E added a comment. · Edited · Sep 13 2017, 11:21 PM

Here's a heavy-load test case that is nowhere near reality: we currently don't embed anything larger than a few KB as data: URIs (a short search turned up nothing above 2 KB):
https://codepen.io/Volker_E/full/BwyQwo/

All test images (among others, the English Wikipedia logo at over 140 KB) load in IE 9/10 on Windows 7, IE 11 on Windows 8.1, Safari 5.1 on Windows 7, and a Nexus 7 on Android 4.4, both before and after:


[Screenshot: Example in IE 9]

Change 378075 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/core@master] Mangle whitespace in embedded SVGs

https://gerrit.wikimedia.org/r/378075

Unencoded " broke older versions of Android in my testing, mostly. Maybe they’re no longer common enough to worry about.

We also have non-embedded fallbacks for browsers that don't support data URIs, so if our encoding (or lack thereof) breaks in some older or obscure browsers, that's not a big problem: they'll just fall back to loading the image over an https URL. (The fallback image is also a PNG instead of an SVG in most cases.) So they get a worse and less performant image, but it won't break.

Change 377820 merged by jenkins-bot:
[mediawiki/core@master] Improve encoding of embedded SVGs

https://gerrit.wikimedia.org/r/377820

Change 377828 merged by jenkins-bot:
[mediawiki/core@master] CSSMin: Improve encoding of quotes in embedded SVGs

https://gerrit.wikimedia.org/r/377828

Change 378075 merged by jenkins-bot:
[mediawiki/core@master] CSSMin: Mangle whitespace in embedded SVGs

https://gerrit.wikimedia.org/r/378075

Volker_E closed this task as Resolved. · Sep 19 2017, 9:58 PM
Volker_E assigned this task to Catrope.
Volker_E removed a project: Patch-For-Review.
Volker_E removed a subscriber: gerritbot.
Jdforrester-WMF reopened this task as Open. · Sep 29 2017, 5:21 PM

Change 377828 merged by jenkins-bot:
[mediawiki/core@master] CSSMin: Improve encoding of quotes in embedded SVGs
https://gerrit.wikimedia.org/r/377828

This was reverted in https://gerrit.wikimedia.org/r/381430 (bd370e658f87c9e3b574b44252ce7d13f1ea3633) due to T176884: Icons missing throughout UI on Edge, IE 11.

Change 378075 merged by jenkins-bot:
[mediawiki/core@master] CSSMin: Mangle whitespace in embedded SVGs
https://gerrit.wikimedia.org/r/378075

This was also reverted, but not because it was broken (as far as we know), only because it needed to be reverted to make the revert of the quotes patch apply cleanly. We can try reapplying the whitespace patch, but we should do that after the cut and test carefully in IE/Edge.

Volker_E updated the task description. · Dec 19 2017, 4:09 PM

Change 403092 had a related patch set uploaded (by VolkerE; owner: VolkerE):
[mediawiki/core@master] Re-introduce CSSMin: Mangle whitespace in embedded SVGs

https://gerrit.wikimedia.org/r/403092

Change 403093 had a related patch set uploaded (by VolkerE; owner: VolkerE):
[mediawiki/core@master] CSSMin: Remove XML declaration from SVGs

https://gerrit.wikimedia.org/r/403093

Change 403092 merged by jenkins-bot:
[mediawiki/core@master] CSSMin: Re-introduce whitespace mangling in embedded SVGs

https://gerrit.wikimedia.org/r/403092

Volker_E updated the task description. · Jan 9 2018, 6:33 PM
Volker_E updated the task description. · Jan 11 2018, 2:35 AM

Change 403577 had a related patch set uploaded (by VolkerE; owner: VolkerE):
[mediawiki/core@master] Re-introduce CSSMin: Improve encoding of quotes in embedded SVGs

https://gerrit.wikimedia.org/r/403577

Volker_E added a subscriber: Etonkovidova. · Edited · Jan 11 2018, 3:37 AM

In my testing of 403577 everything works fine with IE 10-11 & Edge, but given its UI significance it would be preferable to have QA support by @Etonkovidova this time.

@Volker_E I will check it tomorrow.

Volker_E added a comment. · Edited · Jan 12 2018, 1:50 AM

@Etonkovidova Thanks, but it's too early; that was meant as a heads-up. We are still going back and forth on the patches and will keep you posted.

Change 403093 merged by jenkins-bot:
[mediawiki/core@master] CSSMin: Remove XML declaration from SVGs

https://gerrit.wikimedia.org/r/403093

Volker_E updated the task description. · Jan 14 2018, 1:58 AM
Volker_E updated the task description. · Jan 14 2018, 2:10 AM
Krinkle closed this task as Resolved. · Sep 19 2019, 6:37 PM

Going to consider this solved. We've implemented and deployed all but one of the planned optimisations. The last one is proving difficult and somewhat incompatible with free-form SVG syntax. Worth exploring later in a separate task perhaps.

Krinkle moved this task from Untriaged to Archive on the Performance-Team-publish board.

Change 403577 abandoned by VolkerE:
CSSMin: Re-introduce improved quote encoding in embedded SVGs

https://gerrit.wikimedia.org/r/403577