
Create minified SVG output in thumbnail space to serve for <img>s
Open, Low, Public

Description

Source SVGs may contain extra white space and comments which are unnecessary data transfer for thumbnail usage. Consider minifying and serving those.

Event Timeline

brion renamed this task from "Create minimized" to "Create minified SVG output in thumbnail space to serve for <img>s". May 5 2016, 2:28 PM
brion updated the task description.
Restricted Application added subscribers: Steinsplitter, Matanya.

Does that involve only text minification or also converting to .svgz?

It is probably simplest to let the web server / proxy layer deal with gzipping, unless handling of .svgz has gotten a lot more consistent than I remember.

While doing this minification, it might be necessary to remove all languages other than the content language of the article the SVG is used in from [[ https://commons.wikimedia.org/wiki/Help:Translation_tutorial#Using_the_same_file | SVGs containing translations in <switch> elements ]]. Otherwise the browser might decide which language is best suited, resulting in undesired behaviour for visitors of e.g. public libraries. [Not extensively tested yet, but the lang attribute on the HTML element appears to be ignored in Firefox.]
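To make the idea concrete, here is a minimal sketch of pruning `<switch>` children down to one language before serving. It is only an illustration under assumptions: the function name `localize` is hypothetical, language matching is a plain exact match on the comma-separated `systemLanguage` list (real SVG matching also handles prefixes such as `en` vs. `en-US`), and the child without `systemLanguage` is treated as the fallback.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Keep the default SVG namespace unprefixed when serializing.
ET.register_namespace("", SVG_NS)


def localize(svg_text: str, lang: str) -> str:
    """Hypothetical helper: keep one child per <switch> for the given language."""
    root = ET.fromstring(svg_text)
    # Materialize the list first so removals do not disturb the traversal.
    for switch in list(root.iter(f"{{{SVG_NS}}}switch")):
        children = list(switch)

        def names_lang(child: ET.Element) -> bool:
            tags = child.get("systemLanguage", "").split(",")
            return lang in [tag.strip() for tag in tags]

        # Simplification: exact match only, no prefix matching.
        keep = next((c for c in children if names_lang(c)), None)
        if keep is None:
            # Fall back to the child that carries no systemLanguage attribute.
            keep = next((c for c in children if c.get("systemLanguage") is None), None)
        for child in children:
            if child is not keep:
                switch.remove(child)
    return ET.tostring(root, encoding="unicode")


sample = (
    '<svg xmlns="http://www.w3.org/2000/svg"><switch>'
    '<text systemLanguage="de">Hallo</text>'
    '<text systemLanguage="fr">Bonjour</text>'
    '<text>Hello</text>'
    '</switch></svg>'
)
german = localize(sample, "de")
fallback = localize(sample, "ja")
```

With the sample above, `german` retains only the German label and `fallback` retains only the unlabelled default, so the browser no longer has to choose a language itself.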

MarkTraceur moved this task from Untriaged to Tracking on the Multimedia board.

If we mean that a minimized SVG-version of the uploaded SVG should be saved: I would say a clear no. (If we talk about client-side-rendering that might be different.)

Comments and whitespace should imho not be removed to save some disk space, since they are essential for manual editing in a text editor or Commons:SVGEdit.js.

Inkscape and the WMF SVG-Translate-Tool are famous for bloated useless definitions. However, SVG minimizers (such as scour, svgcleaner [developed by @RazrFalcon] and especially svgo [recommended by @TheDJ]) have imho many bugs and are/were hardly maintained, see https://commons.wikimedia.org/wiki/Help:SVG#Tidying_up . Those optimizers can be used online (i.e. without installing), e.g. on https://svgworkaroundbot.toolforge.org/ , but imho this should not be done automatically (too dangerous).

More information about optimization is available e.g. at https://commons.wikimedia.org/wiki/User:JoKalliauer/Optimization.

So the question is: what do we mean by minimization? If it means removing indents and comments (what this bug report imho originally reported), that is imho inadvisable. If it means optimizing files (what the bug reporter might intend), that's very dangerous and should be done externally. If optimizing should really be done internally in MediaWiki, it should be an opt-in feature, but only with mostly "safe" options plus librsvg bug fixing (but that should imho be another bug report).

The goal here wouldn't be to save disk space, but to make network transfers faster. The SVG optimization wouldn't affect the original file accessible on the file page, but what would be included inline on an article (instead of the current png thumbnails). I'd say the removal of non-rendering components (whitespace and comments) would be minimum expected.
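As a rough illustration of that minimum, the sketch below strips XML comments and whitespace-only runs between tags. It is deliberately naive and rests on assumptions: the name `minify_svg` is hypothetical, and the regular expressions ignore CDATA sections, `xml:space="preserve"`, and whitespace between inline text elements, all of which a real implementation would have to respect.

```python
import re


def minify_svg(svg_text: str) -> str:
    """Naive sketch: drop XML comments and whitespace that sits only between tags."""
    # Remove <!-- ... --> comments, including multi-line ones.
    without_comments = re.sub(r"<!--.*?-->", "", svg_text, flags=re.DOTALL)
    # Collapse whitespace-only gaps between a closing '>' and the next '<'.
    return re.sub(r">\s+<", "><", without_comments).strip()


sample = """<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10">
  <!-- a red square -->
  <rect width="10" height="10" fill="red"/>
</svg>
"""
minified = minify_svg(sample)
```

The original upload stays untouched; only the copy served to the article would pass through a step like this.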

I would decline this task.

The description worries about "extra white space and comments which are unnecessary". There is no claim about how significant removing those items would be. In my experience, most SVG files are generated by applications and have few comments. Even files written by hand have few comments. Neither do I see white space as a significant issue; white space compresses very well.

Brion's T134490#2269270 points out that the server will compress the SVG before transferring it. White space will compress well. English comments will also compress well.

Rilke's T134490#2279887 raises systemLanguage issues, but that seems to be out of scope. WMF needs to figure out how it will serve SVG files (such as by localizing i18n SVG before passing it to the browser), but that is not about removing comments and white space. There is a USA map with state names in over 50 languages; I think the graphics are 200 kB and the translations are another 200 kB. Stripping irrelevant languages would cut the file in half, but that file is an exceptional case. Most diagrams will have a few labels rather than 50, and they will have a small set of languages.

Removing comments and white space is essentially about lossless compression. The removal would not change the rendering at all. Using tools such as svgo and SVGCompress is about lossy compression, and that is a dubious activity. Rounding coordinates to 6 digits is usually reasonable, but what if all those coordinates share a 4-digit offset? If the SVG is too big, then render it as a PNG and ship the PNG.
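A small worked example of the offset problem, assuming "6 digits" means 6 significant digits (the helper `round_sig` and the sample values are made up for illustration). Two points that are distinct near the origin collapse onto the same value once a 4-digit offset uses up most of the available digits:

```python
def round_sig(value: float, digits: int = 6) -> str:
    """Format a coordinate with the given number of significant digits."""
    return f"{value:.{digits}g}"


offset = 1234                      # hypothetical shared 4-digit offset
local = [0.56789, 0.56812]         # two nearby points without the offset
shifted = [offset + v for v in local]

rounded_local = [round_sig(v) for v in local]      # both values stay distinct
rounded_shifted = [round_sig(v) for v in shifted]  # both become "1234.57"
```

With the offset, only two digits remain for the fractional part, so detail below 0.01 units is lost.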

Optimizers can do lossless compression by removing or defaulting some attributes. Inkscape files are often bloated with over-specified attributes. However, such files have lots of redundancy and will compress well. There will be many repetitions of CSS properties with style attributes such as font-family:Liberation Sans;font-size:14.37872;font-weight:normal;. The metric should not be how much the SVG file size is reduced; it should be how much the compressed file is reduced.
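One way to apply that metric is to compare sizes after gzip rather than before. The sketch below uses synthetic data (200 made-up `<text>` elements repeating the style attribute quoted above) and Python's standard `gzip` module; the actual numbers will vary by file and by the server's compression settings.

```python
import gzip


def gzipped_size(text: str) -> int:
    """Size in bytes of the UTF-8 text after gzip compression."""
    return len(gzip.compress(text.encode("utf-8")))


style = 'style="font-family:Liberation Sans;font-size:14.37872;font-weight:normal;"'
# Synthetic "bloated" file: every label repeats the same style attribute.
body = "".join(f'<text x="0" y="{i}" {style}>label {i}</text>\n' for i in range(200))
bloated = f'<svg xmlns="http://www.w3.org/2000/svg">\n{body}</svg>'
# "Optimized" variant with the redundant attribute removed.
lean = bloated.replace(" " + style, "")

raw_saving = len(bloated) - len(lean)
gz_saving = gzipped_size(bloated) - gzipped_size(lean)
print(f"raw bytes saved: {raw_saving}, gzipped bytes saved: {gz_saving}")
```

Because the repeated attribute is highly redundant, the saving measured on the gzipped output is much smaller than the saving measured on the raw text.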

I do not see stripping comments and whitespace saving much network bandwidth. More radical optimizations should be done to the file directly rather than being something applied to the thumbs at the last minute.