SVG client side rendering for specific SVGs
Open, Needs Triage, Public

Description

In the Community_Wishlist_Survey_2019 it was suggested to support client-side rendering of SVGs (T5593). But since that might lead to non-uniform rendering of a few SVGs, or in some cases increase the file size, this should only be done if specified by the user.

One possible solution might be to support a feature like [[File:Filename.svg|thumb|vector]].

  • Although browser support is much better now, it might occur in a few cases that a file is rendered differently on the client side. Even though different rendering occurs less often than a librsvg bug, it is difficult to know all the possible different renderings before you embed a file as SVG. Therefore it might be necessary to define SVG features that are not allowed for client-side rendering (T134410); e.g. the language switch (T60920) might need a workaround that serves several SVGs (similar to the several PNGs served now).
  • The technical possibility for direct SVG output might be srcset (T134455).
  • For performance reasons it will be useful to define a maximum SVG size for client-side rendering.
  • To avoid extensive usage of this function, it might be useful to set a flag to allow client-side rendering, but this should only be done by e.g. patrollers. (T5593#1684608)
  • Also, users might have the choice to set in their preferences whether they prefer PNGs or SVGs (T5593#958597)

Event Timeline

In Internet Explorer 9 to 11, SVG scaling only works properly if the SVG has a properly-defined viewBox attribute; without this, the SVG is rendered at its native size and clipped to the dimensions of the img element. So at minimum, this feature would need to ensure that SVG files exposed in this way have a viewBox attribute, and maybe add one if it's not present.
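The "add one if it's not present" step could look roughly like the sketch below. The function name `ensure_viewbox` is hypothetical, and the sketch assumes the root width and height are plain numbers (optionally suffixed with "px"); anything else, such as percentages or other units, is left untouched rather than guessed at.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def ensure_viewbox(svg_text: str) -> str:
    """Return SVG text whose root element carries a viewBox where one can be derived.

    The input is returned unchanged if a viewBox already exists, or if
    width/height are missing or are not plain numbers / "px" values.
    """
    # Serialise with a default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_text)
    if root.get("viewBox") is not None:
        return svg_text
    try:
        width = float(root.get("width", "").removesuffix("px"))
        height = float(root.get("height", "").removesuffix("px"))
    except ValueError:
        # Percentages, other units, or missing attributes: do not guess.
        return svg_text
    root.set("viewBox", f"0 0 {width:g} {height:g}")
    return ET.tostring(root, encoding="unicode")
```

Re-serialising through ElementTree drops comments and may reorder namespace declarations, so a production fix would probably want a less invasive rewrite.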

A better option would be to always serve small (say less than 20 kB) SVG files.

At this point in time, SVG is a common standard for displaying images on the Web. It has good support in most browsers. That support is often better than the support offered by librsvg, which does not support textPath or vertical Chinese or correctly implement BiDi. Chrome and Firefox do a better job than librsvg.

In the past, there may have been poor support for SVG, but that is not the case today. Most browsers have reasonable support for SVG. Yes, there will be some issues, but MW should serve SVG by default whenever reasonable.

Even support for switch-translated files has improved recently. Both Firefox and Google have fixed case-sensitive langtag bugs, so their handling of switch is now better than the librsvg used by MW: both correctly distinguish "zh-Hans" and "zh-Hant", but MW's librsvg cannot. Edge works correctly if the systemLanguage attribute has only one langtag; I do not know what IE does. Some SVG files on Commons will not display reasonably, but that problem is flawed SVG that should be fixed anyway. Many SVG files do not sort the langtags, so multilingual browsers may display a mix of languages.

SVG can also do animations. I posted an animation the other day: https://commons.wikimedia.org/wiki/File:DIN_69893_hsk_63a.svg If SVG were served, then such animations would work for Chrome, Firefox, and other browsers. Chrome and Firefox are 2/3 of the browser market share. https://caniuse.com/#search=SMIL says only IE, Edge, and Opera Mini do not support SVG animation. (Google has considered dropping SMIL. CSS animation, even 3D CSS animations, are now possible, but they make me queasy.) IE and Edge are only 5% market share, and they would still display the SVG -- they just would not animate it, which is the current state of affairs anyway. Currently, I think the only live animations on MW are GIFs.

There are significant problems with serving SVG.

Yes, x, y, width, height, and viewBox are an issue. Many files on Commons are broken and only work because librsvg happens to resolve conflicts a certain way. A bot can enforce librsvg semantics on the attributes.

Font support is an issue. Many files on Commons use fonts that are not on all user agents. Those files should be using generic font fallbacks, but just adding font fallbacks is not enough. The files may need their anchor points changed or even additional font specifications (Frutiger 75 is a bold font; Arial Black is a heavy font). The issue is really a problem with the SVGs. They are often tweaked to work with librsvg but not with other user agents. WM could probably live with the upset; it's just a signal that some files need to be fixed. The upset means some text looks a little odd, but for the most part the image is still readable and usable.

File size is an issue. Commons has many beautiful images that are just too damn large. Inkscape is notorious for producing bloated SVG files that specify the entire graphics state on each and every element. Some of that can be trimmed automatically (even by just using a different output option in Inkscape). Some file size issues would be fixed by serving SVG. According to caniuse, all user agents can do textPath. https://caniuse.com/#search=textPath That is, all user agents except librsvg. When a Commons graphics artist wants to put text on a path (for example, to put the name of a meandering river on a map), the text must be converted to curves (most tools) or the letters must be placed individually (I think Adobe Illustrator can do this). Either approach takes more bytes. If the SVG were served, the images could use the more efficient (and easily translatable) textPath element. Even if a file does not convert text to curves, it may be too bloated to serve directly. Some illustrations have too much detail in them to be a small size. Other illustrations may have 400 kB of translation information when the user is only going to see 4 kB of text. Unless gzip can perform miracles on those files, it would take less bandwidth to serve PNG renderings.

I'm not a server wonk, but I suspect that WMF servers do not want data bandwidths to surge. Given a choice between serving a 15 kB PNG thumbnail or a 200 kB SVG, WMF would probably go for the 15 kB PNG. Many 200 kB SVGs may compress well because they have common strings (e.g., stroke-width:1;stroke-miterlimit:4), but I do not know how well.
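One way to get a feel for the "how well" question is simply to measure it. The sketch below builds a synthetic file that repeats the style string quoted above on every path and reports the gzip ratio. The file is an invented stand-in, not a real Commons upload, so the number it prints only shows that repeated strings compress; real files would need to be measured individually.

```python
import gzip


def gzip_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller means better compression)."""
    return len(gzip.compress(data)) / len(data)


# Synthetic stand-in for an Inkscape-style file: the same style string on every path.
style = "stroke-width:1;stroke-miterlimit:4"
paths = "".join(
    f'<path style="{style}" d="M{i} {i}L{i + 1} {i + 2}"/>' for i in range(2000)
)
svg = f'<svg xmlns="http://www.w3.org/2000/svg">{paths}</svg>'.encode("utf-8")

ratio = gzip_ratio(svg)
print(f"{len(svg)} bytes raw, gzip ratio {ratio:.2f}")
```

Running the same `gzip_ratio` over a sample of actual originals and their PNG thumbnails would give the comparison that matters for bandwidth.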

MW may soon have a crisis with librsvg. There have been bug fixes that MW has not incorporated yet because Gnome has converted the source to Rust. It is possible that the new versions of librsvg will no longer work with many switch-translated files. Bug fixes to librsvg have also taken a long time; there was a period when librsvg lay dormant. If MW continues to convert SVG to PNG, then it is susceptible to the issues with that converter. Already, Commons has too much focus on getting an SVG file to work on librsvg rather than making a good, all-round SVG file. We are changing fonts to those only available on Commons, and we are scaling images to avoid a librsvg small font problem. Commons:Commons:Commons SVG Checker is about problems with librsvg rendering.

There are not many suitable replacements for librsvg. Most options appear to be substantially slower. An option is on the horizon, but it does not have the required feature set yet. Serving SVG would get around many issues.

Yes, adding a parameter and value such as file=svg might be expedient, but I do not think it is a good option in the long run. I do not want to add such a parameter to 1 million en.WP articles that include a small SVG. Neither do I want an uprising of editors complaining about enormous activity on their watchlists.

A better option would be to flag the SVG file on Commons with a serve-directly property. That way, if a file is included in 100 articles, we do not have to edit all 100 articles. Set one flag, and the rest is magic. And if we need to back out the choice to serve directly, then we do not have to re-edit 100 articles. The choice to serve logically goes with the file rather than the inclusion.

But I do not like that option either. I do not want to edit 500,000 SVG files just to set a flag.

The better option is to just include any small SVG file. No article or file editing is required. If the SVG file is less than x bytes, then it gets served rather than converted to PNG. There can be some trouble. Somebody might suddenly bloat an SVG file to 10 MB, and then all those articles that expected to download a small SVG file suddenly choke on a 10 MB monstrosity. But that misbehavior would happen with the file=svg flag, too. The thumbnail server could just refuse to serve large SVG files.
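The rule is small enough to sketch. Both the constant and the function name below are hypothetical; the 20 kB figure is only the "say less than 20 kB" suggestion from earlier in the thread. The point of the sketch is that the decision takes the current file size as input, so a file that is later bloated falls back to PNG without any article being edited.

```python
# Hypothetical cutoff, taken from the "say less than 20 kB" suggestion in the thread.
MAX_NATIVE_SVG_BYTES = 20 * 1024


def choose_format(svg_size_bytes: int, limit: int = MAX_NATIVE_SVG_BYTES) -> str:
    """Return "svg" to serve the original, or "png" to serve a rasterised thumbnail."""
    return "svg" if svg_size_bytes < limit else "png"


# The same file before and after somebody bloats it.
before = choose_format(4_000)
after = choose_format(10 * 1024 * 1024)
print(before, after)
```

Where this check runs (at parse time or in the thumbnail server) determines how quickly a page picks up the changed decision.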

What about another way: what about treating a couple of tags out of the SVG set to create HTML5 content out of wiki code? Enhancing the wiki code syntax would allow small, efficient inline SVG that is completely safe, and would enhance the capabilities for drawing genealogical trees with lines instead of table borders, or for drawing hieroglyphs the way they appear on real-life samples. Charts could show interrelations with links to corresponding pages. This would be a great deal and an extensible step in Wikimedia development.

A better option would be to always serve small (say less than 20 kB) SVG files.
[..]
The better option is to just include any small SVG file. No article or file editing is required. If the SVG file is less than x bytes, then it gets served rather than converted to PNG. There can be some trouble. Somebody might suddenly bloat an SVG file to 10 MB, and then all those articles that expected to download a small SVG file suddenly choke on a 10 MB monstrosity.

Wouldn't the server access the PNG instead then? Simple rule - simple solution!

Change 921379 had a related patch set uploaded (by TheDJ; author: TheDJ):

[mediawiki/core@master] Add option to allow SVGs to be rendered clientside

https://gerrit.wikimedia.org/r/921379

I've given this another go.
The SVG landscape has changed quite a bit: browser support is much more solid, and because of that, fallback to a PNG has become less important. I also think that this feature has become very desirable for 3rd party wikis, which is another reason to provide it within MediaWiki.

I think the approach I have chosen is taking care of the biggest pitfalls for SVG.

  1. SVGs can potentially be too big/complex in comparison to a PNG of the rendered contents. I've gone with a very simple byte cutoff. Is it ideal? Can PNGs for some of these still be smaller than the originals? Of course this is possible, but this approach will dance the 80/20 line quite well I suspect.
  2. SVGs can be translatable, but the browsers don't support this. Simply use PNG whenever an SVG has content in more than 1 language code.
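The two checks in this list can be sketched together as follows. This is an illustration only, not the code from the patch: the function names and the `max_bytes` default are placeholders, and the sketch assumes that "content in more than 1 language code" can be detected by collecting the language tags named in systemLanguage attributes.

```python
import xml.etree.ElementTree as ET


def language_codes(svg_text: str) -> set[str]:
    """Collect every language tag named in a systemLanguage attribute."""
    codes = set()
    for element in ET.fromstring(svg_text).iter():
        value = element.get("systemLanguage")
        if value:
            # The attribute holds a comma-separated list of language tags.
            codes.update(tag.strip().lower() for tag in value.split(",") if tag.strip())
    return codes


def use_native_svg(svg_text: str, max_bytes: int = 50_000) -> bool:
    """True when the file is within the byte cutoff and is not multilingual."""
    small_enough = len(svg_text.encode("utf-8")) <= max_bytes
    return small_enough and len(language_codes(svg_text)) <= 1
```

Lower-casing the tags is a simplification that treats "zh-Hans" and "zh-hans" as one language; whether the real check should be that lenient is a separate question.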

This leaves the problems of:

  1. Render consistency across browsers. I think this has become much more stable and predictable over the last couple of years, and is less of a problem than it used to be.
  2. Font support on the client side. I think most of the time this isn't the biggest problem, but for very specific SVGs it might be. One potential solution might be introducing a keyword for transclusions, allowing you to define whether you want the SVG or a PNG (like the lossy and lossless keywords).
  3. Security. Serving up SVGs raw might increase the security problems. However:
    • We do scanning upon upload
    • These will be served as <img> elements and thus have to comply with all the img policies of browsers, so if there is a problem with SVG, then browser vendors all have this problem
    • We already serve raw SVG originals if you click a link on the File page

I think for that reason that, especially for 3rd party MediaWikis, this patch is useful. For WMF, some additional work might be required.

Solution for the second problem: if we provide a single font set (e.g. FreeSans and FreeSerif, or Liberation Sans, Liberation Serif and Liberation Mono) hosted on Wikimedia, we can expect SVGs to use these fonts and hence render the same on all clients.

Scanning on upload might additionally include SVG language restrictions and font restrictions. So there is no reason not to provide client-side rendering of unproblematic SVGs only.

We do scanning upon upload

Unfortunately, this is probably not sufficient. We know there are originals on Commons that predate the scanner that contain JavaScript.

The approach is reasonable.

I have briefly scanned some of the code, but I'm not an expert.

The specification for img does not run script elements, so even old uploads should be safe. I believe WMF may rely on the img specification for those old uploads. That protection may also work for disallowed, malicious, DTDs.

The img element does run animations, so SVG animations may induce seizures: T85838. (Please fix comment on line 191 of SVGImageHandler.php.)

If an SVG should not be served, then adding a switch with two systemLanguage clauses would force using the PNG.

What happens if a wiki page is built when an SVG is 20 kB but then the SVG is replaced with a 5 MB version? My impression is the 5 MB SVG gets served. That may be unacceptable for mobile users. The thumbnailer might be a better place to implement the feature.

I see most font problems as weaknesses in the SVG source. Authors should be using generic fallbacks and leaving plenty of space. In addition, client-side rendering means that textLength would work. The real problem is when unusual fonts (e.g., Siddham script support) or precise font-metric matches are needed (e.g., Unicode music fonts do not have consistent metrics).

Multilingual SVG is a thorny problem.

We do scanning upon upload

Unfortunately, this is probably not sufficient. We know there are originals on Commons that predate the scanner that contain JavaScript.

We're not inlining them though. We are using them as <img> elements, and <img> elements don't run SVG JavaScript and do not load external resources:
https://developer.mozilla.org/en-US/docs/Web/SVG/SVG_as_an_Image
Only opening them as the original images or as <embed>'ed elements should really do that.

It can be argued that by having the images as thumbnail transclusions, SVGs become more visible and thus the likelihood of people making their way to the original increases, for instance by right-clicking and choosing "open image in new tab".
I'm pretty sure that the security team would have to do a major review and risk assessment before anything like this can be enabled for WMF, which might include looking at the history of browsers and their SVG security history.

At the same time, I think this is an option that should at least be available to 3rd parties.

The img element does run animations, so SVG animations may induce seizures: T85838. (Please fix comment on line 191 of SVGImageHandler.php.)

Yeah we don't really have something for this, but if it is a problem, we can simply exclude animated SVGs from this condition.

What happens if a wiki page is built when an SVG is 20 kB but then the SVG is replaced with a 5 MB version? My impression is the 5 MB SVG gets served. That may be unacceptable for mobile users. The thumbnailer might be a better place to implement the feature.

This is something I had not thought about yet. I think it would be pretty rare, and pages will get purged (because for instance the aspect ratio of images would change, so the cache has to be invalidated). There would however be a small period where this could be a problem, and I'll see if I can think of some mitigation for that...

At the same time, it is possible right now to force 20 MB onto users: just include 200 thumbnails of 100 kB each. Our solution for that is to simply revert it. So as the purge would correct this within a reasonable timeframe, I think it might not be a blocking problem.

Multilingual SVG is a thorny problem.

It sure is.

We've had that already. If we have a rule to support client-side rendering for small SVGs only, and a small SVG is replaced by a big one, then obviously the SVG is no longer served and its PNG representation is served instead. I'd appreciate a warning at upload in that case.

And about multilingual: I can't see any necessity for them.

Change 921379 merged by jenkins-bot:

[mediawiki/core@master] Add option to allow SVGs to be rendered clientside

https://gerrit.wikimedia.org/r/921379

Reminder that I have to document this on mediawiki.org

It appears that the default value of $wgSVGNativeRendering is set to the string 'false'.
Therefore, I have written it as such on MediaWiki.org.
However, wouldn't it be more appropriate to set the default value as the boolean false instead?
https://www.mediawiki.org/wiki/Manual:$wgSVGNativeRendering

As far as I understand, this is not a boolean value but should default to something like 'partial' in the future. (Want it asap!)

Change 923729 had a related patch set uploaded (by TheDJ; author: TheDJ):

[mediawiki/core@master] wgSVGNativeRendering default should be false, not 'false'

https://gerrit.wikimedia.org/r/923729

It appears that the default value of $wgSVGNativeRendering is set to the string 'false'.

Whoops, that's an oversight. Apparently I had never pushed the fix for that from my local branch. Thank you for spotting that.

Change 923729 merged by jenkins-bot:

[mediawiki/core@master] wgSVGNativeRendering default should be false, not 'false'

https://gerrit.wikimedia.org/r/923729

Adding hackathon label to note that this was moved forward during the hackathon.

@Arlolra I was thinking about perhaps reusing the lossless/lossy keywords for the SVG handler in order to allow people to choose between the plain SVG and the PNG version. Would that require any Parsoid adaptations, considering they already exist for the paged TIFF handler, but not yet in core?

https://www.mediawiki.org/wiki/Extension:PagedTiffHandler

Possible values for jpg: '1', 'true' and 'lossy'. Possible values for png: '0', 'false' and 'lossless'.
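As an illustration of the value list just quoted, a handler could map the parameter to a boolean roughly as below. This is a sketch, not PagedTiffHandler's implementation: the function name is hypothetical, and the whitespace trimming and case-folding are assumptions that the quoted documentation does not confirm.

```python
LOSSY_VALUES = {"1", "true", "lossy"}          # documented as selecting JPG
LOSSLESS_VALUES = {"0", "false", "lossless"}   # documented as selecting PNG


def parse_lossy_param(value: str) -> bool:
    """Return True for a lossy request, False for a lossless one; raise otherwise."""
    # Assumption: values are matched after trimming and lower-casing.
    normalized = value.strip().lower()
    if normalized in LOSSY_VALUES:
        return True
    if normalized in LOSSLESS_VALUES:
        return False
    raise ValueError(f"unrecognised value: {value!r}")
```

An SVG handler reusing the keywords would still have to decide which of the two values selects the plain SVG and which the PNG.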

@Arlolra I was thinking about perhaps reusing the lossless/lossy keywords for the SVG handler in order to allow people to choose between the plain SVG and the PNG version. Would that require any Parsoid adaptations, considering they already exist for the paged TIFF handler, but not yet in core?

In core, there's an allow-list of keys,
https://github.com/wikimedia/mediawiki/blob/master/includes/parser/Parser.php#L5198-L5204

and then each handler can extend that with MediaHandler->getParamMap()
https://github.com/wikimedia/mediawiki/blob/master/includes/parser/Parser.php#L5212

So, I don't see any issue with the SVG handler reusing the keyword Tiff is already using.

Parsoid, for its part, is currently broken in that regard: it only has the allow-list,
https://github.com/wikimedia/mediawiki-services-parsoid/blob/master/src/Wikitext/Consts.php#L39-L83

This came up recently in https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/952947 and needs to be fixed independent of anything you're doing here. I'll file a task for that.

considering they already exist for the paged tiff handler, but not yet in core ?
https://www.mediawiki.org/wiki/Extension:PagedTiffHandler

Looks like Parsoid doesn't support that yet

So why not extend the list of allowed tags by 'svg', 'g' and 'path' at minimum? Even if we added a couple more tags, we wouldn't increase the risk of security flaws or of huge image data chunks in a wiki article. But we'd be able to visualize a number of facts much more clearly, right away.