
RFC: Re-evaluate librsvg as SVG renderer on Wikimedia wikis
Open, LowPublic

Description

I don't know the exact history, but at some point Wikimedia wikis added the ability to support inline SVGs by passing them through librsvg, which takes the SVG code and generates PNGs, as I vaguely understand it.

There are some notes here: https://meta.wikimedia.org/wiki/SVG_image_support.

I can't find any information about which version of librsvg Wikimedia is currently using, but the choice of using librsvg should be re-evaluated, given its rendering issues (cf. other bugs in this bug tracker) and the existence of perhaps better alternatives.

See Also:
T53555: librsvg seems unmaintained
T120746: Improve SVG rendering
T10901: [DO NOT USE] SVG rasterisation and management on Wikimedia sites (tracking)

Details

Reference
bz38010

Related Objects

Event Timeline


@daniel

how many files on commons would break if we switched away from rsvg tomorrow?

Most of the files will actually become correct. There are obviously some bugs in resvg too, but it's far ahead of rsvg in SVG support.

Can we list them?

I don't think that this is possible. Yes, an SVG checker can be written that will check files for common rsvg bugs, but it would still be speculation.

Clarifying question. If I could run all our SVGs through a checker using resvg to find out what bugs they suffer from, is it possible to write such a checker?

Common librsvg bugs are known, and some of them can be detected by Commons:Commons SVG Checker; they are listed here: https://commons.wikimedia.org/wiki/Commons:Commons_SVG_Checker/KnownBugs
But writing a checker that finds malformed files which depend on librsvg bugs (I did not know such files exist) is maybe even more difficult than rendering the file correctly, so ~impossible. (For a few cases it may be possible, but imho there is no general one.)

If so, we could catalog and get a better sense of how much pain this migration would entail.

If we consider correct rendering a pain, we can't update any software any more, nor apply any patches or bugfixes. I consider it a pain for the authors; however, the old PNG thumbs on Commons/Wikipedia don't get updated as long as no one reuploads and no one uses ?action=purge. (In 2019 I purged several images which were rendered in 2016 or earlier.)

If not, is there an estimate of how widespread use of malformed files is?

I checked 126 files in Category:Featured_pictures_on_Wikimedia_Commons_-_vector, because I consider those to be among the highest-quality images on Commons; they are widely used and generally complex.

  • 8 files were affected by T246014, see the last 8 pictures in https://github.com/RazrFalcon/resvg/issues/223 (but you can change the dpi in rsvg (default 90 dpi) as well as in resvg (default 96 dpi)). This is not a render issue; imho it is an outdated preference (at least nowadays it is an unusual one). Maybe we should block such files, since there is no correct definition of how to render them.
  • 1 File T246003
  • 1 File T246001
  • 1 File T245864
  • I might have missed a few, maybe one or two

Several files, where the behaviour is unspecified:

Sometimes it might be difficult to decide which renderer is more precise (but I do not consider that a bug):

I think in Category:Featured_pictures_on_Wikimedia_Commons_-_vector there were ~3 files with a librsvg bug, all with the same bug:

  • ~3 files with T11420 (path-text not rendered)

But in all those files you do not notice anything unless you compare the rendering with a second one (so it is not obvious).

So this means roughly the same number of SVGs are rendered wrongly intentionally and unintentionally by librsvg.

Wouldn't it be best to use the tool that created a SVG for conversion to PNG as well, for one would get on commons exactly what they would get at home? No library can beat that, I am afraid.

My experience is that files that look the same in Inkscape, Firefox and Chrome usually do not work as expected on commons (blurs, text, some clones, hatch fills are usually broken). But how do you fix them other than by trial and error: you upload, check all the elements, go back to Inkscape, remove one fancy feature, upload again, go back to Inkscape, turn text to paths, unclone, etc.?

Inkscape slow? How many SVG conversions a day are we talking about? When profiled, was Inkscape tested in its ''shell mode'' (inkscape --shell)? That way export actions could be sent through a pipe and Inkscape would not need to be started for every single conversion.

This comment was removed by Milimetric.

@Aklapper T19012 is imho for attaching source files, but @Ponor is imho talking about converting Inkscape-SVG to PNG by Inkscape and Batik-SVG to PNG by Batik, and Adobe-SVG to PNG by Adobe Illustrator, ....

Wouldn't it be best to use the tool that created a SVG for conversion to PNG as well, for one would get on commons exactly what they would get at home? No library can beat that, I am afraid.

e.g. Adobe Illustrator is proprietary and won't be used by Wikimedia. That is also the reason why fonts like Arial or Times are not allowed.

My experience is that files that look the same in Inkscape, Firefox and Chrome usually do not work as expected on commons (blurs, text, some clones, hatch fills are usually broken). But how do you fix them other than by trial and error: you upload, check all the elements, go back to Inkscape, remove one fancy feature, upload again, go back to Inkscape, turn text to paths, unclone, etc.?

  1. Inkscape partly uses SVG 1.2 and SVG 2.0 features, which are not allowed in the current SVG 1.1 standard; Firefox and Chrome support such features and librsvg does not. Librsvg is the only program that renders them correctly according to the current standard (basically, SVG 1.1 says that a feature not defined in SVG 1.1 should be ignored; that's not what Chrome/Firefox do, they render it according to the SVG 2.0 working draft).
  2. You can upload it to https://commons.wikimedia.org/w/index.php?title=Commons:Commons_SVG_Checker&withJS=MediaWiki:CommonsSvgChecker.js and you will see the results and it will get checked for commons librsvg-errors.
  3. You can use https://svgworkaroundbot.toolforge.org/ (activate run svgcleaner and activate scour) and it fixes most problems without visual change, some bugfixes are listed at https://commons.wikimedia.org/wiki/User:SVGWorkaroundBot .

Inkscape slow? How many SVG conversions a day are we talking about? When profiled, was Inkscape tested in its ''shell mode'' (inkscape --shell)? That way export actions could be sent through a pipe and Inkscape would not need to be started for every single conversion.

Inkscape is for creating SVGs, not for converting them; it is more than 6 times slower than librsvg and will run into time-outs, see T200866.

According to https://en.wikipedia.org/wiki/File:Commons_Growth.svg it is currently 20,000 files per day, of which about 2.8% are SVG; that is roughly 500 SVG files per day. If you check the newest SVGs at https://commons.wikimedia.org/wiki/Special:NewFiles?mediatype[]=DRAWING&wpFormIdentifier=specialnewimages it seems to be more like 1000 SVG versions per day.

SVGs on average have maybe 2 versions and get rendered in at least 14 different sizes; that's roughly 30 PNGs per SVG.
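
As a rough illustration only, the estimate above can be written out as arithmetic. Every input is one of the guesses quoted in this comment, not a measured value.

```python
# Back-of-the-envelope estimate; all inputs are the guesses quoted above.
uploads_per_day = 20_000   # all uploads per day (read off Commons_Growth.svg)
svg_share = 0.028          # about 2.8% of uploads are SVG
versions_per_svg = 2       # guessed average number of versions per SVG
sizes_per_version = 14     # guessed minimum number of thumbnail sizes

svgs_per_day = uploads_per_day * svg_share
pngs_per_svg = versions_per_svg * sizes_per_version
pngs_per_day = svgs_per_day * pngs_per_svg

print(round(svgs_per_day), pngs_per_svg, round(pngs_per_day))
```

With the higher guess of 1000 SVG versions per day, the number of PNG renders scales up accordingly.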

And adding more software will in the long run lead to more problems. Librsvg has made enormous progress since version 2.40 (2016) and most librsvg bugs (I guess 80% of current problems) are already fixed, see subtasks of T193352; however, updating seems to be more challenging.

PS. I fixed >500 svg-files in Category:Pictures_showing_a_librsvg_bug_(overwritten_with_a_workaround) and other categories so I can definitely say a better renderer would have saved many man-hours. But for wikimedia I'm cheaper (volunteer) than larger servers (costs).

According to Grafana, eqiad and codfw each get an average of 0.8 queries for new SVGs per second, with spikes up to 4 qps. More than 75% of those requests are handled using 575ms of CPU time on average. For context, there are 8.4 requests per second to eqiad and codfw for filetypes handled by imagemagick, including SVGs, which use 2-4s of CPU time.

@Ponor is imho talking about converting Inkscape-SVG to PNG by Inkscape and Batik-SVG to PNG by Batik, and Adobe-SVG to PNG by Adobe Illustrator, ....

Not exactly, I was thinking that 'inkscape --shell' should do Inkscape-SVG conversions. In my random sample on commons 24/30 files were made with Inkscape. Other producers could use either 'inkscape --shell' or whatever is used now.
For, you see, Inkscape will always be the best converter for whatever Inkscape can produce. I hope we can agree on that.

  1. Inkscape partly uses SVG 1.2 and SVG 2.0 features, which are not allowed in the current SVG 1.1 standard; Firefox and Chrome support such features and librsvg does not. Librsvg is the only program that renders them correctly according to the current standard (basically, SVG 1.1 says that a feature not defined in SVG 1.1 should be ignored; that's not what Chrome/Firefox do, they render it according to the SVG 2.0 working draft).

Is SVG2.0 forbidden on commons? What happens when I upload a 2.0 file? Again, Inkscape converting its own SVG files to PNG should always work, regardless of SVG version.

  1. You can upload it to https://commons.wikimedia.org/w/index.php?title=Commons:Commons_SVG_Checker&withJS=MediaWiki:CommonsSvgChecker.js and you will see the results and it will get checked for commons librsvg-errors.

It took some time to discover this, and yes, it helped. But we're doing there what a computer should do without us having to worry.
Then, check this file and how it's rendered at different resolutions: https://commons.wikimedia.org/wiki/File:Scanning_tunneling_microscope_-_tip,_barrier_and_sample_wave_functions.svg. It only half-works, and the above test is of little help.

Inkscape is for creating SVGs, not for converting them; it is more than 6 times slower than librsvg and will run into time-outs, see T200866.

'inkscape --shell' to which actions can be sent through a pipe does not seem that slow at all. I'll post my results.

SVGs on average have maybe 2 versions and get rendered in at least 14 different sizes; that's roughly 30 PNGs per SVG.

What I'm seeing is that they have 1 version on average and are rendered in some 8 sizes (roughly 200, 500, 1000, 2000 twice). Most files uploaded daily on commons are very simple <100kB SVGs. 'inkscape --shell' converts those in less than 0.25s per png (on my little linux laptop).

And adding more software will in the long run lead to more problems. Librsvg has made enormous progress since version 2.40 (2016) and most librsvg bugs (I guess 80% of current problems) are already fixed, see subtasks of T193352; however, updating seems to be more challenging.

I'll repeat: 'inkscape --shell' will convert properly everything Inkscape made, now and forever. Not sure how that increases complexity (e.g. here https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/thumbor-plugins/+/refs/heads/master/wikimedia_thumbor/engine/svg/svg.py)

PS. I fixed >500 svg-files in Category:Pictures_showing_a_librsvg_bug_(overwritten_with_a_workaround) and other categories so I can definitely say a better renderer would have saved many man-hours. But for wikimedia I'm cheaper (volunteer) than larger servers (costs).

I think this is doable, even as an option for an advanced uploader. Given how much crap (pardon my f.) is getting uploaded each day, a few extra milliseconds for a good PNG<-SVG is a small price to pay.
C'mon Wikimedia, WP:BEBOLD.

@Ponor: If I don't misunderstand then the argumentation seems mostly about stuff created with Inkscape. What about stuff not created with Inkscape?

According to Grafana, eqiad and codfw each get an average of 0.8 queries for new SVGs per second, with spikes up to 4 qps. More than 75% of those requests are handled using 575ms of CPU time on average. For context, there are 8.4 requests per second to eqiad and codfw for filetypes handled by imagemagick, including SVGs, which use 2-4s of CPU time.

Thanks for this info. I did two little tests on my linux laptop. First, I took 8 SVGs from "Category:Featured pictures on Wikimedia Commons - vector" (Inkscape:6, CorelDRAW:1, Illustrator:1; file sizes 100k, 2×150k, 300k, 400k, 700k, 1400k, 2200k) and ran 'inkscape --shell' actions (of this type 'file-open:AntigenicShift_HiRes.svg; export-type:png; export-width:600px; export-do;') for all 8 at once. Got this:
png width 300px: total 5s ⟶ 0.6s/file
png width 600px: total 6s ⟶ 0.8s/file
png width 1200px: total 9s ⟶ 1.1s/file
png width 2400px: total 18s ⟶ 2.2s/file
Not bad, huh?
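
For anyone who wants to repeat this, here is a minimal sketch of how such a batch could be piped into a single 'inkscape --shell' process. It assumes Inkscape 1.x is on the PATH; the helper names are hypothetical, the export-filename action is added here so each PNG gets an explicit name, and export-width is passed as a bare pixel count. It is not the script used for the timings above.

```python
import subprocess


def build_actions(svg_path, png_path, width_px):
    """Build one Inkscape action line for a single SVG-to-PNG export.

    Paths containing ';' would break the action syntax; this sketch
    does not handle them.
    """
    return (
        f"file-open:{svg_path}; "
        "export-type:png; "
        f"export-width:{width_px}; "
        f"export-filename:{png_path}; "
        "export-do;"
    )


def export_batch(jobs, inkscape="inkscape"):
    """Send all jobs to one `inkscape --shell` process through stdin.

    jobs is an iterable of (svg_path, png_path, width_px) tuples.
    Inkscape is started once, which is the point of shell mode.
    """
    script = "\n".join(build_actions(*job) for job in jobs) + "\nquit\n"
    return subprocess.run(
        [inkscape, "--shell"],
        input=script,
        text=True,
        capture_output=True,
        check=False,
    )
```

A call such as export_batch([("a.svg", "a-300.png", 300), ("a.svg", "a-600.png", 600)]) would then render both widths without restarting Inkscape.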

Next, I took 30 random files (mostly flags, maps and logos) uploaded between 2015 and 2020 on commons. Mean file size 700k, quartiles 1, 13, 103, 915, 9500 kB. Of those, 20+4 were made with inkscape, 2 with Illustrator, 2 with Batik, 1 with matplotlib, 1 with gnuplot.

width 300px: total 15s ⟶ 0.5s/file
width 600px: total 18s ⟶ 0.6s/file
width 1200px: total 30s ⟶ 1s/file
width 2400px: total 62s ⟶ 2s/file

6 smaller files (26k, 95k, 19k, 2k, 13k, 180k) scaled to 4 widths (300px, 600px, 1200px, 2400px)
total time 6s ⟶ 1s/svg scaled to 4 widths ⟶ 2s/svg scaled to 8 png widths (typical for commons)

6 medium/large files (these are not too frequent) (460k, 900k, 780k, 520k, 1.3k, 1.3k) scaled to 4 widths (300px, 600px, 1200px, 2400px)
total time 50s ⟶ 8s/svg scaled to 4 widths ⟶ 16s/svg scaled to 8 png widths
5 of the last 6 files, without the slowest one (Afgewezen ontwerp van het provinciewapen Gelderland, 1893-1941.svg)
total time 21s ⟶ 4s/svg scaled to 4 widths ⟶ 8s/svg scaled to 8 png widths

4 s to convert each svg (offline, uploader does not wait for conversions to end), 1000 svg files a day, that's 1.1 hours of CPU time for wikimedia, no big servers, just my laptop.
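
The per-file averages of the 30-file run and the daily estimate can be rechecked with a few lines. This is only a recomputation of the numbers quoted above, on the assumption of 4 s per SVG and 1000 SVGs a day.

```python
# Totals from the 30-file run above: PNG width in px -> total seconds.
totals_30_files = {300: 15, 600: 18, 1200: 30, 2400: 62}
seconds_per_file = {w: t / 30 for w, t in totals_30_files.items()}

# Daily CPU time, assuming 4 s per SVG and 1000 SVG uploads per day.
seconds_per_svg = 4
svgs_per_day = 1000
daily_cpu_hours = seconds_per_svg * svgs_per_day / 3600

for width, secs in seconds_per_file.items():
    print(f"{width}px: {secs:.2f} s/file")
print(f"daily CPU time: {daily_cpu_hours:.1f} h")
```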

The problem isn't so much the number of SVGs we get per day as the fact that we render thumbnails on demand when they're for a file/size combination never requested before. Any extra rendering time is a penalty for that viewer. The issue compounds if they request a lot of new thumbnails at once, making them more likely to run into throttling limits, resulting in erroring images. That can easily happen on galleries that get visited very rarely. But some people's workflows get them to visit those a lot and their overall experience becomes terrible.

We prerender the most common sizes at upload time, but there's a very long tail of more exotic thumbnail sizes requested because editors customised the sizes they wanted with wikitext, or the wiki itself has different defaults, etc.

SVG isn't the only format that would benefit from a more time-consuming encoding yielding higher fidelity or a smaller file, what we really need is a pipeline for improving existing thumbnails with better encoding asynchronously. On misses generate a fast, inferior thumbnail so that the first user gets something quickly and spawn an async job that will generate the ideal thumbnail for the next person to view it. That's quite an ambitious project that I can't undertake at the moment. Maybe that can be my next big project after I complete migrating our Thumbor service to Buster/Python3/Thumbor 5/Kubernetes, which is probably going to keep me busy for several months.
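
To make the idea concrete, here is a toy sketch of that miss path: serve a fast render immediately and queue the slow, better render for later. The class and its in-memory store are hypothetical and have nothing to do with how Thumbor is actually structured; a real version would use a job queue and worker processes.

```python
import queue


class ThumbnailCache:
    """Toy model: fast thumbnail on a miss, ideal thumbnail later."""

    def __init__(self, fast_render, ideal_render):
        self.fast_render = fast_render      # quick, lower-fidelity renderer
        self.ideal_render = ideal_render    # slow, higher-fidelity renderer
        self.store = {}                     # (name, width) -> thumbnail
        self.pending = queue.Queue()        # keys waiting for the ideal render

    def get(self, name, width):
        key = (name, width)
        if key not in self.store:
            # Miss: give the first viewer something quickly...
            self.store[key] = self.fast_render(name, width)
            # ...and schedule the better thumbnail for the next viewer.
            self.pending.put(key)
        return self.store[key]

    def run_pending(self):
        """Stand-in for the asynchronous worker."""
        while not self.pending.empty():
            name, width = self.pending.get()
            self.store[(name, width)] = self.ideal_render(name, width)
```

The first get() for a file/size pair returns the fast result; once the worker has run, later calls return the improved one.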

@Ponor: If I don't misunderstand then the argumentation seems mostly about stuff created with Inkscape. What about stuff not created with Inkscape?

I'm focusing on Inkscape because it's free and most often used to make SVGs (80% of all uploads?). Also, with Inkscape we know that every Inkscape-SVG to PNG conversion will work, I mean, it should, and this conversion can be checked by the authors beforehand.
Most files produced by matplotlib, gnuplot, batik (?) are very simple, and png conversion in Inkscape for them should also work, but this would have to be tested; worst case, stay with whichever converter is being used now.
For Illustrator and CorelDRAW files (10% of all uploads max?), I'd say it doesn't really matter, use librsvg or 'inkscape --shell', that's trial and error now, and will be trial and error then.

In short, things can stay the same or 'inkscape --shell' can be used for files generated with other software; 'inkscape --shell' for Inkscape SVGs gives a lot more predictable results (+some nice features that are missing in current converters).

The problem isn't so much the number of SVGs we get per day as the fact that we render thumbnails on demand when they're for a file/size combination never requested before.

Thanks for this clarification, quite interesting! I actually thought that you only serve those PNGs that have been cached or stored when the SVG was uploaded, given the fact that PNGs on WP look a bit blurry, unlike SVGs scaled to the same size in the very same browser. It really surprises me that you're generating the exact requested size every time. Why not just send the closest (bigger) PNG and let the client scale it to the exact size (www style)?
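
What I have in mind is roughly the following; the list of pre-rendered widths is made up for illustration and is not what Wikimedia actually stores.

```python
import bisect

# Hypothetical set of pre-rendered PNG widths, sorted ascending.
PRERENDERED_WIDTHS = [120, 240, 320, 640, 800, 1024, 1280, 2560]


def pick_width(requested, widths=PRERENDERED_WIDTHS):
    """Return the smallest stored width that is >= the requested width.

    Falls back to the largest stored width if the request exceeds all
    of them; the client would then scale the image to the exact size.
    """
    i = bisect.bisect_left(widths, requested)
    return widths[i] if i < len(widths) else widths[-1]
```

A request for 350px would be answered with the 640px PNG, and the browser would downscale it.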
But anyway, I was more concerned about predictability of SVG to PNG conversion as someone who sometimes makes and uploads SVGs, and wanted you to (re)consider using 'inkscape --shell' for this conversion, at least for Inkscape-generated files.

For, you see, Inkscape will always be the best converter for whatever Inkscape can produce. I hope we can agree on that.

I prefer Inkscape or resvg compared to the current rsvg, however I can not fully agree on that:
Files should be SVG files, not Inkscape files; otherwise making derivatives/improvements/translations will be difficult. More information on why Wikimedia only allows free file formats: Commons:File_types

Inkscape SVG files are not valid according to the SVG document type definition.
E.g. one of your recent files: https://validator.w3.org/check?uri=https%3A%2F%2Fupload.wikimedia.org%2Fwikipedia%2Fcommons%2F7%2F73%2FDivergence_of_a_vector_field_in_the_rectangular_coordinate_system_-_derivation.svg&charset=%28detect+automatically%29&doctype=SVG+1.1&ss=1&group=0&user-agent=W3C_Validator%2F1.3+http%3A%2F%2Fvalidator.w3.org%2Fservices

Generally invalidity is not a problem, but if those invalid attributes significantly affect the render result, it is a problem. (It is a problem of the file; it is not a problem of the renderer.)

Is SVG2.0 forbidden on commons? What happens when I upload a 2.0 file? Again, Inkscape converting its own SVG files to PNG should always work, regardless of SVG version.

Basically SVG is SVG, so you cannot generally distinguish them, and newer standards (generally) support older features. The newest standard is (currently) SVG 1.1; everything else is more or less a private extension that is not guaranteed to work anywhere else. In Category:Images_with_SVG_2.0_features you can see what happens if you use SVG 2.0 features. (Sometimes you have to check the file history.)

Then, check this file and how it's rendered at different resolutions: https://commons.wikimedia.org/wiki/File:Scanning_tunneling_microscope_-_tip,_barrier_and_sample_wave_functions.svg. It only half-works, and the above test is of little help.

You just need to edit width="516.3" height="324.3" to height="648.6" width="1032.6" viewBox="0 0 516.3 324.3" and you have the image in another resolution, or you use File:Test.svg.
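
The same edit can be scripted. The following is a minimal sketch that assumes unitless width and height attributes on the root element (values such as "516.3px" are not handled); the function name is made up.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Keep the default SVG namespace unprefixed when serialising.
ET.register_namespace("", SVG_NS)


def scale_svg(svg_text, factor):
    """Scale the nominal size of an SVG, adding a viewBox if it lacks one."""
    root = ET.fromstring(svg_text)
    width = float(root.get("width"))
    height = float(root.get("height"))
    if root.get("viewBox") is None:
        # The old width/height become the user coordinate system.
        root.set("viewBox", f"0 0 {width:g} {height:g}")
    root.set("width", f"{width * factor:g}")
    root.set("height", f"{height * factor:g}")
    return ET.tostring(root, encoding="unicode")
```

Calling scale_svg(text, 2) on a file with width="516.3" height="324.3" yields the attributes given above.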

I'll repeat: 'inkscape --shell' will convert properly everything Inkscape made, now and forever.

SVG is a developing format; it is difficult to predict the future. For example, Inkscape changed its dpi from 90 to 96 in 2014, which can lead to rendering issues (wrong borders).

PNGs on WP look a bit blurry

  1. On Windows it can be e.g. due to a scaling different than 100%, my laptop had e.g. 125% as default.
  2. Antialiasing makes edges blurry, see :w:en:Spatial_anti-aliasing. To disable it you can try shape-rendering="crispEdges", see https://developer.mozilla.org/en-US/docs/Web/SVG/Attribute/shape-rendering (however, I think it is not supported by librsvg).
  3. feGaussianBlur is buggy, generally too strong, with the librsvg-version at wikimedia, see e.g. https://commons.wikimedia.org/wiki/File:Question_Mark_Icon_-_Blue_Box.svg , imho it is fixed in the current librsvg-version.

@Gilles Is it possible to try resvg and/or inkscape at https://commons.wikimedia.beta.wmflabs.org ? Where to propose a test for beta.wmflabs and how to get access ?

Where to begin?

First, I'll thank Ponor for his measurement data. A mean file size of 700 kB is disheartening; it is just too big; WMF probably does not want to serve 700 kB images. Gilles's exposition is informative as always. Johannes spent significant time on his reply, and yes, Commons should be dealing with SVG files rather than Inkscape, Illustrator, or CorelDraw files.

WMF should start serving SVG files instead of always converting to PNG.

There are many reasons for serving SVG files directly.

A. Better fidelity.

When MW started accepting SVG files, there was not good SVG support in browsers but there was good PNG support. Browsers have advanced, and current browser support is probably much better than librsvg. For example, browsers support textPath but librsvg does not. Modern browsers need to offer support around the world, so they have paid more attention to BiDi and to painting vertical Chinese characters. 'librsvg' messes up on BiDi and the vertical spacing of Chinese characters. Consequently, modern browsers do a better job of rendering SVG.

B. Some SVG is more compact than the PNG.

I loaded https://en.wikipedia.org/wiki/Ionization_energy into my browser. That page renders the image https://commons.wikimedia.org/wiki/File:First_Ionization_Energy.svg at 350 pix with this URL:

https://upload.wikimedia.org/wikipedia/commons/thumb/1/1d/First_Ionization_Energy.svg/350px-First_Ionization_Energy.svg.png

The debugger states the response header content-length is 12543 bytes. A file that is 350 pixels by 140 pixels at 3 bytes/pixel is 147,000 bytes, so the PNG has a compression ratio of about 12:1.

The original SVG file is 36 kB. However, that file is transferred from Commons with GZIP compression, so the network transfer is only 6357 bytes -- about half the PNG file used in the article.
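
The arithmetic behind those two comparisons, using only the byte counts quoted above:

```python
# Figures quoted above for First_Ionization_Energy.svg at 350 px.
width_px, height_px, bytes_per_pixel = 350, 140, 3
png_bytes = 12_543        # content-length of the 350px PNG thumbnail
svg_gzip_bytes = 6_357    # gzip-compressed transfer size of the SVG

raw_bytes = width_px * height_px * bytes_per_pixel
png_ratio = raw_bytes / png_bytes          # uncompressed bitmap vs PNG
svg_vs_png = svg_gzip_bytes / png_bytes    # gzipped SVG vs PNG

print(raw_bytes, f"{png_ratio:.1f}:1", f"{svg_vs_png:.0%}")
```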

Ponor's https://commons.wikimedia.org/wiki/File:Scanning_tunneling_microscope_-_tip,_barrier_and_sample_wave_functions.svg is only 21 kB. I do not think it will compress as much; a 300-px PNG image was 12979 bytes.

There are many large SVG files on Commons. Given network bandwidth issues, it would be better for large files to be rendered and cached on the server.

For example, https://en.wikipedia.org/wiki/File:Gibraltar_map-en-edit2.svg is 1.46 MB.

The 290-px PNG https://upload.wikimedia.org/wikipedia/commons/thumb/0/06/Gibraltar_map-en-edit2.svg/290px-Gibraltar_map-en-edit2.svg.png is only 79 kB.

C. librsvg has long term bugs and recent changes will probably break MW.

IIRC, the original librsvg developer went on to other projects, so the code was static for several years. A couple of years ago, some other developers picked it up, but their efforts included converting the C code to Rust. The Rust conversion blocked WMF from updating the code on its servers and getting the benefit of recent bug fixes. There has been no progress on significant issues such as textPath and small-character pixel quantization. T154237 presents further problems: the new version of librsvg wants a locale string (e.g., "en_US") rather than a langtag (e.g., "en-US"). I suspect that implies trouble for Chinese languages and WMF's non-compliant "sr-EC" and "sr-EL" langtags.

We could localize the SVG before handing it to the renderer, but that would not get around the other rendering issues.
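
To illustrate why the langtag-to-locale step worries me, here is a naive converter. It is a sketch of the mapping problem only; the function is hypothetical and I have not checked what librsvg actually accepts.

```python
def langtag_to_locale(tag):
    """Naively map a langtag like 'en-US' to a locale string like 'en_US'.

    Only 'lang' and 'lang-REGION' (two-letter region) are handled.
    Anything else, such as a script subtag, returns None because it has
    no direct place in a plain language_REGION locale string.
    """
    parts = tag.split("-")
    lang = parts[0].lower()
    if len(parts) == 1:
        return lang
    if len(parts) == 2 and len(parts[1]) == 2 and parts[1].isalpha():
        return f"{lang}_{parts[1].upper()}"
    return None


for tag in ("en-US", "zh-Hans", "sr-EC"):
    print(tag, "->", langtag_to_locale(tag))
```

"zh-Hans" has no region to map, and "sr-EC" converts cleanly but to the wrong thing: EC is the region code for Ecuador, whereas WMF uses "sr-EC" to mean Serbian in Cyrillic script.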

D. Problems with librsvg rendering lead many editors to convert text to curves

Many editors draw a nice graphic on their machine, upload it to Commons, and see a horrible result. They try some iterations, throw up their hands, and convert the text to curves. That bloats the file and makes it impossible to subsequently translate the file to another language.

E. No webfont support.

It is technically possible for me to create an SVG file that uses an exotic font. For example, Google is developing a Siddham font for ancient Sanskrit. Using CSS, I can point to a URL definition of a font. Then the SVG will display properly on a modern browser even though the user never installed an exotic font on his machine. WMF currently disallows that technology because it blocks URLs in CSS and xml-stylesheet.

We can debate security issues (Googleapi fonts allow frequent user tracking even though the fonts have year-long cache times), but librsvg (and other contenders?) do not have extensive CSS support.

F. Serving SVG puts the computational burden on the user's device

Exotic effects such as Gaussian blur would be done on the user's device.

That can be a blessing, but it can also be a curse. SVG can become complex to render, so it can burden the user's device. WMF gets around that problem by using a rendering timeout.

Although directly served SVG would obviate the server doing the rendering, it may have an inordinate cost in server network bandwidth. SVG files are often inordinately large. Most Inkscape users output files that have lots of redundant style information. GZIP may compress a lot of that, but the redundant information should not be in the file in the first place. Inkscape also chooses an unusual grid, so the coordinates in the file look like random numbers (127.5648 rather than 130).

G. Serving SVG would allow text selections, tool tips, and animations

Converting an SVG to a static PNG disables lots of wonderful SVG features. I cannot copy text out of a PNG. SVG files can provide built-in imagemap features: a rect with a title element will display a tooltip. MW animations are primarily GIF files or movie files. This 140 kB GIF animation of pi is very nice:

https://commons.wikimedia.org/wiki/File:Pi-unrolled-720.gif

It could also be done with a directly served SVG file. SVG files can also do animations with user interactions. One could single step a mechanical mechanism.

Path Forward

MW should directly serve small (say < 20 kB) SVG files that are flagged. A flag may be important because many SVG files have fixed width and height.

When the parser reads the page, it can check the file size and emit an HTML tag (object?) with a URL for a scalable SVG file. If the file is too big, then it emits the usual img with a PNG URL.

The image scaler will serve the cached SVG. If the SVG is modified and becomes bloated (e.g., > 20 kB), then the image scaler can substitute a small SVG file that says the page must be rebuilt.
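
A sketch of the parser-side decision might look like this. The 20 kB threshold is the "say < 20 kB" figure above; the function name and URL layout are invented for illustration and are not MediaWiki's.

```python
from html import escape
from urllib.parse import quote

SVG_DIRECT_LIMIT_BYTES = 20_000  # the "say < 20 kB" threshold


def thumbnail_tag(file_name, size_bytes, flagged, width_px):
    """Emit an object tag for small flagged SVGs, else the usual PNG img."""
    safe = quote(file_name)
    if flagged and size_bytes < SVG_DIRECT_LIMIT_BYTES:
        url = f"/svg/{safe}"  # hypothetical URL for the served SVG
        return (
            f'<object type="image/svg+xml" data="{escape(url)}" '
            f'width="{width_px}"></object>'
        )
    url = f"/thumb/{safe}/{width_px}px-{safe}.png"  # hypothetical PNG URL
    return f'<img src="{escape(url)}" width="{width_px}">'
```

An unflagged file, or one at or above the limit, keeps today's behaviour.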

Down the road, the image scaler may want to do language localizations to preserve the current systemLanguage semantics. (WMF may want to let the user agent control the rendered language, but that is not what MW does today.) An SVG file may be localized with an XSLT script similar to the PHP $lang variable being inserted into the img URL and running librsvg with that language argument. The same script can strip width and height and make sure that viewBox exists.

@Glrx Hi. As the author of resvg, I would like to point out some limitations of serving SVG directly. Yes, browsers are great, but they are not perfect either. Both Chrome and Firefox have tons of issues with SVG rendering. Even the textPath feature you've mentioned is actually badly supported. The situation with filters is pretty bad in Chrome, and complex text is not great either. No browser supports enable-background (deprecated in SVG 2, though).

As for SVG interactivity: yes, SVG animations are great, but they are badly supported and basically non-existent, mainly because there are no tools to create them (afaik). And the number of people that can write such files by hand is rather small.

Yes, resvg is far from completion, but it's a viable alternative.

Is it possible to try resvg and/or inkscape at https://commons.wikimedia.beta.wmflabs.org ? Where to propose a test for beta.wmflabs and how to get access ?

Could that be discussed in a separate ticket, please? This one has already become an unmanageable catch-all. :-/

My recollection of why we don't serve user-submitted SVGs directly as thumbnails is that the last time this was looked at there was no robust and up-to-date FLOSS SVG sanitiser that could ensure that the SVGs were safe to display directly in the browser.

XML is notoriously hard to sanitise and there are new tricks invented regularly to bypass sanitisation. Essentially, we don't want to deal with the possibility of a badly intentioned actor being able to inject a tracking URL inside an SVG that would let them collect IP addresses of anyone viewing that image in an article, run some arbitrary javascript, or worse, being able to leverage a browser security flaw in SVG parsing.

Furthermore, we would still need to have fallbacks for browsers that either don't render SVG natively or do a terrible job at it.

My current understanding is that WMF already does some sanitization of SVG files.

SVG with suspicious DTD subsets are rejected. There was an interesting DTD injection attack. I suspect only entity definitions are allowed. None of that seems to be a big issue. The modern view is that SVG files should not have DOCTYPE declarations. (Adobe Illustrator uses DTD subsets to define local entities.)

SVG with obvious Javascript/ECMAscript code is suppressed; I do not know the detection method. Presumably, all script elements could be forbidden and style elements restricted to type="text/css".

Presumably, event attributes such as onclick cause file suppression. Those attributes would allow arbitrary script injection. WMF could not rely on ECMAscript interpreters providing reasonable safety.

SMIL event attributes were allowed (e.g., begin, dur, and keyTimes). IIRC, these attributes are declarative. (SMIL still has 98% support in browsers. See https://caniuse.com/svg-smil)

I do not recall what MW does with anchor (a) elements. Clicking on an element could take a user to a malicious site. However, allowing anchor elements that link to WMF sites could be a valuable feature. A diagram of an automobile could link to WP articles about tires, wheels, brakes, batteries, and engines. An image of a cell could link to DNA, mRNA, ribosome, and mitochondria.

SVG with xml-stylesheet processing instructions are rejected.

SVG with non-local URL references are rejected. For example, <use xlink:href="http://..." /> is problematic.

Embedded data URLs are limited to JPEG and PNG streams.

CSS with non-local URLs causes file rejection. For example, @font-face { font-family: foo; src: url(...); }. Webfonts present a tracking threat. I'd like to whitelist googleapi fonts, but they have short ET-phone-home cache times. Webfonts also present a DoS threat.

metadata elements with non-local URLs are allowed. RDF is declarative; Creative Commons RDF requires references to arbitrary URLs. SVG agents do not need to chase anything inside metadata elements to display an image.

SVG can do a DoS attack by describing a complicated image. Simple hierarchy can demand the painting of thousands of complex subimages. MW catches those files with a timeout. If we prohibit animations, then MW could verify files render in finite time by passing them through librsvg.

I do not know how the current SVG scanner operates. If the current level of protection is inadequate, then we could run the SVG through a transform that removes DTD subsets and keeps only whitelisted elements and attributes. Scanning CSS could be a problem.
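
To show what I mean by a whitelist transform, here is a toy sketch. It is NOT a secure sanitizer and is not how MW's scanner works: the allowed lists are arbitrary examples, style attributes are simply dropped because scanning CSS is the hard part, and Python's xml.etree parser is itself not hardened against malicious XML, so a real tool would need a hardened parser.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("", SVG_NS)
ET.register_namespace("xlink", XLINK_NS)

# Example whitelists only; a real list would be far longer.
ALLOWED_ELEMENTS = {
    "svg", "g", "defs", "title", "desc", "path", "rect", "circle",
    "ellipse", "line", "polyline", "polygon", "text", "tspan", "use",
}
ALLOWED_ATTRIBUTES = {
    "id", "x", "y", "width", "height", "viewBox", "d", "points",
    "cx", "cy", "r", "rx", "ry", "x1", "y1", "x2", "y2",
    "fill", "stroke", "stroke-width", "transform", "systemLanguage",
}


def _local(name):
    """Strip a '{namespace}' prefix from an ElementTree name."""
    return name.rsplit("}", 1)[-1]


def _clean(elem):
    for name, value in list(elem.attrib.items()):
        if _local(name) == "href":
            # Keep only same-document references such as "#id".
            if not value.startswith("#"):
                del elem.attrib[name]
        elif name not in ALLOWED_ATTRIBUTES:
            # Drops onclick, onload, style and anything unknown.
            del elem.attrib[name]
    for child in list(elem):
        tag = child.tag
        if (isinstance(tag, str) and tag.startswith("{" + SVG_NS + "}")
                and _local(tag) in ALLOWED_ELEMENTS):
            _clean(child)
        else:
            # Drops script, style, foreignObject and unknown elements.
            elem.remove(child)


def sanitize(svg_text):
    """Return the SVG with only whitelisted elements and attributes."""
    root = ET.fromstring(svg_text)
    if root.tag != "{" + SVG_NS + "}svg":
        raise ValueError("root element is not an SVG svg element")
    _clean(root)
    return ET.tostring(root, encoding="unicode")
```

Because ElementTree does not keep the DOCTYPE, re-serialising the tree also removes any DTD subset from the output.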

What security issues does serving SVG present?

At this time, I am unsure whether SVG fallbacks are needed.

https://caniuse.com/?search=SVG

shows Opera Mini and UC Browser for Android (roughly 2% of global usage) as having unknown SVG support. Samsung Internet is also unknown at 0.72%, but a later version has SVG support. Unknown support may not be no support. SVG is significant enough now that I would expect minimal support in every browser.

I do not want to disenfranchise 2% of viewers, but the reason for this topic is that the current SVG to PNG conversion has significant problems now and will probably introduce further problems. Those problems affect not only accurate rendering of the SVG, but also frustrate content creators. SVG files that render on their local machine do not render correctly on WP pages. SVG files that should be easily modified cannot be because graphic artists have used librsvg workarounds.

Here is the big picture.

WMF has been serving SVG-converted-to-PNG files for years. That can still be a reasonable thing to do given that many SVG files on Commons are >400-kB monsters.

librsvg has been the image converter. The program has served WMF well, but it has significant problems. Its track record for fixing problems that are important to WMF has been very slow. It would be easy enough to continue using the current (old) version of librsvg. If we upgrade to a newer version, many MW code modifications may be required. Also, the new version may be incompatible with many switch-translated files. We may be stuck in the past. The newer version would not be a substantial upgrade.

A more recent resvg could be a better alternative. It seems that program is more faithful to the SVG specifications for those features it has implemented. Employing resvg would involve changing a few lines of MW code. It may offer substantial benefits. There are some downsides. I'm not sure how quickly issues would be fixed. Is it ready for prime time?

Starting to serve SVG files directly offers more features over static PNG files. Serving SVG can offer benefits that we do not yet appreciate. It requires more in-house work, but that should be a reasonable expense. There may be errors in rendering SVG files, and those errors will be more diverse (one browser might do it right while another does it wrong), but the developer community for those browsers is larger, so the time-to-fix might be much shorter than the 6 years and counting for librsvg.

My current understanding is that WMF already does some sanitization of SVG files.

At upload time, yes. There are plenty of existing SVGs that predate some of the current checks though.

SVG with suspicious DTD subsets are rejected. There was an interesting DTD injection attack. I suspect only entity definitions are allowed. None of that seems to be a big issue. The modern view is that SVG files should not have DOCTYPE processing instructions. (Adobe Illustrator uses DTD subsets to define local entities.)

SVG with obvious Javascript/ECMAscript code is suppressed; I do not know the detection method. Presumably, all script elements could be forbidden and style elements restricted to type="text/css".

Presumably, event attributes such as onclick cause file suppression. Those attributes would allow arbitrary script injection. WMF could not rely on ECMAscript interpreters providing reasonable safety.

That is correct.

SMIL event attributes were allowed (e.g., begin, dur, and keyTimes). IIRC, these attributes are declarative. (SMIL still has 98% support in browsers. See https://caniuse.com/svg-smil )

I do not recall what MW does with a elements. Clicking on an element could take a user to a malicious site. However, allowing anchor elements that link to WMF sites could be a valuable feature. A diagram of an automobile could link to WP articles about tires, wheels, brakes, batteries, and engines. An image of a cell could link to DNA, mRNA, ribosome, and mitochondria.

Currently, they're ignored.

SVG with xml-stylesheet processing instructions are rejected.

SVG with non-local URL references are rejected. For example, <use xlink:href="http://..." /> is problematic.

Embedded data URLs are limited to JPEG and PNG streams.

CSS with non-local URLs causes file rejection. For example, @font-face { font-family: foo; src: url(...); }. Webfonts present a tracking threat. I'd like to whitelist googleapi fonts, but they have short ET-phone-home cache times. Webfonts also present a DoS threat.

Fonts will be the big problem. Many browsers will prevent fonts from being loaded for SVGs in <img> tags. Serving anything from a Wikimedia site that calls back to Google is a hard no.

metadata elements with non-local URLs are allowed. RDF is declarative; Creative Commons RDF requires URLs to arbitrary URLs. SVG agents do not need to chase anything inside metadata elements to display an image.

SVG can do a DoS attack by describing a complicated image. Simple hierarchy can demand the painting of thousands of complex subimages. MW catches those files with a timeout. If we prohibit animations, then MW could verify files render in finite time by passing them through librsvg.

I do not know how the current SVG scanner operates. If the current level of protection is inadequate, then we could run the SVG through a transform that removes DTD subsets and keeps only whitelisted elements and attributes. Scanning CSS could be a problem.

What security issues does serving SVG present?

It's the same as serving any arbitrary XML for a browser to render. We can't trust that the browser will handle malicious content for us, so we need to make sure that we aren't sending any. Doing so would require expanding the existing SVG checker in MediaWiki or finding one that someone else has built. Right now, the SVG security checker is mostly a "nice to have" minimal protection for those downloading SVGs and the software handling them. The actual security for most readers comes from the server-side rasterization in a restricted environment.

At this time, I am unsure whether SVG fallbacks are needed.

https://caniuse.com/?search=SVG

shows Opera Mini and UC Browser for Android (roughly 2% of global usage) as having unknown SVG support. Samsung Internet is also unknown at 0.72%, but a later version has SVG support. Unknown support may not be no support. SVG is significant enough now that I would expect minimal support in every browser.

I do not want to disenfranchise 2% of viewers, but the reason for this topic is that the current SVG to PNG conversion has significant problems now and will probably introduce further problems. Those problems affect not only accurate rendering of the SVG, but also frustrate content creators. SVG files that render on their local machine do not render correctly on WP pages. SVG files that should be easily modified cannot be because graphic artists have used librsvg workarounds.

SVG support across browsers is inconsistent. At least with server-side rasterization we all see the same bugs. Even targeting browsers directly, there will be "works fine for me" rendering problems. T134410: Evaluate SVG rendering compatibility in browsers has some initial thoughts, but this is something that would have to be thoroughly researched before any decision could be made.

Here is the big picture.

WMF has been serving SVG-converted-to-PNG files for years. That can still be a reasonable thing to do given that many SVG files on Commons are >400-kB monsters.

librsvg has been the image converter. The program has served WMF well, but it has significant problems. Its track record for fixing problems that are important to WMF has been very slow.

That's a strong statement, and I'm not sure it's entirely true. Most delays, at least right now, are blocked on deployment, not upstream development.

It would be easy enough to continue using the current (old) version of librsvg. If we upgrade to a newer version, many MW code modifications may be required. Also, the new version may be incompatible with many switch-translated files. We may be stuck in the past. The newer version would not be a substantial upgrade.

Backporting software because it is not available for the current OS version is one thing. Intentionally running old software when new, stable versions are available in the repos is another, and it goes against good practice.

We'll get a new version of librsvg sooner rather than later, whenever T216815: Upgrade Thumbor to Buster gets done (probably before mid-2021).

I'll also remind you that client-side-rendered SVGs are 100% incompatible with language switching, as far as I'm aware. At best, they would render in the browser language, not the page language.

A more recent resvg could be a better alternative. It seems that program is more faithful to the SVG specifications for those features it has implemented. Employing resvg would involve changing a few lines of MW code. It may offer substantial benefits. There are some downsides. I'm not sure how quickly issues would be fixed. Is it ready for prime time?

I don't know. So far, no one has tested it against Commons SVG files. That needs to be done before there's any serious thought given to switching renderers.

resvg also depends on Rust, and we don't have Rust in Debian Stretch anyway (this is why librsvg hasn't been upgraded). So switching to resvg is also blocked on T216815.

Starting to serve SVG files directly offers more features over static PNG files. Serving SVG can offer benefits that we do not yet appreciate. It requires more in-house work, but that should be a reasonable expense. There may be errors in rendering SVG files, and those errors will be more diverse (one browser might do it right while another does it wrong), but the developer community for those browsers is larger, so the time-to-fix might be much shorter than the 6 years and counting for librsvg.

That sort of discussion is largely outside the scope of this task and belongs more in T134410: Evaluate SVG rendering compatibility in browsers. There's significant groundwork required to get that anywhere close to working.

I'm sorry, I found this buried somewhere in my notes, that I was supposed to post this on Commons at some point, as a call for product management on a potential switch. Putting it here, as the RFC process winds down, just so it's not lost. But I think it's almost a year old at this point:

Dear Commons community, the WMF Technical Committee (link) is currently weighing an RFC (link to this one) which is basically about changing the SVG renderer. What this means is that some SVG that currently renders correctly will no longer render correctly. But that many of the current rendering issues could be solved. We think this could be a good idea, but would prefer if someone could organize what happens if/when images break. Without someone performing that product manager role, we think this switch would not be very successful. Does someone here want to take on that kind of work? Broken renderings must be identified, and if the renderer is at fault bug reports filed. If the file was at fault, the file must be fixed -- or at least organized into a list that the community can work from. We could provide a tool for testing individual images, but need people to run through them identifying what's actually wrong (if anything) when a file renders differently.

Dear Commons community, the WMF Technical Committee (link) is currently weighing an RFC (link to this one) which is basically about changing the SVG renderer. What this means is that some SVG that currently renders correctly will no longer render correctly. But that many of the current rendering issues could be solved. We think this could be a good idea, but would prefer if someone could organize what happens if/when images break. Without someone performing that product manager role, we think this switch would not be very successful. Does someone here want to take on that kind of work? Broken renderings must be identified, and if the renderer is at fault bug reports filed. If the file was at fault, the file must be fixed -- or at least organized into a list that the community can work from. We could provide a tool for testing individual images, but need people to run through them identifying what's actually wrong (if anything) when a file renders differently.

I think I have enough technical knowledge to organize what happens if/when images break. I'm able to (a) categorize and identify them (e.g. render bug or SVG bug), (b) answer user questions, (c) write bug reports, (d) make workarounds/replacements for many images. Basically, which path to take depends on the number of broken files and on the individual SVG/bug. A more detailed answer can be found at Commons:User_talk:Milimetric_(WMF)

We think this could be a good idea, but would prefer if someone could organize what happens if/when images break. Without someone performing that product manager role, we think this switch would not be very successful. Does someone here want to take on that kind of work? Broken renderings must be identified, and if the renderer is at fault bug reports filed. If the file was at fault, the file must be fixed -- or at least organized into a list that the community can work from. We could provide a tool for testing individual images, but need people to run through them identifying what's actually wrong (if anything) when a file renders differently.

I'd be happy to help, too. Ideally you'd provide us with an svg (rendered by our browser), two pngs of the same size of the svg rendered by the two libraries and, if possible, also the two images one over the other, switching every 0.5 s or so, or their pixelwise difference so that major changes can be easily spotted (again, all done by our browsers with some javascript and css perhaps).
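A minimal sketch of the pixelwise difference mentioned above, assuming both renderings have already been decoded into raw pixel buffers of the same size (for example RGBA bytes). The function name and the tolerance parameter are hypothetical; PNG decoding is left out.

```python
def pixel_diff(a, b, width, height, channels=4, tolerance=0):
    """Compare two raw pixel buffers of equal size.

    Returns (fraction_changed, mask), where mask[y][x] is True if any
    channel of that pixel differs by more than `tolerance`.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    expected = width * height * channels
    if len(a) != expected or len(b) != expected:
        raise ValueError("buffer size does not match width*height*channels")
    mask = []
    changed = 0
    for y in range(height):
        row = []
        for x in range(width):
            i = (y * width + x) * channels
            differs = any(abs(a[i + c] - b[i + c]) > tolerance
                          for c in range(channels))
            row.append(differs)
            changed += differs
        mask.append(row)
    return changed / (width * height), mask
```

A small tolerance helps ignore anti-aliasing noise between renderers, so that only major changes stand out in the mask.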

The above sound like very workable plans, thank you both for stepping up. To be clear, I can't coordinate this work, but hopefully as this goes through the new process we can find someone who can.

@RazrFalcon

IIRC, resvg uses a simple CSS parser. Could you try rendering

https://commons.wikimedia.org/wiki/File:SVG_CSS_Test.svg

AKA

https://upload.wikimedia.org/wikipedia/commons/3/37/SVG_CSS_Test.svg

It tests SVG 1.1 CSS selectors.

In T68551, F34144683 shows that librsvg 2.50 does all but the :lang() pseudo selector correctly.

Thanks.

@Glrx

resvg fails to load this CSS at all. I guess because of ?* CSS 3 selectors */. librsvg indeed uses a way better CSS parser.

@RazrFalcon

Thanks for trying it.

Please try it again; I've fixed the comment and added :first-child

CSS is not very important to Commons right now because librsvg 2.40 does not have much support. I see textPath and the resulting conversion of text to curves as a bigger problem, but all improvements are welcome.

It doesn't understand tspan[data-e~="a" i] either. Will fix it soon. a:link and :lang are ignored. The first one is a surprise, the second one is a known issue. Everything else works as expected.

SVG_CSS_Test.png (600×500 px, 14 KB)

I see textPath and the resulting conversion of text to curves as a bigger problem

What do you mean?

@RazrFalcon

Thanks for running the test. Resvg has more support than I expected.

Sometime soon I'll try some CSS @media support.

Sorry for the confusion. My comment is that librsvg 2.40's lack of support for textPath has caused more problems on Commons than librsvg 2.40's lack of CSS support. It was not a criticism of resvg.

I also wonder about Inkscape's support for CSS style blocks.

Sometime soon I'll try some CSS @media support.

All @ sections are unsupported. In librsvg too, afaik.
!important and selector specificity are also not supported, but they should be fairly easy to implement. There is also no mixed-case CSS support.
In general, CSS is absurdly complicated and it's really hard to parse and apply it as long as it is not a browser.
librsvg uses CSS parser and selector from servo, which is a very heavy dependency, while resvg is trying to be slim.

As for textPath, resvg has probably the best support: https://razrfalcon.github.io/resvg-test-suite/svg-support-table.html#e-textPath

I also wonder about Inkscape's support for CSS style blocks.

It's fine: https://razrfalcon.github.io/resvg-test-suite/svg-support-table.html#e-style
But I don't have many CSS tests at the moment.

@RazrFalcon

Thanks for the note about @ sections. I'd like to see support for (color) and not (color), but that is beyond SVG 1.1.

It looks like Inkscape added better support of style elements around August 2017. It will only handle one style element (which probably means no support for media attributes). Looks like Inkscape has GUI support for the simple element, class, and id selectors and will display more exotic selectors. I should download Inkscape and try it out.

I'm making an SVG benchmark consisting of three test suites and four renderers.

Unlike the older benchmarks from 2006 and 2009, I also focused on bugs (not only on time).

| SVG | librsvg 2.50 | resvg 0.14.0 | Inkscape 1.0 | batik 1.13; 1.14 |
| --- | --- | --- | --- | --- |
| W3C correctness | 0.662 | 0.831 | 0.745 | 0.801 |
| W3C time | 13m 23.399s | 0m 42.104s | 22m 55.256s | 70m 16.007s |
| ReSVG correctness | 0.754 | 0.956 | 0.729 | 0.703 |
| ReSVG time | 4m 05s | 2m 30s | 46m 22s | 61m 29s |
| featured correctness | 0.92 | 1.00 | ? | ? |
| featured time | 5m 17.701s | 4m 46.639s | 15m 28.202s | 11m 30.768s |

Deciding on a renderer is imho not only about comparing numbers; it is also about knowing which bugs occur, how important they are, and whether an (easy) workaround exists.
On https://commons.wikimedia.org/wiki/User:JoKalliauer/SVG_test_suites you can find all bugs (of librsvg and resvg), some workarounds, further interesting files, and further alternative rendering engines, and on the talk page some comments and points to discuss.

If we change renderer, I think this change should be done with the Upgrade Thumbor to Buster ( T216815 ).

The new version of librsvg is not an acceptable renderer:

  1. it does not take an IETF langtag (major)
  2. it does not handle textPath (medium)

The IETF langtag problem is growing. SVG Translate is injecting illegal langtags into SVG files: T271000. I suspect that confusion may have developed from librsvg 2.40 failing to handle hyphenated langtags correctly: T154237. SVG Translate's bogus langtags allow it to trick librsvg 2.40 into displaying Serbian in either Latin or Cyrillic scripts.

The new version of librsvg 2.44 wants a Unix locale string in the LANG environment variable instead of an IETF langtag. In simple terms, that means that MW would have to map IETF langtags to locale strings and then set the Unix LANG environment variable to that locale string. That is not a 1:1 mapping. It is not a problem for langtags such as en, de, or fr; it is a problem for sr-Latn, sr-Cyrl, zh-Hans, and zh-Hant. And there is probably no way for non-IETF langtags such as sr-EC and sr-EL to survive the round trip.
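To illustrate why the mapping is not 1:1, here is a small sketch. Both the mapping table and the naive locale-to-langtag conversion are assumptions made for illustration; they do not describe what librsvg or MW actually do.

```python
# Hypothetical langtag -> locale table (assumption, for illustration only).
LANGTAG_TO_LOCALE = {
    "en": "en_US.UTF-8",
    "de": "de_DE.UTF-8",
    "sr-Latn": "sr_RS.UTF-8@latin",
    "sr-Cyrl": "sr_RS.UTF-8",
}

def locale_to_langtag(locale_string):
    """Naive conversion: drop codeset and modifier, turn '_' into '-'."""
    base = locale_string.split(".", 1)[0].split("@", 1)[0]
    return base.replace("_", "-")

def collisions(table):
    """Group langtags that become indistinguishable after the round trip."""
    seen = {}
    for langtag, locale_string in table.items():
        seen.setdefault(locale_to_langtag(locale_string), []).append(langtag)
    return {tag: tags for tag, tags in seen.items() if len(tags) > 1}
```

With this table, sr-Latn and sr-Cyrl both come back as sr-RS, so the script distinction is lost unless the consumer also interprets the locale modifier.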

Consequently, MW would have to localize the SVG file before handing it off to librsvg. That's not hard to do, MW should do it in the long run, but MW does not do it now.

MW cannot upgrade to a new version of librsvg; the multilingual SVG files will break.

The absence of even a limited textPath has always been an annoyance. IIRC, Firefox copped out by treating it as a text element; it didn't follow the path, but at least it provided the information. MW has lived without it for a long time, but that has had ugly consequences. Graphic artists convert their text to curves. Commons is a multilingual project, and curves are hard to translate. It is time to require some textPath support, and librsvg does not have it.

resvg will have its own problems, but it may be the only expedient. I presume it will be easy to pass the IETF langtag argument to it. There may be some rendering differences and problems, but that does not scare me given the mountain of workarounds librsvg has required. The vast majority of SVG files will be pedestrian and render the same. The only significant reservation is the resvg CSS parser.

I'd prefer that we could mark SVG files (say < 40 kB) to be served directly. I suspect that most modern browsers have sufficient SVG support. SVG files loaded into img elements will display title elements and animations while disabling scripts. It's 2021, and MW is using animated GIFs from 1987. (For consistent semantics, MW might localize the SVG file before serving it.)

@JoKalliauer Your timing results for Inkscape were a big surprise for me, and I ran all your tests in inkscape --shell on my little laptop.

114 files from Commons Featured collection rendered at 512px in 92 seconds. At least 75% of those were created by Inkscape, and no more than 15% by Illustrator.

1335 files from resvg collection rendered at 512px in 88 seconds (10 files entered some weird loops and inkscape exited; those were later excluded)

512 files from W3C collection rendered at 512px in 40 seconds (6 did not finish)

The actions pasted into the shell were like
file-open:Iowa_16_inch_Gun.svg; export-type:png; export-width:512px; export-do;
file-open:Pianino_-_mechanizm_angielski.svg; export-type:png; export-width:512px; export-do;
file-open:Flag-map_of_the_world.svg; export-type:png; export-width:512px; export-do;

I don't understand how you got those huge numbers.

@Glrx

I suspect that most modern browsers have sufficient SVG support.

Sadly, it's not true. Or at least as long as "browsers" is Chrome. Firefox has a pretty bad textPath support, while Chrome is pretty bad with effects.
Firefox has issues even with clipPath and doesn't support baseline-shift (aka subscript/superscript) at all.
And no browser supports enable-background, although it was deprecated in SVG2.
Overall, browsers are way better than librsvg, but you still need workarounds for them. And more importantly, browser-specific ones.

The only significant reservation is the resvg CSS parser.

It's not perfect, but I wouldn't call it that bad.

I'd prefer that we could mark SVG files (say < 40 kB) to be served directly.

SVG size doesn't matter. Content does.

@Glrx

I suspect that most modern browsers have sufficient SVG support.

Sadly, it's not true. Or at least as long as "browsers" is Chrome. Firefox has a pretty bad textPath support, while Chrome is pretty bad with effects.
Firefox has issues even with clipPath and doesn't support baseline-shift (aka subscript/superscript) at all.
And no browser supports enable-background, although it was deprecated in SVG2.
Overall, browsers are way better than librsvg, but you still need workarounds for them. And more importantly, browser-specific ones.

That "browsers are way better than librsvg" is sort of the point. Most of the SVG on Commons is pedestrian because librsvg is limited; textPath sometimes slips in for a big map, but otherwise it is only in little used files. I'm also less concerned with Chrome and Firefox because they do fix bugs (albeit on a timescale of about 6 months to a year) and they can probably be shamed by adding tests to https://www.caniuse.com.

Yes, baseline-shift is a problem, but the text is still readable. Furthermore, nobody gets it right: try doing e^(x^2).

I'm more concerned about support on Safari and other browsers. WMF has a diverse audience.

One proposal (below) is to serve only SVG that has been marked. There are a lot of SVG files that are trivial and should display reasonably on any browser.

The only significant reservation is the resvg CSS parser.

It's not perfect, but I wouldn't call it that bad.

JoKalliauer's tests showed good selector functionality, so I'm not too worried there. Many files on commons are done with Inkscape, and those files tend to not use CSS style elements (but heavily use style attributes). Illustrator overuses class, but its uses seem to be pedestrian. From what you've said, I just expect some CSS surprises.

I'd prefer that we could mark SVG files (say < 40 kB) to be served directly.

SVG size doesn't matter. Content does.

Size may be an issue for server bandwidth. Commons has lots of big SVG files.

The reality is WMF is not going to serve SVG any time soon, so that (partial) option is off the table. WMF cannot upgrade to the recent librsvg because that will break systemLanguage files. To use the recent librsvg, WMF will need to localize the SVG before passing it to librsvg. At the end of all of that, librsvg still has rendering issues; many bugs are fixed in the new release, but others still remain. That leaves two minimal-effort paths: limp along with the old librsvg or use a plug-in replacement for it. The replacement candidates appear to be resvg, inkscape, and batik.

Consider setting the SVG agent's IETF langtag preference(s).

  • librsvg uses the Unix $LANG environment variable, which is a locale string rather than a langtag. That's a problem. There is not an option for setting the langtag.
  • inkscape also uses the Unix $LANG environment variable. I do not see a command line argument that sets the langtag preference. Unless inkscape has some way to set the langtag (e.g., writing an options file), then it has the same locale string problem as librsvg.
  • batik has a -lang command line option that sets a langtag.
  • resvg has a --languages command line option that sets a list of langtags (no q value).

So systemLanguage constraints leave batik and resvg on the table.
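As a sketch of what passing the langtag to resvg could look like: the --languages option is the one named above, while the comma-separated value format, the --width option, and the argument order are my reading of resvg's command-line help and should be checked against `resvg --help` for the version in use. The function name is hypothetical.

```python
def resvg_command(svg_path, png_path, langtags, width=None):
    """Build an argv list for a resvg call with explicit language preferences."""
    if not langtags:
        raise ValueError("at least one langtag is required")
    # Assumption: --languages takes a comma-separated list of langtags.
    cmd = ["resvg", "--languages", ",".join(langtags)]
    if width is not None:
        # Assumption: --width sets the output width in pixels.
        cmd += ["--width", str(width)]
    cmd += [svg_path, png_path]
    return cmd
```

For example, resvg_command("in.svg", "out.png", ["sr-Latn", "sr"], width=512) builds the list without touching any environment variable, which is the point of an explicit option.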

There are a lot of SVG files that are trivial and should display reasonably on any browser.

This is actually a very good idea. Automatically analyzing SVG to detect what features they use should simplify the process.
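A minimal sketch of such an analysis with Python's standard XML parser. The lists of "risky" elements and attributes are hypothetical, picked from features mentioned in this thread; a real classifier would need a much more careful list and would also have to look inside style attributes and style elements.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical lists (assumption): features discussed in this thread as
# rendering differently across renderers or browsers.
RISKY_ELEMENTS = {"textPath", "style", "filter", "mask", "clipPath",
                  "switch", "animate", "foreignObject"}
RISKY_ATTRIBUTES = {"baseline-shift", "enable-background"}

def used_features(svg_text):
    """Count element names, plus '@name' for each risky attribute found."""
    counts = Counter()
    for elem in ET.fromstring(svg_text).iter():
        if not isinstance(elem.tag, str):
            continue  # skip comments and processing instructions
        counts[elem.tag.rsplit("}", 1)[-1]] += 1
        for name in elem.attrib:
            if name in RISKY_ATTRIBUTES:
                counts["@" + name] += 1
    return counts

def is_trivial(svg_text):
    """True if the file uses none of the features listed as risky."""
    features = used_features(svg_text)
    risky = RISKY_ELEMENTS | {"@" + name for name in RISKY_ATTRIBUTES}
    return not (risky & set(features))
```

Run over a collection, the counters would also give a rough picture of how often each feature is actually used.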

From what you've said, I just expect some CSS surprises.

The main limitation is that CSS3/"CSS4" would not work. And mainly because of processing and not parsing. Something like CSS variables is pretty hard to implement.

The replacement candidates appear to be resvg, inkscape, and batik.

Inkscape and batik are far behind resvg. Especially performance-wise.

resvg has a --languages command line option that sets a list of langtags (no q value).

Yes, resvg doesn't care about $LANG, mainly because it's surprisingly hard to implement in a crossplatform and safe way. So it simply uses the value user provided.
On the other hand, resvg doesn't have a complete support for language tags. To do so you have to properly parse them and do some complex matching, so for now it simply matches the whole value.

@Glrx:
client side rendering; animated SVGs
I agree that whitelisted SVG should be rendered on the client side ( T5593 ), with opt-out or opt-in in the preferences.
Animated GIFs and videos (e.g. WebM) are imho still the gold standard in HTML. I hardly see animated SVGs on the web; however, I think that's also an advantage of client-side rendering, since converters for animated SVG are hardly known (imho e.g. GPAC), and animated SVGs, animated GIFs and movies imho have different scopes.

CSS might be the biggest issue for resvg. However, the bug reports I read on help pages are often related to one of these (in order of decreasing importance): T36947 T217990 T35245 T20463 T276684, and CSS is hardly mentioned (resvg has imho better CSS support than librsvg 2.40, which is currently used on Commons). So the biggest downside is imho still an improvement. I know help pages do not necessarily represent importance: if you ask Commons SVG experts about the biggest current issue on Commons, imho most would say T11420, which agrees with the most common bug in the featured test suite, although it is hardly mentioned in questions on help pages.
Generally, for CSS problems you can easily make a workaround, often by just using https://svgworkaroundbot.toolforge.org/ . I know that CSS can be helpful, but it is imho mostly used by SVG experts; I personally avoid it, and in e.g. Inkscape you imho cannot add CSS. (I also find it confusing if the XML contains a different value than the CSS and one of them overrides the other, depending on the priority list.)

@Ponor:

I reran the tests with 512px

|  | librsvg | resvg | batik | Inkscape (start per image) | inkscape (run all in the same job) | inkscape (remove two files) |
| --- | --- | --- | --- | --- | --- | --- |
| time featured-collection (512px) | 4m 28.886s | 1m 15.307s | 10m 8.168s | 5m 9.164s | 2m 27.598s |  |
| time resvg-collection (512px) | 6m 13.054s | 2m 35.135s | 63m 36.648s | 38m 5.628s | 17m 8.889s | 2m 22.970s |
| time w3c-collection (512px) | 1m 46.776s | 1m 12.591s | 29m 46.446s | 21m 14.825s | 4m 13.46s |  |
| time 2006-MediaWiki-collection (512px) | 23.129s | 9.551s | 186.809s | 87.313s |  |  |

Differences to you:

  • I start and exit Inkscape (without GUI) for every image (inkscape "$file" -w 512 --export-type="png"), which is very time-consuming for Inkscape; imho this is how it is currently done on WMF servers. @Gilles I'm not sure whether it is a good idea to keep an Inkscape job open and run all images in the same process (e.g. if it hangs).
  • I measure CPU time (limited to one CPU), not real wall-clock time.
  • I excluded only one image, Cone clutch.svg (featured), because Inkscape hangs on it; so in my case only one featured image fails in Inkscape (and no other), whereas in your case 16 different test-suite images fail.

As discussed with you, you excluded images that fail after a long time. (Which makes sense but causes huge differences!)
Inkscape in the resvg-collection (my times):

  • restarting for each image: about 38 minutes
  • without restarting: 17 minutes
  • after removing two images (as you did): 2 minutes

So how to limit the maximum time (before success/crash) should depend on the time-out limit; see T200866 as well as https://commons.wikimedia.org/wiki/User_talk:JoKalliauer/SVG_test_suites#time-out-limit

From what you've said, I just expect some CSS surprises.

The main limitation is that CSS3/"CSS4" would not work. And mainly because of processing and not parsing. Something like CSS variables is pretty hard to implement.

SVG 1.1 uses a subset of CSS2. The simple view is WMF only supports SVG 1.1, so CSS3/CSS4 are irrelevant. Even SVG 2.0 subsets CSS. IIRC, SVG 2.1 is toying with ::before and ::after pseudo selectors. Most graphics editors are going to output pedestrian SVG with no or trivial CSS.

Yes, resvg doesn't care about $LANG, mainly because it's surprisingly hard to implement in a crossplatform and safe way. So it simply uses the value user provided.

I wish Gnome understood that.

On the other hand, resvg doesn't have a complete support for language tags. To do so you have to properly parse them and do some complex matching, so for now it simply matches the whole value.

And that is where Gnome went astray. BCP 47 has several types of langtag matching. One can create langtags with * wildcards. If one looks at all that BCP 47 might imply, then one could believe that SVG needs complicated langtag matching and therefore should use somebody's BCP 47 langtag library. But that is not the case. SVG 1.0, 1.1, and 2.0 have all specified the "Basic Filtering" matching method. SVG did not adopt "Extended Filtering".

Adding HTTP Accept-Language with SMIL allowReorder processing takes less than half a page. It does not do complicated matching but rather scores each clause and keeps the best.

With a specific locale string of LANG=es_ES.utf8 (which is a transliteration of es-ES to a locale string), librsvg displays systemLanguage="es" text. It should only display text that is at least es-ES. See T261192#7053643
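The matching and scoring described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the half-page implementation referred to: the function names and the clause representation are hypothetical, and a clause without systemLanguage is treated as a fallback that ranks after every preference.

```python
def matches(preference, langtag):
    """Basic-filtering style test (case-insensitive): the user preference
    equals the tag, or is a prefix of it that ends right before a '-'."""
    p, t = preference.lower(), langtag.lower()
    return t == p or t.startswith(p + "-")

def choose_clause(clauses, preferences):
    """Score each clause and keep the best.

    clauses: list of (systemLanguage value or None, payload), document order.
    preferences: user langtags, most preferred first.
    """
    best_payload, best_score = None, None
    for system_language, payload in clauses:
        if system_language is None:
            score = len(preferences)  # fallback clause ranks last
        else:
            tags = [t.strip() for t in system_language.split(",") if t.strip()]
            ranks = [i for i, pref in enumerate(preferences)
                     if any(matches(pref, t) for t in tags)]
            if not ranks:
                continue  # clause does not apply to this user
            score = min(ranks)
        # Strict '<' keeps the earlier clause when two clauses tie.
        if best_score is None or score < best_score:
            best_payload, best_score = payload, score
    return best_payload
```

With clauses for "en", "es", and a fallback, the preference list es, en picks the Spanish clause even though the English clause comes first in the document. A lone es-ES preference falls through to the fallback, because es-ES is not a prefix of es.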

Incidentally we're just having a very similar discussion in the Inkscape project, and I believe I can clarify some things here:

Usage of system locale

Our current opinion is that usage of the system locale (for example $LANG variable of the form "de_DE.UTF-8" which holds a POSIX locale) is the most suitable thing to do for many applications:

  • The SVG spec simply states "Evaluates to "true" if one of the language tags indicated by user preferences is a case-insensitive match"
  • It does not say anything about how applications are supposed to enable the user to state their preferences.
  • Implementing something similar to how browsers allow users to set Accept-Language to "arbitrary" values, certainly is *one* way to go but is likely to be overkill for most applications.
  • Considering the system locale therefore is the most obvious way to derive the user's preferences.
  • If for example the user prefers "es_ES" locale it's only reasonable to present them with "es-ES,es" (in that order).

Overriding system locale

The observation above ("inkscape also uses the Unix $LANG environment variable") is not wrong but actually only captures a small part of what Inkscape uses. In fact Inkscape considers *all* locale-related environment variables, i.e. LANGUAGE, LC_ALL, LC_MESSAGES and LANG (and possibly even other native locale indicators depending on OS). For the inclined reader: Inkscape internally uses glib's g_get_language_names() for this.

The environment variable most people will want to use to override the system locale is therefore the much more suitable LANGUAGE, which even accepts a list of languages like LANGUAGE=es_ES:en and would only match "es-ES" in that case but not "es".
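As a rough illustration of the transliteration discussed here, the following Python sketch turns a LANGUAGE-style value such as es_ES:en into an ordered list of langtags. It is a naive conversion under stated assumptions (drop the codeset and modifier, replace "_" with "-"); it is not what glib's g_get_language_names() does internally, and the function names are made up for this example.

```python
def locale_to_langtag(locale: str) -> str:
    """Naively convert a POSIX locale name such as 'de_DE.UTF-8' to 'de-DE'."""
    # Drop the codeset (".UTF-8") and any modifier ("@euro"), then swap "_" for "-".
    base = locale.split(".", 1)[0].split("@", 1)[0]
    return base.replace("_", "-")


def language_preferences(language_env: str) -> list:
    """Split a colon-separated LANGUAGE value into an ordered list of langtags."""
    return [locale_to_langtag(part) for part in language_env.split(":") if part]


print(language_preferences("es_ES:en"))
```

Special locale names such as C or POSIX are not handled, and nonstandard wiki language codes would need their own mapping.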

Unfortunately I believe MediaWiki currently does not allow setting LANGUAGE.

In any case we'd be interested if it made sense for potential users (and I count MediaWiki here) if Inkscape offered a command line option to allow specifying the "language preference" explicitly (it could always be added if the environment variables are not sufficient).

allowReorder

First of all, note allowReorder is not part of any version of the SVG specification.

Inkscape currently renders according to the SVG 1.1 spec. In this version of the spec the first matching object in the <switch> is rendered (even if it is not the "most preferable" language for the user).

SVG2 changed the spec: A <switch> is now always rendered as if the allowReorder attribute, defined in the SMIL specification, was set to 'yes'. The allowReorder attribute itself is still not part of the spec, though.

Incidentally we're just having a very similar discussion in the Inkscape project, and I believe I can clarify some things here:

Usage of system locale

Our current opinion is that usage of the system locale (for example $LANG variable of the form "de_DE.UTF-8" which holds a POSIX locale) is the most suitable thing to do for many applications:

  • The SVG spec simply states "Evaluates to "true" if one of the language tags indicated by user preferences is a case-insensitive match"
  • It does not say anything about how applications are supposed to enable the user to state their preferences.
  • Implementing something similar to how browsers allow users to set Accept-Language to "arbitrary" values certainly is *one* way to go, but is likely to be overkill for most applications.
  • Considering the system locale therefore is the most obvious way to derive the user's preferences.
  • If for example the user prefers "es_ES" locale it's only reasonable to present them with "es-ES,es" (in that order).

It is not clear what your goal is. Locale may be "the most suitable thing to do for many applications," but that is not what is being discussed here. SVG agents need to be able to set their language preference much like the HTTP Accept-Languages header sets preferences. If I want to set the SVG language preference to es-ES, then I do not want any processing such as "If for example the user prefers "es_ES" locale it's only reasonable to present them with "es-ES,es" (in that order)."

It is not a problem if an SVG agent at start up guesses that the user's default locale es_ES implies something like LANGUAGE=es_ES,es. That's a reasonable guess. But the current problem is whether librsvg can be told an explicit preference without garbling or extending it. There is a huge type conflict: librsvg wants locale string types and WMF wants to specify IETF langtag types. (Well, WMF is even confused there because there are also Wiki language tags that are different from IETF langtags.)

Overriding system locale

The observation above ("inkscape also uses the Unix $LANG environment variable") is not wrong but actually only captures a small part of what Inkscape uses. In fact Inkscape considers *all* locale-related environment variables, i.e. LANGUAGE, LC_ALL, LC_MESSAGES and LANG (and possibly even other native locale indicators depending on OS). For the inclined reader: Inkscape internally uses glib's g_get_language_names() for this.

The environment variable most people will want to use to override the system locale is therefore the much more suitable LANGUAGE, which even accepts a list of languages like LANGUAGE=es_ES:en and would only match "es-ES" in that case but not "es".

Unfortunately I believe MediaWiki currently does not allow setting LANGUAGE.

In any case we'd be interested if it made sense for potential users (and I count MediaWiki here) if Inkscape offered a command line option to allow specifying the "language preference" explicitly (it could always be added if the environment variables are not sufficient).

MediaWiki software currently only passes its PHP $lang argument through the LANG environment variable. That was done to make librsvg work. It is trivial to set other environment variables in the PHP rasterize() method. Setting one or more environment variables is not the correct method for resvg and is probably also incorrect for batik (both of which take command line arguments).

In the larger sense, setting the locale is the wrong thing to do. Say the program discovers an error and wants to log that error. To me, it should write the error in the system's language (say English) rather than the Chinese that it might be processing at the moment. I expect server logs to be in the local language and be independent of any language a user may have requested.

In Inkscape's case, imagine an English-speaking graphic artist who wants to look at the systemLanguage="zh-Hant" text to see if the spacing is OK. Does he really want his whole user interface turned into Chinese? Or does he just want the graphic to display in Chinese?

The preferred method for WMF is a command line argument that looks exactly like Accept-Languages. Initially, WMF would only use one langtag, but its langtags can be nonstandard (e.g., als, sr-EC, sr-EL, zh-Hans, and zh-Hant). Non-standard langtags probably do not survive a trip through locale string processing.

Interestingly, the librsvg locale code can take IETF langtags and possibly even an Accept-Languages string, so it should be easy for librsvg to add a command line argument. Gnome has an issue number for it.
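To make the Accept-Languages style argument concrete, here is a hedged Python sketch of how such a string could be turned into an ordered preference list. The function name is hypothetical and this is not librsvg's parser. Tags are kept verbatim, so nonstandard tags such as sr-EC or zh-Hant pass through unchanged instead of going through locale string processing.

```python
def parse_accept_language(header: str) -> list:
    """Parse an Accept-Language style string into tags ordered by preference.

    Entries are sorted by descending q-value; ties keep their original order.
    Entries with q=0 (explicitly not acceptable) are dropped.
    """
    entries = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0]
        if not tag:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as not acceptable
        if quality > 0:
            entries.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(entries)]


print(parse_accept_language("zh-Hant, en;q=0.5, de;q=0.8"))
```

With a single langtag, as WMF would initially use, the result is simply a one-element list.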

allowReorder

First of all, note allowReorder is not part of any version of the SVG specification.

Inkscape currently renders according to the SVG 1.1 spec. In this version of the spec the first matching object in the <switch> is rendered (even if it is not the "most preferable" language for the user).

SVG2 changed the spec: A <switch> is now always rendered as if the allowReorder attribute, defined in the SMIL specification, was set to 'yes'. The allowReorder attribute itself is still not part of the spec, though.

allowReorder processing snuck in during SVG 1.1. IIRC, Firefox had and obeyed the allowReorder attribute, but it did the langtag processing wrong. SVG 2.0 never had the attribute, but early versions made allowReorder processing optional.
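The difference between first-match selection and allowReorder-style selection can be sketched in Python as follows. This is only an illustration under assumptions: each <switch> child is represented by its systemLanguage value (None for a child without the attribute, which acts as a fallback), and all names are hypothetical rather than taken from any renderer.

```python
def matches(preference, tag):
    # Basic Filtering: equal, or preference is a prefix of the tag followed by "-".
    p, t = preference.strip().lower(), tag.strip().lower()
    return t == p or t.startswith(p + "-")


def child_accepts(preferences, system_language):
    if system_language is None:  # no systemLanguage attribute: always true
        return True
    tags = [t for t in system_language.split(",") if t.strip()]
    return any(matches(p, t) for p in preferences for t in tags)


def pick_first_match(children, preferences):
    """First-match behaviour: first child in document order whose test passes."""
    for index, system_language in enumerate(children):
        if child_accepts(preferences, system_language):
            return index
    return None


def pick_best_match(children, preferences):
    """Reorder behaviour: score each child by the best-ranked preference it
    satisfies and keep the best score; document order breaks ties."""
    best_index, best_score = None, None
    for index, system_language in enumerate(children):
        if system_language is None:
            score = len(preferences)  # fallback ranks below every explicit match
        else:
            ranks = [rank for rank, p in enumerate(preferences)
                     if child_accepts([p], system_language)]
            if not ranks:
                continue
            score = min(ranks)
        if best_score is None or score < best_score:
            best_index, best_score = index, score
    return best_index


children = ["en", "de", None]   # systemLanguage of each <switch> child, in order
preferences = ["de", "en"]      # user prefers German, then English
print(pick_first_match(children, preferences), pick_best_match(children, preferences))
```

With these inputs, first-match selection picks the English child because it comes first in the document, while the scoring approach picks the German child because German ranks higher in the user's preferences.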

OK, sorry for trying to clarify and help. Got the message, will keep out of the discussion going forward again.

@Patrick87

I for one found your insights useful, and would like you to feel encouraged to participate in this and other tickets of your interest.

From the Discussion on Inkscape about getting the Wikipedia-renderer

Martin Owens (the owner of the Inkscape project) raised the question of whether we would like SVG 2.0 support; SVG 2.0 is still an unofficial draft. Side note: validators would in most cases call SVG 2.0 files invalid, even where that does not influence rendering. (But validity is imho not something to aim for.)

According to Owens, that is a somewhat political question: if Wikimedia supports SVG 2.0 files, it is more likely that more renderers will support SVG 2.0, and it won't "end up being just another Inkscape SVG format" (like SVG 1.2, which imho will never be released).

Owens wrote that Inkscape is primarily an editor for SVG documents, and only secondarily an SVG generator for browsers. So it supports Inkscape features that are in neither the SVG 1.1 nor the SVG 2.0 DTD. For example, Inkscape uses <sodipodi:namedview pagecolor="#ffffff" inkscape:pageopacity="1"/> for creating a white background, but only <rect width="100%" height="100%" fill="#ffffff" sodipodi:insensitive="true"/> or <circle r="1e4" fill="#ffffff" sodipodi:insensitive="true"/> are supported by browsers/renderers. That is imho a good feature for SVG editors, but maybe not for SVG renderers.

I see the attitude of Wikimedians as being that we want SVG files that are editable by any software, not Inkscape files; that is one reason why we require free licenses even for file formats (e.g. we don't allow *.mp4). So I personally see Inkscape as an SVG renderer as problematic, because it supports features that are not defined by any SVG DTD (neither SVG 1.1 nor SVG 2.0), even though I know that Inkscape is under a free license.

Currently, of about 100 files on Commons that librsvg renders incorrectly, about 2 contain rendering-relevant SVG 2.0, so the importance of SVG 2.0 on Commons is currently negligible (note that this is even among the broken files, and librsvg 2.40 imho does not support any SVG 2.0). Supporting SVG 2.0 is imho not something to aim for, since we should stick to the current SVG 1.1 standard to have clear, unique rendering (even though browsers have some SVG 2.0 support). Supporting SVG 2.0 is imho not a bad thing either, although supporting these features is, strictly speaking, wrong (i.e. a bug) according to the current SVG 1.1 standard.

I think Inkscape sounds good for people who edit with Inkscape, but I think it is optimized for creating files, not for rendering. (Still, it is a good choice, imho at least compared to librsvg.)

I do not care (that much) whether we change to resvg or Inkscape (or similar); however, sticking with librsvg (even the current version is too buggy) just because we do not know how to decide is imho the wrong way to go.

I do not know whether headless browsers would also be a suitable solution.

My recommendation: maybe mass SVG rendering should be done by resvg (fast, SVG 1.1), with a flag so that specific images (e.g. with unofficial Inkscape features) get rendered by Inkscape. [Iff we want Inkscape files (invalid SVG) that can only be rendered/edited by Inkscape.]

OK, sorry for trying to clarify and help. Got the message, will keep out of the discussion going forward again.

Please don't. I'll echo @Krinkle here. I found your input useful and I learned some things from your comment, which I found very down to earth and clarifying.

It is not clear what your goal is.

Hello, Keep CoC and Phabricator etiquette in mind. Thank you.

Hello, Keep CoC and Phabricator etiquette in mind. Thank you.

In the Wikiverse, new anonymous newbies can change content without understanding Wikipedia, so many experienced users in the community are used to speaking quite bluntly (not something I like); otherwise the wiki would get crowded with wrong edits. I think that is one reason why many newbies find Wikipedia discussions generally quite harsh, even though we may have similar rules in :w:en:Wikipedia:Etiquette and :w:en:Wikipedia:No_personal_attacks. So I would not overinterpret the tone of Glrx (a Commons community member).

I find Patrick87's and Glrx's comments useful, and I hope you both stay in the discussion.


Coming back to the topic:

What I understood from today's discussion in T283083 is that this evaluation is stuck until Thumbor gets upgraded (T216815); before that we/WMF can't do "anything".

As far as I understood @AntiCompositeNumber, the upgrade is planned for this summer, and whether Thumbor can be upgraded this summer or later may depend on the workload of @Gilles and when he is able to work on Thumbor.

Is using an :w:en:AppImage, e.g. as provided by Inkscape, a possible solution? That would make the renderer version independent of the Debian version or any library version. As far as I know, AppImages (single executable file, distribution-independent) are portable and can be run on any Linux system without any prerequisites and without installing (no root permission needed).

So for Inkscape we could use the latest release, the latest development version, any older release, or even all of them alongside each other (e.g. to avoid regression bugs), independent of when/if we update Thumbor.

librsvg repo has been disabled and doesn't support node v12+ (https://github.com/2gis/node-rsvg/tree/0.7.0). I see we could switch to puppeteer. E.g. https://github.com/etienne-martin/svg-to-img as a replacement?

(Debian bullseye uses nodejs 12.22.5).

Even the repo it says to use hasn't received an update since 2019...

librsvg repo has been disabled and doesn't support node v12+ (https://github.com/2gis/node-rsvg/tree/0.7.0). I see we could switch to puppeteer. E.g. https://github.com/etienne-martin/svg-to-img as a replacement?

(Debian bullseye uses nodejs 12.22.5).

Even the repo it says to use hasn't received an update since 2019...

We don't use that. Thumbor is written in Python (2, we know), but we shell out to rsvg-convert anyway. Librsvg is written mostly in Rust now, but the version currently in production is still C. Upstream is https://gitlab.gnome.org/GNOME/librsvg, packaged as https://packages.debian.org/stretch/librsvg2-bin.

librsvg repo has been disabled and doesn't support node v12+ (https://github.com/2gis/node-rsvg/tree/0.7.0). I see we could switch to puppeteer. E.g. https://github.com/etienne-martin/svg-to-img as a replacement?

(Debian bullseye uses nodejs 12.22.5).

Even the repo it says to use hasn't received an update since 2019...

We don't use that. Thumbor is written in Python (2, we know), but we shell out to rsvg-convert anyway. Librsvg is written mostly in Rust now, but the version currently in production is still C. Upstream is https://gitlab.gnome.org/GNOME/librsvg, packaged as https://packages.debian.org/stretch/librsvg2-bin.

Mathoid uses it though. I presumed that's what this task was about.

librsvg repo has been disabled

No it has not been. That's an unrelated repo.

Mathoid uses it though. I presumed that's what this task was about.

Indeed. Please file a different task, though, tagged Mathoid and Platform Engineering, and subscribe @Physikerwelt as well. If a dependency is abandoned, a replacement needs to be found/written. Otherwise we'll eventually need to disable that functionality.

Mathoid uses it though. I presumed that's what this task was about.

Indeed. Please file a different task, though, tagged Mathoid and Platform Engineering, and subscribe @Physikerwelt as well. If a dependency is abandoned, a replacement needs to be found/written. Otherwise we'll eventually need to disable that functionality.

T247697: Rethink mathoids SVG to PNG conversion already exists as a subtask of this one, which may have been the cause of the confusion. I'm not sure it really should be, as Mathoid is self-contained (producing both the SVG and PNG output) and doesn't have to use the general-purpose renderer.

Mathoid uses it though. I presumed that's what this task was about.

Indeed. Please file a different task, though, tagged Mathoid and Platform Engineering, and subscribe @Physikerwelt as well. If a dependency is abandoned, a replacement needs to be found/written. Otherwise we'll eventually need to disable that functionality.

T247697: Rethink mathoids SVG to PNG conversion already exists as a subtask of this one, which may have been the cause of the confusion.

I see, my mistake, I missed that. Thanks for pointing it out. I did leave some comments on that task. Let's not hijack this task any further for Mathoid's SVG functionality.