Add support for inline SVG
Open, Needs Triage · Public · Feature

Assigned To
None
Authored By
John_M_Wolfson
Apr 10 2023, 5:39 AM
Referenced Files
None
Tokens
"Like" token, awarded by ShakespeareFan00. "Like" token, awarded by Sjoerddebruin. "Love" token, awarded by Bawolff. "Like" token, awarded by Bouzinac.

Description

Feature summary (what you would like to be able to do and where):
SVG, or at least some degree of HTML graphics, should be directly supported in wikitext. For example, instead of having to use a thumbnail for a pre-made image of an SVG circle, it should be possible to directly insert the circle into wikitext, like so:

This is a circle

<svg aria-label="The following is a black circle">
<circle cx="50" cy="25" r="50"/>
</svg>

This is consistent with wikitext support for inline HTML and CSS, and would be primarily useful in templates and modules, as will be discussed below.

Use case(s) (list the steps that you performed to discover that problem, and describe the actual underlying problem which you want to solve. Do not describe only a solution):

Many vector graphics, such as https://commons.wikimedia.org/wiki/File:44th_Canadian_Parliament.svg and https://commons.wikimedia.org/wiki/File:Eleventh_Jatiya_Sangsad.svg, would be best off templatized in some form, rather than having to create and upload images from scratch.

The main problem an attempted templatization currently runs into is that it could only output raw SVG, which is not currently supported. As such, in order to create such diagrams or maps, one has to use a text editor or Inkscape to either create a fresh graphic or recolor a previously uploaded one, upload it to Commons, and then link it from Commons. With a template (and, by extension, inline SVG), one could cut out such a middleman and make the maps directly in the source material, similar to what is already done on enwiki with Template:Graph:Chart.

Benefits (why should this be implemented?):
As largely described above, this would save users much work in having to create raw maps of niche subjects and "polluting" Commons with such maps; it would also, much like Graphs do currently, save users from a thumbnail and allow for direct insertion of graphics into prose.

I am aware the power this gives people might be too great and am fine with reasonable nerfing, perhaps by not allowing other namespaces such as xlink. Also, viewboxes would have to be sorted. I do, however, believe that the benefits greatly outweigh the costs and am slightly surprised that this does not already exist in some form.

See also:
https://www.mediawiki.org/wiki/Inline_SVG_use

Event Timeline

Restricted Application added a subscriber: Aklapper.

This is very unlikely to ever be implemented. Up to a few years ago, 90% of our security problems were coming from just SVG uploads. Embedding raw XML WITH javascript that originates from untrusted users is a gigantic additional surface for potential security issues, even with a sanitizer (it's difficult to know about the things you are NOT covering with a sanitizer).

Then maybe continue generating PNGs, just allow the SVG source to be defined inline in wikitext instead of a file on disk? Much like how Math and Score work today: input raw LaTeX/Lilypond in a parser tag (in our case <svg>), and have it rendered on the page. This is no more dangerous than uploading files, just more convenient. Of course, the checks that are run on uploading SVGs should also run when parsing the page, rejecting bad SVGs with an error message.

This would be more likely/feasible, though uploads are hosted on a separate domain with a separate security context that is not really reachable from wikitext. Doing the same here would probably require putting the definition into a separate namespace ([[Drawing:mygraph.svg]]) and content model, and then transcluding it as a template without parameters. That would be needed for sanitization, async rendering, caching, cache invalidation and versioning.

> This would be more likely/feasible, though uploads are hosted on a separate domain with a separate security context, that is not really reachable from wikitext.

Score also uses the separate domain: e.g. https://hu.wikipedia.org/wiki/A_szegedi_halast%C3%B3 transcludes the score from https://upload.wikimedia.org/score/p/d/pdkkh10jf6u8nadj3mf9mv0k48vssro/pdkkh10j.png. (Probably Math does the same, I haven’t checked.) Although I don’t really see the security impact – we’re talking about rendered PNGs, not SVGs.

> Doing the same here would probably require putting the definition into a separate namespace ([[Drawing:mygraph.svg]]) and content model, and then transcluding it as a template without parameters. That would be needed for sanitization, async rendering, caching, cache invalidation and versioning.

Not allowing parameters would likely mostly stop the feature from being used – often data is readily available in Wikidata, no one wants to maintain separate Drawing-namespace pages for each legislature we have election apportionments for, municipalities we have population data for, and so on. Also, if Score and Math can do it without requiring a separate namespace, why can’t SVG do it?

Implementing this would provide a relatively simple solution for T334940.

Regarding the Javascript question:

  • I think that allowing arbitrary javascript from users is definitely a bad idea.
  • What we could do is to provide a curated set of scripts (admin controlled) that SVG files could reference in their implementation. Any other javascript code would be stripped by the sanitizer. (Specifying how to identify "legal" calls is still difficult, and might require non-JS syntax that would allow calling existing libraries.) This could also then be applicable to all SVGs uploaded to Commons. (Is Javascript currently stripped from uploaded SVGs?)
  • Note that in some cases CSS might be sufficient for rendering animations and interactivity. For an example see this SVG which is directly linked from the Low Earth Orbit article.

How I foresee this working (with my limited working knowledge of HTML and CSS, my HTML assumptions may not apply to SVG):

  • The sanitizer strips all <script> tags from the generated SVG and adds an entry to an "SVG Management" script.
  • The SVG management script would then be in charge of importing any authorized scripts as needed.
  • All Javascript "function calls" to SVG management and authorized scripts would involve having the management script create custom HTML tags where the attributes would act as function calls.
  • If possible, custom attributes could be added to existing tags in order to inject event handlers.

This approach should hopefully resolve the Javascript security question by basically whitelisting tightly-controlled code while excluding arbitrary code. I would guess that the need for this code might be minimal and CSS might be able to handle the bulk of interactivity.

Maybe I am missing something, but why can we not just have SVG without any javascript/css/interactivity, just static SVG? If the javascript is the problem, let's remove it and keep only the SVG.

Static SVG would be a significant advance. Currently we can create bar charts and anything else that uses only rectangles in wiki pages using html <div> tags and inline css, but we cannot create line charts, pie charts or anything that needs shapes other than rectangles and vertical and horizontal lines. Maybe we can create some shapes with css tricks in templatestyles, but that is not practical and not as easy as SVG.

The only problem I can think of with static SVG is size: if someone tries, for example, to create a world map with all countries, the SVG can be very large. But it is simple to limit the size: if the total size of all SVGs on a page exceeds some predetermined limit, MediaWiki could show an error instead of the SVG. This would prevent someone from creating complex shapes like a world map, but would still allow simpler shapes like line and pie charts.

So, my proposal is to allow static, size-limited SVG, without any javascript or css that is not already allowed in wiki pages, and to leave more complex implementations for future discussions and developments.
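
A minimal sketch of the per-page size limit described above, written in Python purely for illustration (MediaWiki itself is PHP). The function name, the regular expression and the 20 kB budget are all assumptions; the thread does not propose a concrete number or implementation.

```python
import re

# Hypothetical per-page budget in bytes; no figure is proposed in the thread.
MAX_SVG_BYTES_PER_PAGE = 20_000

# Matches each inline <svg>...</svg> block (non-greedy, spanning newlines).
SVG_BLOCK = re.compile(r"<svg\b.*?</svg>", re.DOTALL | re.IGNORECASE)

def check_svg_budget(wikitext, limit=MAX_SVG_BYTES_PER_PAGE):
    """Return (ok, total_bytes) for all inline <svg> blocks in the page text."""
    total = sum(
        len(match.group(0).encode("utf-8"))
        for match in SVG_BLOCK.finditer(wikitext)
    )
    return total <= limit, total

page = 'Intro <svg width="10" height="10"><circle r="5"/></svg> outro'
ok, total = check_svg_budget(page)
if not ok:
    print("Error: inline SVG on this page exceeds the size limit")
```

Summing over the whole page rather than per graphic matches the wording of the proposal ("the size of all SVGs in some page").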

SD0001 subscribed.

Removing subtask as the inline SVG can be server-rendered. It doesn't need to render on the client and so doesn't need any more validation than we do today.

I used to be opposed to this, but I am now more a fan.

I think we should allow a small whitelist of useful SVG elements. We can keep it small to start with of just very useful things, and not allow anything that is complex or scary. Unlike with uploads where we basically have to support anything, we can limit this to a small subset of primitives.

Even if we just support <g> and <path> as a starting place, that opens up a bunch of potential applications. Basically I am advocating that we treat it like HTML, where we whitelist tags/attributes on a case by case basis.

I think SVG snippets + declarative animation via TemplateStyles has the potential to make very interesting interfaces on wiki, including applications similar to the old graph extensions.
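
To make the allowlist idea above concrete, here is a rough Python sketch (illustration only; it is not MediaWiki's Sanitizer and not code from any patch on this task). The `ALLOWED` table and the function names are hypothetical, and the starter set simply mirrors the `<g>` and `<path>` example.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical starter allowlist: element name -> permitted attributes.
ALLOWED = {
    "svg": {"width", "height", "viewBox"},
    "g": {"transform", "fill", "stroke"},
    "path": {"d", "fill", "stroke", "stroke-width"},
}

def split_tag(tag):
    """Split ElementTree's '{namespace}name' form into (namespace, name)."""
    if tag.startswith("{"):
        namespace, _, name = tag[1:].partition("}")
        return namespace, name
    return "", tag

def sanitize(element):
    """Return a cleaned copy of element, or None if the element is not allowed."""
    namespace, name = split_tag(element.tag)
    if namespace not in ("", SVG_NS) or name not in ALLOWED:
        return None
    clean = ET.Element(name)
    for key, value in element.attrib.items():
        # Namespaced attributes (e.g. xlink:href) and unknown ones are dropped.
        if not key.startswith("{") and key in ALLOWED[name]:
            clean.set(key, value)
    for child in element:
        kept = sanitize(child)
        if kept is not None:
            clean.append(kept)
    return clean

source = (
    '<svg width="10" height="10" onload="alert(1)">'
    '<script>alert(1)</script>'
    '<g><path d="M0 0L5 5" onclick="x()"/></g></svg>'
)
cleaned = ET.tostring(sanitize(ET.fromstring(source)), encoding="unicode")
print(cleaned)
```

The sketch rebuilds the tree from allowed pieces instead of deleting known-bad ones, so anything not explicitly listed is dropped by default. Note that Python's `xml.etree` is not hardened against malicious XML such as entity-expansion attacks, so a real implementation would need a parser chosen with untrusted input in mind.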


> Removing subtask as the inline SVG can be server-rendered. It doesn't need to render on the client and so doesn't need any more validation than we do today.

There are 3 separate versions of this:

  • render server side to a png
  • render client side in an <img> tag (Either data: url, or reference the svg)
  • output the SVG elements directly in the HTML

The first 2 are the safest, but the last one opens up the most applications and is by far the most useful.

Hmm, I wonder if we should split that into another task. Server-side seems straightforward and uncontroversial, requiring few changes. It's admittedly not as useful but should be a net positive over what we have today. I would have raised a patch already but I just don't have the contiguous free time.

Why would the third one be “by far” the most useful? I’d think most use cases can be satisfied by either of the first two, especially if TemplateStyles is allowed within the SVGs. Of course, there are some cases that can be satisfied only by the third option (say, using CSS variables defined in wikitext), but they aren’t that important.

I think because then you can interlace the SVG paths with other web elements, including wikitext. This also allows links, tooltips and hover effects. I feel like it also makes positioning the stuff in an integrated way easier.

I don’t imagine these advanced things being used much. While the third option would be by far the most powerful, I don’t think it would be by far the most useful.

You could do scripts (think animations) when an SVG is embedded in HTML... And that is also why that would be most dangerous. Animations and onhover/onclick would be quite useful in so many ways. You could create simple graphs or even advanced graphs with enough time and Lua expertise... But I don't see this happening (Stux mentioned this above: users should not be able to freely craft JS executed by everybody).

Using SVG in img tag is safe as it doesn't run script, but it also doesn't play animations. This seems like the easiest way to go though. You don't need any sanitization of inner scripts, the browsers already do that.

BTW. Might be worth dropping rendering SVG as PNG from Commons too. That way we would get rid of problems like T358438 (which are more likely to be solved in browsers than in Commons/MediaWiki).

SVG is like HTML - there are dangerous parts but it's not that big a deal. We allow <div> in wikitext but not <script>; we could potentially do the same thing with SVG.

> You could do scripts (think animations) when an SVG is embedded in HTML... And that is also why that would be most dangerous

Sure, but that is not the only thing. Think also of links, selecting/highlighting text, accessibility (a11y) concerns, procedural (css) animations, css ( :hover ) style mouseover stuff. Additionally, inline SVG can be easier to position with wikitext content than images are, if you are making interface elements.

Obviously scripts would be even more powerful (and much more dangerous). I think there are some arguments for it (I did some experiments with Extension:monstranto), but there are middle grounds here that are much less dangerous.

A silly question. Apologies if it was already mentioned before and I missed it.
Where exactly will the license be? And how exactly will it be checked, and by whom, to avoid someone copy-pasting a copyrighted SVG file inline, such as a company logo?

This comment was removed by IKhitron.

Inline SVG, being part of the wikitext, should be CC BY-SA 4.0, just like other parts of the wikitext. It’s mainly useful for graphs that visualize data specified in template parameters or Wikidata; it shouldn’t replace SVG uploads. Logos should continue to be uploaded as before, and logos included inline should be treated like any other form of copyright violation. (I think it will mainly be used through meta-templates and meta-modules, not directly, so a simple search for insource:svg insource:/\<svg/ will have very few false positives. So checking can happen through, for example, simple searches, bots, AbuseFilter or Toolforge tools.)

Yep. Exactly. There are a lot of people who upload files with wrong licenses onto local wikis and Commons. There is a dedicated group that is supposed to stop them, VRT. There will be as many people, or maybe even more, who do it inline. Who will recognize and stop them? Or are the VRT groups supposed to do much more work, and to learn how to do this at all?

I don’t think there will be the same count inline: first, I don’t imagine this feature to be as obvious as the file upload button/link; and second, it can be used only by people who know how to copy the source code of an SVG (or know that an SVG has a source code at all, i.e. that it’s a textual format).

And as I wrote, I don’t imagine this being used directly a lot, so every time <svg appears in the wikitext of a page, that’s a red flag. This means that automated tools (whether AbuseFilter, a bot or a Toolforge tool) can create backlogs with very few false positives (or false positives hardcoded to be ignored). Based on the backlogs, people who usually fight copyright violations (administrators, patrollers etc.) can decide what to do. But even without automated tools, just running a search every now and then can make the backlog manageable (what “every now and then” means, of course, depends on the actual amount of such vandalism on the given wiki).

(By the way, I don’t think it’s VRT’s job to stop copyright violation: VRT’s job is to handle incoming permissions, which requires access to VRTS. Discovering files without permission doesn’t require access to VRTS.)
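
The "every `<svg` in wikitext is a red flag" check described above could look roughly like this. It is a hypothetical Python sketch: the function name, the `ignore` parameter and the sample page titles are made up for illustration, and a real bot would fetch page text through the wiki's API or search instead of a dictionary.

```python
import re

# An opening <svg tag anywhere in the page source, case-insensitive.
SVG_OPEN = re.compile(r"<svg\b", re.IGNORECASE)

def pages_with_inline_svg(pages, ignore=()):
    """Return sorted titles whose wikitext contains '<svg'.

    pages:  mapping of page title -> wikitext
    ignore: titles known to be legitimate (e.g. meta-templates and modules)
    """
    return sorted(
        title
        for title, text in pages.items()
        if title not in ignore and SVG_OPEN.search(text)
    )

sample_pages = {
    "Template:Parliament diagram": '<svg viewBox="0 0 10 10"></svg>',
    "Some company": 'Logo: <SVG width="100"><path d="M0 0"/></SVG>',
    "Plain article": "No graphics here, only a mention of svgs.",
}
backlog = pages_with_inline_svg(sample_pages, ignore={"Template:Parliament diagram"})
print(backlog)
```

The `ignore` set plays the role of the "false positives hardcoded to be ignored" mentioned above; whatever remains is the backlog for human review.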

I worked on this during the hackathon and came up with a hacky PoC. Will upload it on gerrit in a while.

Change #1043111 had a related patch set uploaded (by SD0001; author: SD0001):

[mediawiki/core@master] [WIP] Support for inline SVG in wikitext

https://gerrit.wikimedia.org/r/1043111

Have raised the basic WIP patch which embeds the rasterized PNG as a base64 blob in a data:image URL. I could use some help in figuring out how to store the png in the file repo and have the parser generate just a link to that.

Test wiki created on Patch demo by SD0001 using patch(es) linked to this task:
https://patchdemo.wmflabs.org/wikis/d1a455521c/w

Patch demo doesn't seem to have imagemagick installed. So it doesn't work over there.

I would suggest making this an extension. I think this would be a very hard sell as a MW core feature. (Of course, being realistic, it's still going to be a hard sell getting the extension deployed to WMF.)

Change #1043217 had a related patch set uploaded (by Eccenux; author: Eccenux):

[mediawiki/core@master] PoC: Embeded to inline SVG

https://gerrit.wikimedia.org/r/1043217

Change #1043223 had a related patch set uploaded (by Eccenux; author: Eccenux):

[mediawiki/core@master] PoC: Embeded to inline SVG

https://gerrit.wikimedia.org/r/1043223

Change #1043217 abandoned by Eccenux:

[mediawiki/core@master] PoC: Embeded to inline SVG

Reason:

https://gerrit.wikimedia.org/r/1043217

Change #1043240 had a related patch set uploaded (by Eccenux; author: Eccenux):

[mediawiki/core@master] PoC: embedded to inline SVG

https://gerrit.wikimedia.org/r/1043240

Change #1043223 abandoned by Eccenux:

[mediawiki/core@master] PoC: Embeded to inline SVG

Reason:

Doesn't work

https://gerrit.wikimedia.org/r/1043223

Change #1043243 had a related patch set uploaded (by Eccenux; author: Eccenux):

[mediawiki/core@master] PoC: embedded to inline SVG

https://gerrit.wikimedia.org/r/1043243

Change #1043240 abandoned by Eccenux:

[mediawiki/core@master] PoC: embedded to inline SVG

Reason:

typo

https://gerrit.wikimedia.org/r/1043240

Sorry about the noise... 🙈 Finally got this right though. BTW. @SD0001 I think you have a typo in your rendering of attributes ;)

So this works (as a proof of concept at least):
https://patchdemo.wmflabs.org/wikis/0e24852599/wiki/Main_Page

What I didn't expect is that you can actually live edit SVG images out of the box in VisualEditor. When you, for example, change colors, you get a live preview of your changes. Pretty neat.

It should be safe out of the box, as img-src is designed to be secure. Browsers ensure that no scripts are executed.

There is one weird issue: the SVG disappears when not modified (you edit with VE; modify 1 of 2 SVG; check diff). So I guess some change in the VE configuration or something extra might be needed.
Embedded SVG editing in VisualEditor bug?

This still works great for me as a PoC. There are some things to work on still:

  • ✅Definitely should render width and height as the width and height of the img tag. Otherwise, you would have to scale elements of SVG, which is not ideal.
  • ✅Would need some unit tests, probably.
  • ✅Code cleanup.
  • ✅Need some decent error reporting or just remove the current debug... Not sure what's the best route here.
  • ❌❔Should we add SVG in figure/figcaption tags? Maybe as an option? Or would we assume templates/modules will do that?
  • ❌⏱Figure out the VE problem mentioned above.
  • ❌Does this need a separate cache? Not sure. Doing base64 is most of the work there. Otherwise, it's just browser work.

But it should be usable even in its current form.
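
For readers unfamiliar with the img-based approach discussed above, this is roughly what "SVG source as a base64 data: URL in an img tag, with width and height on the img" amounts to. It is a Python illustration under assumed names, not the PHP code of the Gerrit change; `svg_to_img_tag` and `example_svg` are hypothetical.

```python
import base64
import html

def svg_to_img_tag(svg_source, width, height, alt=""):
    """Wrap SVG source in an <img> tag using a base64 data: URL."""
    payload = base64.b64encode(svg_source.encode("utf-8")).decode("ascii")
    return (
        '<img src="data:image/svg+xml;base64,{}" width="{}" height="{}" alt="{}">'
    ).format(payload, int(width), int(height), html.escape(alt, quote=True))

example_svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">'
    '<circle cx="50" cy="50" r="40"/></svg>'
)
print(svg_to_img_tag(example_svg, 100, 100, alt="A black circle"))
```

Putting the dimensions on the `<img>` element lets the browser scale the graphic, so the SVG's own elements do not have to be rescaled.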

> I would suggest making this an extension. I think this would be a very hard sell as a mw core feature. (Of course, being realistic, it's still going to be a hard sell getting the extension deployed to WMF)

With Wikimedia deployment being the end goal, I suppose putting it in core behind a feature flag would cause less friction than a whole new extension?

> It should be safe out of the box, as img-src is designed to be secure. Browsers ensure that no scripts are executed.

As for rasterized vs native load, should we honour the SVGNativeRendering option? It is disabled by default and in the Wikimedia cluster, for reasons I'm not sure about.

Change #1043323 had a related patch set uploaded (by Eccenux; author: Eccenux):

[mediawiki/core@master] PoC: embedded to inline SVG (clean)

https://gerrit.wikimedia.org/r/1043323

> With Wikimedia deployment being the end goal, I suppose putting it in core behind a feature flag would cause less friction than a whole new extension?

I personally disagree, but I could be wrong. Just my 2 cents.

Change #1043243 abandoned by Eccenux:

[mediawiki/core@master] PoC: embedded to inline SVG

Reason:

https://gerrit.wikimedia.org/r/1043243

I think the PoC is cleaned up and ready to go.
https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1043323

SD0001 mentioned adding parser tests, but because base64 is involved, I'm not sure if that is something worth exploring. Both writing and reading such tests would be hard. Also, I don't feel comfortable writing parser tests in that text format. I don't really know if setting up articles in these tests is required or not. I don't know if I need to run Parsoid tests or if that is just nice to have, etc. Having said that, I did add unit tests, and I think all paths are already covered by them. So now that unit tests are here, maybe parser tests are not needed?

> It should be safe out of the box, as img-src is designed to be secure. Browsers ensure that no scripts are executed.

> As for rasterized vs native load, should we honour the SVGNativeRendering option? It is disabled by default and in Wikimedia cluster for reasons I'm not sure about.

Good question. Reading the T208578 task, I get the impression that wgSVGNativeRendering and wgSVGNativeRenderingSizeLimit are designed for images used on Commons (for current problems with them). That means these options result from the fact that we already have a complex situation with many different SVGs. Different SVGs can have a lot of elements (tags, path points) and render slowly. Some SVGs on Commons also contain translations. So, a full switch to SVG is simply unrealistic there.

Here however, we have a significantly different situation, because the nested SVG in the article does not exist yet. So we have the opportunity to start fresh and avoid some mess with complex SVG. There is also no problem with translations through mechanisms rendered in PNG. So I think wgSVGNativeRendering does not apply here.

This is the last demo with SVG enabled by default:
https://patchdemo.wmflabs.org/wikis/604876b5e5/wiki/Main_Page

The SVG embedding feature will be blocked by the new option. As SD0001 suggested, this is done in a similar way to allowing raw <html>.

So the new option is $wgEmbeddedSvg:

  • default: false
  • description: Allow adding embedded SVG in "<svg>...</svg>" tags rendered as an inline img tag. This is safe as any inline SVG is secure by default. Browsers do not allow interactivity in img tags per HTML specification.
  • Enabling embedded SVG: $wgEmbeddedSvg=true

Change #1043111 abandoned by SD0001:

[mediawiki/core@master] [WIP] Support for inline SVG in wikitext

Reason:

in favour of Ida4cc0f8f57d4a34b61460384c3902ed53b8d67e

https://gerrit.wikimedia.org/r/1043111

To demo the use case mentioned in the ticket, which is enabling templatization of SVGs, we can use some SVG-writing Lua library. https://gitlab.com/hansonry/luasvgwriter is the one I found in 5 minutes of searching.

On patchdemo, setting it up as Module:SVGWriter, I created Module:Circle svg to produce circles of customisable colour. Example uses are shown on Templated SVG test.
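
As an analogy for what such a module does, here is a Python sketch of a parameterised circle generator. It is not the Lua code of Module:Circle svg or of luasvgwriter; the function name, its defaults and the choice to escape the colour value are assumptions made for illustration.

```python
from xml.sax.saxutils import quoteattr

def circle_svg(colour, radius=40):
    """Return SVG source for a filled circle of the given colour and radius."""
    radius = int(radius)
    size = 2 * radius
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" '
        'width="{s}" height="{s}" viewBox="0 0 {s} {s}">'
        '<circle cx="{r}" cy="{r}" r="{r}" fill={fill}/>'
        "</svg>"
    ).format(s=size, r=radius, fill=quoteattr(colour))

print(circle_svg("steelblue"))
print(circle_svg("#cc0000", radius=10))
```

`quoteattr` adds the surrounding quotes and escapes the value, so a template parameter cannot break out of the `fill` attribute.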

As far as I can see it's impossible to have a lua module insert an svg snippet into a wiki article.

Of course it’s impossible right now – this is why this task exists. https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1043323 will make it possible.