Security review new version of the Vega lib
Closed, ResolvedPublic
Actions

Assigned To

Authored By

	Yurik
	Aug 9 2017, 7:04 PM

Description

The new version of the Vega library (3.0) underwent a significant rewrite by University of Washington folks, and needs to have a security review in order to update the graph extension and the Graphoid service. Benefits include much more powerful visualization primitives, better map support, many minor cleanups.

Main site: https://vega.github.io/vega/
Github: https://github.com/vega/vega

Related Objects
Search...

Status	Subtype	Assigned	Task
Open	BUG REPORT	CCiufo-WMF	T334940 All Graphs broken on Wikimedia wikis (due to security issue T336556)
Resolved	Security	Jdlrobson	T334895 XSS via Graph extension
Duplicate		None	T292341 [Epic] Graphs extension maintenance
Declined		None	T335128 Update documentation and examples of Extension:Graph after deployment of Vega 5
Declined		None	T346291 Re-enable the Graph Extension for use at all Wikimedia Wikis
Declined	Feature	Jdlrobson	T165118 Support Vega 5.0+
Resolved		Bawolff	T172938 Security review new version of the Vega lib
Declined		None	T182536 Fix security issues found in Graphs extension during review of vega 2

Event Timeline

Yurik created this task.Aug 9 2017, 7:04 PM

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptAug 9 2017, 7:04 PM

Yurik added a project: deprecated-security-team-reviews.Aug 9 2017, 7:26 PM

Yurik mentioned this in T147347: Migrate Kartographer to Leaflet 1.x.Aug 9 2017, 9:22 PM

Bawolff subscribed.Oct 4 2017, 5:03 PM

• dpatrick mentioned this in E755: Security review new version of the Vega lib.Oct 12 2017, 3:09 PM

• dpatrick moved this task from Incoming to Scheduled on the deprecated-security-team-reviews board.

@Yurik I assume this library will have a similar wrapper to graph2.js that graph currently does? And it will be used both locally and in a node.js service?

@Bawolff correct, and it will be loaded side-by-side on nodejs, and hopefully - in the client-preview. Without side-by-side, preview will only show graphs of one version.

Base added a parent task: T165118: Support Vega 5.0+.Oct 16 2017, 10:44 PM

Some of the things I'm going to mention also apply to the existing vega 2 graphs

Applying to the current vega 2 wrapper

For CORS, in the interest of paranoia, could we set the origin parameter to be * instead of the actual origin. This wasn't available when the wrapper was originally written, but it extra ensures that cross-domain requests won't use cookies.
Currently the title parameter is checked to make sure it doesn't have a | (pipe) character in it. We also need to check for \x1f (That is ASCII control character UNIT SEPARATOR) as the api also uses that to separate titles. Alternatively we could validate with mw.Title
wikititle:/// is supposed to prevent query paramters from being used, however it could probably be bypassed if they are percent encoded due to T96274 (e.g. https://en.wikipedia.org/wiki/Main_Page%3faction=history%26curid=2120 is interpreted by our servers incorrectly )

For vega 3 library

So mostly this appears good. The same url restrictions that are used with vega2 would of course have to be used here. There are some concerning features though:

bind.element can be used to place text boxes in arbitrary parts of the page (with the data readable by vega). This could maybe be used to trick users
data.on.source - vega supports capturing various events. Including keyup and keydown, even for arbitrary dom elements outside of the vega drawing. This could potentially be used to make a keylogger in vega.
expressions are allowed to access the global screen js object. There's some privacy implications of this, revealing the users screen size. But its minor, so probably ok, not 100% sure.
This is a very large library. Well it looks well made, I certainly haven't manually audited all 35,593 lines

With that in mind, I would like to propose a sandboxing approach. How feasible would it be to place the vega output into an iframe like <iframe sandbox="allow-scripts allow-top-navigation" frameborder=0 width=500 height=500 srcdoc="vega graph here"> . I think that would address concerns about vega potentially spying on events for dom nodes unrelated to it, and also make the entire thing much more robust from a security prespective. (I think IE might need to do something different than srcdoc but that work arounds for it exists)

There's one other issue: fill and stroke - appear to support url(http://foo.com/bar.svg#pattern) which may allow loading external resources (in violation of the privacy policy). I haven't managed to fully reproduce this though so I'm unsure. If it was possible to load an external resource that way, that would be bad.

There's one other issue: fill and stroke - appear to support url(http://foo.com/bar.svg#pattern) which may allow loading external resources (in violation of the privacy policy). I haven't managed to fully reproduce this though so I'm unsure. If it was possible to load an external resource that way, that would be bad.

Ok, so I tracked down fill using external resources (Surprisingly difficult to find docs on this). Seems like firefox is the only one who really supports this, and only for resources from the same origin https://web.archive.org/web/20120318020837/https://developer.mozilla.org/web-tech/2008/10/10/svg-external-document-references/ (Yes, there's so little docs on this I had to go to archive.org to find the feature announcement :s)

But basically, if vega is in svg mode (canavas mode is not affected) you can make arbitrary GET requests to the same origin, in firefox. Since this is restricted to the same origin, its much less of a privacy issue. Could still be used as a data-exfiltration method via caching headers, but that's the same as all the other url access stuff.

Also the sandbox approach suggested above would make it not run in the same origin, which in theory should totally disable that feature of svgs (yay!).

Which brings me to another point - if we do the sandbox thing, it will mean all AJAX requests will be CORS, even to the local domain. I think that's acceptable (arguably perhaps even better than "Treat-as-untrusted" although we may still want that header hack in case any browsers don't support the sandbox).

• dpatrick triaged this task as Medium priority.Oct 18 2017, 5:21 PM

@Bawolff I did a small standalone test with an iframe and strict CSP rules (no-eval). The sample actually sends Vega data from the main window into the frame via the window.postMessage(), but it could be done via the URL param as well. The postmessage passing would be useful for page save preview, whereas the url-param case is good when iframe can load the graph directly via an api call.

Code: https://github.com/nyurik/csp-vega-test
Try it at: http://sophox.org/csp/sandbox/index.html

P.S. Doing it via iframe means that the iframe src must point to some endpoint that will serve the needed html, javascript, and CSS. Which means the core MW or custom API may need to support this usecase. The iframe does not need to load the entire MW JS codebase. At the minimum, it should load:

some tiny HTML, which might actually be static
some loader that initailizes Vega with the graph data
the entire Vega library (vega, vegalite, ...) - minified or not depending on the debug= param

BTW, the same method can be used even for user-submitted JavaScript - e.g. d3 visualizations - iframe sandbox should prevent it from accessing other domains (custom iframe CSP?), and by virtue of sandboxing - perform any other undesirable actions. Moreover, it might even make sense to put the entire content portion into the iframe in the future.

In T172938#3894690, @Yurik wrote:

P.S. Doing it via iframe means that the iframe src must point to some endpoint that will serve the needed html, javascript, and CSS. Which means the core MW or custom API may need to support this usecase. The iframe does not need to load the entire MW JS codebase. At the minimum, it should load:

some tiny HTML, which might actually be static

some loader that initailizes Vega with the graph data

the entire Vega library (vega, vegalite, ...) - minified or not depending on the debug= param

BTW, the same method can be used even for user-submitted JavaScript - e.g. d3 visualizations - iframe sandbox should prevent it from accessing other domains (custom iframe CSP?), and by virtue of sandboxing - perform any other undesirable actions. Moreover, it might even make sense to put the entire content portion into the iframe in the future.

Putting a doc in the src parameter is one method but an alternative that might be easier is the srcdoc attribute (some issues on older MSIE but i believe there are work arounds)

BTW, the same method can be used even for user-submitted JavaScript - e.g. d3 visualizations - iframe sandbox should prevent it from accessing other domains (custom iframe CSP?), and by virtue of sandboxing - perform any other undesirable actions.

Indeed there are possibilities there. Ive sometimes heard people discuss using that approach to support scripted svgs (to add support to flash applet type support to wikimedia). However I wouldnt want this to be our only layer of defense when dealing with potentially malicious content, but more one part of the defense in case another layer breaks.

@Bawolff are there any CSP issues with srcdoc? In my example, parent can have a different CSP rules than iframe. This way you can enable unsafe-eval in the frame, without allowing it in the parent.

@Bawolff, yes, I just checked, the srcdoc attribute does not allow you to relax CSP rules. So if the site blocks unsafe-eval, <iframe srcdoc=" ... <meta content-security-policy=...>"> cannot relax it. Specifying the src= allows it because the CSP comes via the header for the different HTML page.

(As an aside see also T169027 )

Legoktm removed a project: MediaWiki-Vendor.Sep 14 2018, 1:25 AM

akosiaris mentioned this in T211881: graphoid: Code stewardship request.Dec 13 2018, 2:06 PM

• chasemp moved this task from Waiting to Incoming on the deprecated-security-team-reviews board.Dec 17 2018, 6:38 PM

• charlotteportero moved this task from Incoming to Frozen on the deprecated-security-team-reviews board.Jan 7 2019, 6:25 PM

In T172938#3894863, @Yurik wrote:

@Bawolff are there any CSP issues with srcdoc? In my example, parent can have a different CSP rules than iframe. This way you can enable unsafe-eval in the frame, without allowing it in the parent.

Well ideally we would not have unsafe-eval on either...but its not like we're removing unsafe-eval on WMF websites anytime soon.

so I think we've done everything on our end. See my comment above for my comments on the library - please address those things before using vega 3. Let me know if you have any questions.

sbassett moved this task from Frozen to Done on the deprecated-security-team-reviews board.Apr 24 2019, 6:05 PM

@Bawolff is there anything in this ticket that is sensitive? There is a discussion about GSOC student tackling the Vega 3 upgrade, and they would need access to this discussion.

Bawolff changed the visibility from "Custom Policy" to "Public (No Login Required)".May 8 2019, 3:45 PM

Bawolff mentioned this in T222806: Security Review for Vega 5 and Vega-Lite JavaScript Libraries.May 8 2019, 4:24 PM

Bawolff mentioned this in T222807: Sandbox Graph extension into an iframe.

@Bawolff I was re-reading our notes in light of the resumed interest in this project. Assuming we are not yet rolling out the sandbox approach per our discussion, I have began looking closely at the issues you raised:

bind.element can be used to place text boxes in arbitrary parts of the page (with the data readable by vega). This could maybe be used to trick users

Essentially this allows any <div> to become a container for input fields. This is very useful for creating interactive graphs like job voyager -- that input field at the bottom could actually be placed somewhere else in the wiki page, with a nice styling around it, so that users can type in a job name and the graph would show just that. We could for now simply throw an error if the vega graph contains any signals.bind.element values, or we could even just delete them -- Vega will place those inputs next to the graph itself (like the example above).

data.on.source - vega supports capturing various events. Including keyup and keydown, even for arbitrary dom elements outside of the vega drawing. This could potentially be used to make a keylogger in vega.

How dangerous is this, considering that 1) AFAIK, login happens in a separate window, not while Vega is loaded, and 2) Vega has no access to the outside hosts, thus it cannot transmit that data anywhere. I might of course be missing some other case?

expressions are allowed to access the global screen js object. There's some privacy implications of this, revealing the users screen size. But its minor, so probably ok, not 100% sure.

true, but same question - if Vega cannot transmit this info anywhere, is this a concern?

This is a very large library. Well it looks well made, I certainly haven't manually audited all 35,593 lines

heh, and it just keeps growing :) Luckily Vega has now been in other high profile products, like Elastic's Kibana, so there are many more eyes on it now doing security auditing.

• chasemp removed a project: deprecated-security-team-reviews.Jan 8 2020, 4:16 PM

• chasemp added a project: Application Security Reviews.Jan 8 2020, 4:39 PM

• chasemp moved this task from Incoming to Our Part Is Done on the Application Security Reviews board.Jan 8 2020, 4:43 PM

• chasemp added a project: Security.Feb 10 2020, 10:58 PM

• chasemp removed a project: acl*security.Feb 20 2020, 8:17 PM

• chasemp added a project: secscrum.Mar 10 2020, 8:11 PM

• chasemp moved this task from Incoming to Our Part Is Done on the secscrum board.Mar 10 2020, 8:19 PM

Jdlrobson closed subtask T182536: Fix security issues found in Graphs extension during review of vega 2 as Declined.Apr 24 2023, 11:56 PM