Page MenuHomePhabricator

d3 and Vega dependencies should be managed by foreign-resources.yaml file
Closed, ResolvedPublic

Description

Currently the Graph extension has dependencies on the d3 and Vega libraries. These are currently static files inside the code repository meaning anyone can edit them or swap them out with untrusted / unaudited files.

TODO

  • These libraries should be managed by a foreign-resources.yaml file. It should be possible to run a script and verify the files match the published versions.
  • The Growth team has a more modern custom d3.js build that has gone through a security review. We should use that if possible.
  • Update to Vega 5.25 which has the latest ES5 build.
  • Vega includes D3 so there is no need to provide our own D3 library.

Sign off notes

Opened T335519 for followed up work.

Notes

Formerly considered blocked on T330508

Event Timeline

Change 910593 had a related patch set uploaded (by Jdlrobson; author: Jdlrobson):

[mediawiki/extensions/Graph@master] Manage foreign resources via maintenance script

https://gerrit.wikimedia.org/r/910593

Jdlrobson triaged this task as High priority.
Jdlrobson updated Other Assignee, added: Tgr.
Jdlrobson added subscribers: sbassett, Tgr.

@sbassett would it be preferable for Graph to use the exact same custom build as GrowthExperiments or to continue to use the existing NPM version from 3.5.17 ?

@Tgr can the maintenance script be updated to pull from GrowthExperiments if need be?

I think the options are:

  1. Don't use foreign-resources.yaml to manage d3, do it like GrowthExperiments (T321215: Use d3 subpackages instead of the full bundle): declare the relevant d3 modules as npm dependencies, use rollup to build (or simply copy over the result of doing that process in GrowthExperiments). Possibly in core so that Graph and GrowthExperiments can share the same ResourceLoader module (wouldn't make a huge difference as one uses d3 in article space and the other on a special page, but d3 is an ubiquitous library and it seems likely that more things will use it eventually). Optimal performance, happens to work on our supported browsers (although there is no reason it should and further changes might introduce unsupported syntax into our custom subset of d3 as well), but no audit trail of third-party library changes and dependency on a JS tool.
  2. Same but use foreign-resources.yaml to pull the d3 source(s) and use rollup to build those. Fixes the audit trail issue (although we'd still have to trust rollup itself).
  3. Use the full d3 library. Can be done with foreign-resources.yaml, but does not work with our minifier so we'd have to fetch the minified code and exclude it from ResourceLoader minification (911326) which I think mostly moots the benefits of having an audit trail. In terms of browser support it does seem to meet our requirements.

Given the plans for T279108: Introduce a Front-end Build Step for MediaWiki Skins and Extensions I don't think we want to put in much effort now (although all three of these are low-effort). If we go with #1 we can introduce a documentation-only foreign-resources.yaml entry type so that file can still be used for documentation.

cc @kostajh and @Sgs who have been thinking about this for GrowthExperiments.

So in c910593, we're doing (2) above but currently without the rollup build step? There doesn't seem to be a huge difference between (1) and (2) AFAICT since they both depend upon rollup, which I think we're generally fine using, especially given the far-worse alternatives.

So in c910593, we're doing (2) above but currently without the rollup build step? There doesn't seem to be a huge difference between (1) and (2) AFAICT since they both depend upon rollup, which I think we're generally fine using, especially given the far-worse alternatives.

That patch is currently doing (3) but without fixing the minifier bug (you can see that it removes the /*@nomin*/ from lib/vega/vega.js; that prevents the module from loading as JSMin doesn't understand the syntax).

That patch is currently doing (3) but without fixing the minifier bug (you can see that it removes the /*@nomin*/ from lib/vega/vega.js; that prevents the module from loading as JSMin doesn't understand the syntax).

Ok, so that depends on c911326 getting merged (seems close?) and that should support at least an acceptable nomin process? Regarding an audit trail, I think I (and Security-Team) would mostly be concerned about dependencies being discoverable in more standard, machine-readable ways via package and lock files, even if such files weren't used as part of an official commit or build step, but merely matched the current dependencies in-use. That way things like libup and other SCA tools we use would still be able to find things like this automatically for better vulnerability management.

Ok, so that depends on c911326 getting merged (seems close?)

The patch works but I'm not sure there is agreement on (3) being the right way.

Regarding an audit trail, I think I (and Security-Team) would mostly be concerned about dependencies being discoverable in more standard, machine-readable ways via package and lock files, even if such files weren't used as part of an official commit or build step, but merely matched the current dependencies in-use.

That was always the case (rEGRAed178f241512: Add Vega5 with Vega 2 to 5 compatibility layer with tests did add the library to package.json, it just did a small modification on it), but if you are partial to foreign-resources.yaml we could just add a new documentation-only type to it if we go with (1).

I'm not sure any of these provide a better security advantage than the others, which would be my primary concern. It seems like (3) uses the least amount of new dependencies and new code, which is always a good thing IMO. But the others really only seem to introduce an additional rollup build step, which is still quite a bit better than the current state and we generally trust rollup over webpack et al. It might make sense to security-review rollup a bit more in-depth, but I don't think that should really block this work.

Change 910593 merged by jenkins-bot:

[mediawiki/extensions/Graph@master] Manage foreign resources via maintenance script

https://gerrit.wikimedia.org/r/910593

Per the docs,

Using Vega and D3 together. The full vega.js and vega.min.js files bundle up all dependencies, including d3 modules and topojson-client. If you plan to independently use d3.js on your page, you can use a smaller Vega bundle that excludes redundant d3 files. Import d3 first, then import the smaller vega-core.min.js file to reduce the total file size. If you plan to load TopoJSON data files, you’ll need to import the topojson-client package as well.

Change 912797 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/extensions/Graph@master] Update Vega to 5.25

https://gerrit.wikimedia.org/r/912797

Fortunately, the upstream issue was fixed and 5.25 released a few hours ago, so we don't need a build step (for now, at least).

Change 912781 had a related patch set uploaded (by Jdlrobson; author: Gergő Tisza):

[mediawiki/extensions/Graph@master] Remove d3, already included in Vega

https://gerrit.wikimedia.org/r/912781

Change 912781 merged by jenkins-bot:

[mediawiki/extensions/Graph@master] Remove d3, already included in Vega

https://gerrit.wikimedia.org/r/912781

Change 912797 merged by jenkins-bot:

[mediawiki/extensions/Graph@master] Update Vega to 5.25

https://gerrit.wikimedia.org/r/912797

Thanks @Tgr! Yes this looks done to me. T335519 tracks the remaining follow up work.