Page MenuHomePhabricator

Application Security Review Request : Chart extension and chart-renderer service
Closed, ResolvedPublic

Description

Project Information

Description of the tool/project:
MediaWiki extension that allows editors to create charts and embed them in wiki pages, replacing the legacy Graph extension. Charts are rendered on the server by making an HTTP request to a private HTTP service based on Express and service-runner, which uses Apache eCharts to render charts as SVGs. Initial deployment will likely not involve client-side enhancement of charts, but this may be added later.

Note that the extension and service are still under development, in that we're still adding more chart types and features. But the structure of things is not expected to change much (except for T370378), so we think it's ready for a security review at this stage.

Description of how the tool will be used at WMF:
Will be deployed on all wikis (except loginwiki and fishbowl wikis). Users will use this to put charts in wiki pages. Chart definitions and data will be hosted on Commons, but charts can be embedded on pages on any wiki.

Dependencies
Chart extension:

  • The JsonConfig extension (already deployed in production)
  • An optional dependency on the GlobalUsage extension may be introduced by T370378, depending on how usage tracking is implemented

chart-renderer service:

  • Apache eCharts
  • The svgo NPM package, for SVG manipulation
  • Standard NPM packages for running an HTTP service: service-runner, express, body-parser, compression
    • We're aware that @tchin/service-utils is intended to replace service-runner, but we were told it wasn't stable enough to use yet. We're happy to migrate to this whenever is appropriate.
  • The commander NPM package, to allow the chart renderer to be invoked as a CLI script instead of an HTTP service, mainly intended for local development and for 3rd-party wiki support. This would not be used in production (but the NPM dependency would still be there). If required, we could explore restructuring the CLI wrapper such that the HTTP service can be built without the commander dependency.

Has this project been reviewed before?
No

Working test environment
The extension and the service are deployed in the beta cluster.

Post-deployment
The Charts task force is responsible for this extension and is actively developing it. The main point of contact is Jon Robson (@Jdlrobson), who is filling in for Roan Kattouw (@Catrope)'s parental leave until late November / early December.

Details

Risk Rating
Low

Event Timeline

This is a placeholder / early request. The development of this extension is only just getting started, so this review is not yet actionable at this time, but we'd like to get in the queue early. We will update this task when we have a clearer idea of when we'd like to deploy to production.

Catrope updated the task description. (Show Details)
Aklapper changed the task status from Open to Stalled.Jul 13 2024, 8:56 AM
Catrope moved this task from Backlog to Tracking on the Charts board.
Catrope renamed this task from Application Security Review Request : Chart extension (placeholder) to Application Security Review Request : Chart extension and chart-renderer service.Aug 30 2024, 12:39 AM
Catrope changed the task status from Stalled to Open.
Catrope updated the task description. (Show Details)
Catrope updated the task description. (Show Details)

This request is now ready to be acted on

We are aiming to deploy the charts service to production by mid-October. @sbassett you triaged this for the upcoming quarter, so I'm wondering whether or not that sounds reasonable. We could also split up the review for the service from the review of the extension if that speeds things up. Our tentative plan is to enable the extension on a production test wiki by end of October.

Hey @CCiufo-WMF -

Ideally we'd want to review any and all relevant codebases for the testwiki deployment. So - any extensions, services, config-as-code, etc. And for Charts, it might also be nice to have some notes regarding any soft dependencies like JsonConfig, hosting various data files at Commons, etc. If all of this is ready for review by early October, we should be able to target a mid-October security-review completion date.

@sbassett thanks for the info! That sounds doable on our end. We'll prioritize making sure everything is well documented and in a ready-to-review state by the beginning of October.

sbassett changed the task status from Open to In Progress.Oct 7 2024, 6:17 PM
sbassett assigned this task to mmartorana.
sbassett triaged this task as High priority.

Just a note that we are not yet ready for the Charts service + extension to be reviewed, but hope to be at that point by the end of next week. (cc @Jdlrobson)

Just a note that we are not yet ready for the Charts service + extension to be reviewed, but hope to be at that point by the end of next week. (cc @Jdlrobson)

Hi, thank you for the update. I’ve started working on the review and will provide a deadline in the coming days.

Hi @CCiufo-WMF and team, I understand that your plan is to deploy soon, but after some evaluation, I plan to submit my review by mid-November.

Does that timeline work for you?

Hi @CCiufo-WMF and team, I understand that your plan is to deploy soon, but after some evaluation, I plan to submit my review by mid-November.

Does that timeline work for you?

Hey @mmartorana, thank you and yes we are hoping to get to production as soon as possible. We are targeting an early November deployment to testwiki and it seems unlikely to have the review completed by then. That deployment isn't necessarily blocked on the review, but the next milestone is to deploy to pilot wikis by mid-November is something we'd feel more confident in once the review is complete. That doesn't leave much time to implement any feedback we receive from the review. Is there anything we can do to help accelerate the timeline of the review?

That deployment isn't necessarily blocked on the review, but the next milestone is to deploy to pilot wikis by mid-November is something we'd feel more confident in once the review is complete.

Ideally we'd have a security review completed prior to any production deployment, including testwiki. If you need to deploy to testwiki or any production wikis prior to the completion of a security review, we can have you accept a medium risk within our risk register until the review is completed.

That doesn't leave much time to implement any feedback we receive from the review.

For in-house developed code, mostly by seasoned WMF engineers, I would doubt that we'll find many (or any) security issues, at least issues beyond a low risk. But we may find a few things. Again, managers and directors can accept a medium risk if they need to deploy code prior to the completion of any security review.

Is there anything we can do to help accelerate the timeline of the review?

Likely no, except to ensure that all relevant code repositories are in stable, production-ready states. We try to accommodate project milestones for most teams as best we can, but we are not resourced to operate a fully on-demand security-review service.

Hi @sbassett - Thanks a lot for this helpful comment and context!
As the director working with this team, I am comfortable accepting a medium risk and then deploying to testwiki prior to the completion of the scan, given:

Let me know if there are additional steps I should take to move this forward / get us on the risk register. Thanks again!

@NBaca-WMF - ok, sounds good. I'll document you as accepting the medium risk for a pre-security-review deployment.

  • We are currently running a static security scan via Phan

That's great, though we do a lot more than just run Phan against codebases for our security reviews. And Phan would only concern the supporting MediaWiki extension and not the nodejs microservice.

  • The relatively limited risks mentioned for testwiki in particular, and

I wouldn't say there are limited risks for testwiki. It's a proper production wiki under the wikipedia.org domain. While enabling an extension on one production project is very likely lower-risk than enabling it across many or all production projects, there are still numerous attacks which could be launched via a single production wiki.

Hi everyone, I wanted to share an update to inform @acooper and the security team that this extension will undergo some architectural changes in the coming weeks.

As a result, I’m currently focusing on other areas of the extension while waiting the final changes to be deployed before I can proceed with my final review. Consequently, the timeline for my review will likely be adjusted due to these deployments.

My understanding is there is potentially a move being proposed from server-side to client-side rendered graphs.

That is going to be a significant change to the scope of the security review. It now brings in to scope cross-site scripting and other client-side security issues, which would not have previously been in scope.

It's also difficult to plan and conduct a security review while the system is under significant change.

I'd propose we have a meeting to discuss and plan next steps and target date for security review @Jdlrobson @CCiufo-WMF @NBaca-WMF

My understanding is there is potentially a move being proposed from server-side to client-side rendered graphs.

That is going to be a significant change to the scope of the security review. It now brings in to scope cross-site scripting and other client-side security issues, which would not have previously been in scope.

It's also difficult to plan and conduct a security review while the system is under significant change.

I'd propose we have a meeting to discuss and plan next steps and target date for security review @Jdlrobson @CCiufo-WMF @NBaca-WMF

I'll find a time for us to meet. To clarify: the proposal is to enhance ("hydrate") the server-rendered charts with client-rendered ones. So, this would be an additive change and the architecture of the current service mostly still applies.

My understanding is there is potentially a move being proposed from server-side to client-side rendered graphs.

That is going to be a significant change to the scope of the security review. It now brings in to scope cross-site scripting and other client-side security issues, which would not have previously been in scope.

It's also difficult to plan and conduct a security review while the system is under significant change.

I'd propose we have a meeting to discuss and plan next steps and target date for security review @Jdlrobson @CCiufo-WMF @NBaca-WMF

I'll find a time for us to meet. To clarify: the proposal is to enhance ("hydrate") the server-rendered charts with client-rendered ones. So, this would be an additive change and the architecture of the current service mostly still applies.

To expand on this the change in scope would be:

  • Loading echarts on the client but not publicly exposing it to editors or gadgets.
  • Rendering echarts using trusted specifications generated by the service.
  • Only rendering echarts that are present in the initial page HTML (and thus trusted).

Hi @CCiufo-WMF, @NBaca-WMF and team - following our meeting, I will remove this extension from our risk register since you plan to wait for our review before proceeding with deployment.

Thank you!

Hi @CCiufo-WMF, @NBaca-WMF and team - following our meeting, I will remove this extension from our risk register since you plan to wait for our review before proceeding with deployment.

Thank you!

Oh, just to clarify, we are still planning to deploy to test wiki before the review. We just won't deploy to any real wikis.

Oh, just to clarify, we are still planning to deploy to test wiki before the review. We just won't deploy to any real wikis.

test and test2 are real wikis in the sense that they live in Wikimedia production, very similarly to every other project. So the exception would need to stay in our risk register as at least a medium risk.

Also in the sense that testwiki is in $wgCrossSiteAJAXdomains. You should assume that if you compromise an account on testwiki (e.g. via an XSS) it compromises it on all wikis

Oh, just to clarify, we are still planning to deploy to test wiki before the review. We just won't deploy to any real wikis.

test and test2 are real wikis in the sense that they live in Wikimedia production, very similarly to every other project. So the exception would need to stay in our risk register as at least a medium risk.

Yes exactly, I am recommending we keep it in the register.

@sbassett @mmartorana following our conversation earlier in the week, we've added the components for client side hydration (with feature flag) so the codebase is now in a stable place and ready for your inspection and advice.

@sbassett @mmartorana following our conversation earlier in the week, we've added the components for client side hydration (with feature flag) so the codebase is now in a stable place and ready for your inspection and advice.

Great, thanks for the update.

I’ll continue with my review.

Hey @mmartorana, checking in on how this is progressing. We deployed Charts to test-commons and testwiki last week, and are preparing to deploy to test2wiki and production Commons as early as next week (for testing purposes only). Pilot wiki deployment would follow soon after but would be awaiting the results from this review.

Hey @mmartorana, checking in on how this is progressing. We deployed Charts to test-commons and testwiki last week, and are preparing to deploy to test2wiki and production Commons as early as next week (for testing purposes only). Pilot wiki deployment would follow soon after but would be awaiting the results from this review.

Hey @CCiufo-WMF, I’m planning to post my review by the end of next week.

Thanks for sharing your deployment plan with me.

Hey @mmartorana, checking in on how this is progressing. We deployed Charts to test-commons and testwiki last week, and are preparing to deploy to test2wiki and production Commons as early as next week (for testing purposes only). Pilot wiki deployment would follow soon after but would be awaiting the results from this review.

Hey @CCiufo-WMF, I’m planning to post my review by the end of next week.

Thanks for sharing your deployment plan with me.

Okay that would be great, thank you!

Hi @CCiufo-WMF and @Jdlrobson - do you plan to address T378305 before deployment?

Hi @CCiufo-WMF and @Jdlrobson - do you plan to address T378305 before deployment?

We are planning to address it, but probably not before deployment.

@mmartorana I know you mentioned that the review is due at the end of this week but I wanted to explicitly ask the following:

  • is there anything you are aware of that you think is a complete blocker to our progress.
  • if no complete blockers do you would be happy for us to proceed with testing things further by deploying to Commons and Test2Wiki tomorrow (November 21), as noted above?

Hi @Seddon - I’m wrapping up my review and haven’t found any blockers so far.

Feel free to proceed with your schedule.

Security Review Summary - T369950 - 2024-11-22
Last commit reviewed: 3079fd7

Summary

Overall, the chart extension is in good shape. The SAST and SCA results are excellent, core dependencies like SVGO and ECharts appear secure, and manual testing has not produced any significant results.

However, the json-config extension hasn’t been reviewed in some time (T163827, 5 years ago). I recommend conducting an updated review, as it plays a critical role in input validation.

A strong recommendation is to implement an input sanitization layer before data processing / validation in the chart-renderer service, ensuring that the data is not only in valid JSON format but also protected against injection attacks. Also, improving output escaping on the frontend will help ensure all dynamic data is safely managed on that layer too.

Implementing these improvements will significantly enhance the extension’s security.

The Security-Team is available to assist in incorporating these measures into your data handling pipeline.

For now, I’m assigning a low risk score.

Vulnerable Packages - Production
none

Vulnerable Packages - Development
none

Outdated Packages
none

Static Analysis Findings

semgrep:

                                    
src/ChartRenderer.php
❯❱ php.lang.security.unlink-use.unlink-use
      Using user input when deleting files with `unlink()` is potentially dangerous. A malicious actor
      could use this to modify or access files they have no right to.                                 
      Details: https://sg.run/rYeR                                                                    
                                                                                                      
      140┆ unlink( $dataPath );

sast-scan returned no results.
horusec returned no results.
snyk returned no results.
bearer returned no results.

General code health score
Risk:.

  1. The Wikimedia code health check tool returned a weighted risk score of 27.20:
+-----------+----------+----------+------+---------------+---------------+--------------+-------------+------------+--------------+-----------+---------------+
| Vuln Pkgs | Pkg Mgmt | Test Cov | SAST | Non-auto Cmts | Uniq Contribs | Contrib Conc | Lang Guides | Staff Supp | Task Backlog | Code Stew | Weighted Risk |
+===========+==========+==========+======+===============+===============+==============+=============+============+==============+===========+===============+
|         0 |        0 |        9 |    0 |             5 |            10 |            2 |           7 |         10 |            1 |         0 |         27.20 |
+-----------+----------+----------+------+---------------+---------------+--------------+-------------+------------+--------------+-----------+---------------+

General Security Issues (XSS, S/CSRF, SQLi, Ci, Cjack, etc.)
Risk: low

  1. T378305

Miscellanous Issues/Points of Discussion/Nits

  1. The Security-Team encourages the usage of service-utils as outlined in T362774#10207590, rather than service-runner.
  2. Currently lacking meaningful tests.

General Security Information

Statistic/InfoValueRisk
Repositorysvgo none
Relevant tag/branchmaster none
Last commit reviewed (if relevant) none
Recent contributions to code (6 months)35 low
Active developers with > 10 commits8 low
Current overall usage15.8M low
Current open security issues0 none
Methods for reporting security issuessecurity policy low

General Security Information

Statistic/InfoValueRisk
Repositoryecharts none
Relevant tag/branchmaster none
Last commit reviewed (if relevant) none
Recent contributions to code (6 months)77 low
Active developers with > 10 commits26 low
Current overall usage283k low
Current open security issues0 none
Methods for reporting security issuesASF reporting policy low

Thanks @mmartorana for the assessment ! Happy to hear this is low risk.

However, the json-config extension hasn’t been reviewed in some time (T163827, 5 years ago). I recommend conducting an updated review, as it plays a critical role in input validation.

That does seem like a good idea. I've opened T380653.

A strong recommendation is to implement an input sanitization layer before data processing / validation in the chart-renderer service, ensuring that the data is not only in valid JSON format but also protected against injection attacks.

I've opened T380652 for that one.

Also, improving output escaping on the frontend will help ensure all dynamic data is safely managed on that layer too."

Could you elaborate on which part of the frontend would benefit from output escaping so I could capture this correctly in a task?

On the frontend we only render chart elements with a data-mw attribute (which is not possible to create via wikitext) https://github.com/wikimedia/mediawiki-extensions-Chart/blob/master/resources/ext.chart/bootstrap.js#L5

The JSON that we render is encoded on the server and decoded on the client: https://github.com/wikimedia/mediawiki-extensions-Chart/blob/master/resources/ext.chart/bootstrap.js#L21

Thanks @mmartorana for the assessment ! Happy to hear this is low risk.

However, the json-config extension hasn’t been reviewed in some time (T163827, 5 years ago). I recommend conducting an updated review, as it plays a critical role in input validation.

That does seem like a good idea. I've opened T380653.

A strong recommendation is to implement an input sanitization layer before data processing / validation in the chart-renderer service, ensuring that the data is not only in valid JSON format but also protected against injection attacks.

I've opened T380652 for that one.

Also, improving output escaping on the frontend will help ensure all dynamic data is safely managed on that layer too."

Could you elaborate on which part of the frontend would benefit from output escaping so I could capture this correctly in a task?

Hi @Jdlrobson - The goal is to implement a defense-in-depth strategy by adding multiple layers of protection (i.e sanitize early, encode late). This ensures that if one layer fails, others will help mitigate the risk.

Ideally, we should add an escaping layer right before the data is passed to the DOM for rendering. However, there are various approaches to achieving this, so we can discuss the best solution if needed.

Jdlrobson edited projects, added Charts (Sprint 11); removed Charts.
Jdlrobson moved this task from Incoming to Ready for Signoff on the Charts (Sprint 11) board.

Ideally, we should add an escaping layer right before the data is passed to the DOM for rendering. However, there are various approaches to achieving this, so we can discuss the best solution if needed.

Yeh would love to chat through this some more so I understand this correctly.

I suspect the line in question would be https://gitlab.wikimedia.org/repos/mediawiki/services/chart-renderer/-/blob/main/src/lib/render.ts#L51 which escapes and encodes the data-spec
and then reversed in https://github.com/wikimedia/mediawiki-extensions-Chart/blob/master/resources/ext.chart/render.js#L105 but I'm not sure how exactly we might tighten that.

i.e sanitize early,

Checked in with @aude today and T380652 should help us address that and is understood!

Ideally, we should add an escaping layer right before the data is passed to the DOM for rendering. However, there are various approaches to achieving this, so we can discuss the best solution if needed.

Yeh would love to chat through this some more so I understand this correctly.

I suspect the line in question would be https://gitlab.wikimedia.org/repos/mediawiki/services/chart-renderer/-/blob/main/src/lib/render.ts#L51 which escapes and encodes the data-spec
and then reversed in https://github.com/wikimedia/mediawiki-extensions-Chart/blob/master/resources/ext.chart/render.js#L105 but I'm not sure how exactly we might tighten that.

i.e sanitize early,

Checked in with @aude today and T380652 should help us address that and is understood!

There are several approaches to achieve this; I will reach out to discuss it with you.

I put something in your calendar for this week!