Page MenuHomePhabricator

Security Review For Wikipedia Previews
Closed, ResolvedPublicSecurity

Description

Project Information

Description of the tool/project

It's a javascript library that 3rd parties will use to display Wikipedia articles previews on their own platforms.

Description of how the tool will be used at WMF

Won't be used at WMF directly but it uses this API: https://<lang>.wikipedia.org/api/rest_v1/page/summary/<article title>

Dependencies

Run time: https://github.com/developit/unfetch
Build time: webpack, babel, mocha, etc (see package.json)

Has this project been reviewed before?

Never reviewed

Working test environment

A live demo is available at: https://wikimedia.github.io/wikipedia-previews/

To set it up locally:

git clone git@github.com:wikimedia/wikipedia-previews.git
cd wikipedia-previews
npm install
npm run dev (the local url will be printed in the console)

Post-deployment

Inuka-Team
@SBisson

Event Timeline

Hi @SBisson - can you please let us know what your target deployment date is?

Cheers,

Jennifer

Hi @SBisson - can you please let us know what your target deployment date is?

Cheers,

Jennifer

There is no deployment per se since this is a self-hosted (by 3rd parties) javascript library. I don't know if we have an official launch timeframe yet. Maybe @AMuigai knows.

Hey @SBisson - it would definitely be helpful to get some idea of a ballpark launch date for this code to help us better schedule the review. If the answer to that is indeterminate, we'll most likely mark this low priority. I mean, the code is already live on github with at least an implicit Wikimedia/WMF endorsement, but I have no idea what the plan is to publicize and promote this code. Are there initial websites/organizations we're targeting for usage of this library?

@sbassett getting initial feedback in the next few weeks would be ideal for us. We have one partner organization that just started looking at it internally as a rough prototype. If they come back and they want to use it in production, we would like to be able to give them the go ahead without too much delays.

sbassett triaged this task as Medium priority.
sbassett moved this task from Incoming to In Progress on the deprecated-security-team-reviews board.
sbassett added a project: user-sbassett.
sbassett moved this task from Backlog to In Progress on the user-sbassett board.

@SBisson - Apologies for the delay on this, I should have a review completed and posted on-task by tomorrow (2019-01-23) EOD at the latest. Thanks for your patience.

sbassett raised the priority of this task from Medium to Needs Triage.EditedJan 24 2020, 10:59 PM
sbassett set Security to Software security bug.
sbassett changed the visibility from "Public (No Login Required)" to "Custom Policy".
sbassett changed the subtype of this task from "Task" to "Security Issue".

A few minor issues for a "live" Wikimedia library. Once items are acked/resolved, this should be able to be made public.

sbassett triaged this task as Medium priority.EditedJan 24 2020, 10:59 PM

Security Review Summary - T240010 - 2020-01-24
Overall, this looks fine to me, especially since it lives outside of Wikimedia production (or what is typically considered such) and is goverened by the "AS IS" terms of its license. I did find some vulnerable/outdated (dev) packages and a couple of minor issues/points of discussion below.

Vulnerable Packages
As reported by npm audit and snyk test:

  1. serialize-javascript@1.9.1, Cross-site Scripting [XSS] (high risk)
  2. kind-of@6.0.2, Information Disclosure (low risk)

n.b. unfetch, the only non-dev dependency, looked pretty healthy according to its issues and security trackers.

Outdated Packages
As reported via npm outadated:

No explicit vulnerabilities reported (no risk), simply noting for completeness' sake.

PackageCurrentWantedLatest
jsdom15.2.115.2.116.0.1
mocha6.2.26.2.27.0.0

General Security Issues

  1. The only real concern here - which I'd probably rate as low risk for now - is that the code assumes anything pulled from Wikipedia's page summary endpoint is secure. While that should be a fair assumption, it's fine (and encouraged by the Security-Team) to add additional sanity checks (perhaps in renderPreview or just prior to its return value being written to the DOM) for unexpected/nefarious html or javascript.

Policy Compliance

  1. Has WMF-Legal reviewed this? I'm not really sure what their policy is for external-facing libraries like this. I'd assume the included license is fine and that any connections to the page summary endpoint are governened by Wikimedia's PP.

Nits

  1. Just noting that index.html and README.md promote the use of external (non-bundled, non-wmf) resources (e.g. https://cdn.jsdelivr.net/npm/promise-polyfill@8/dist/polyfill.min.js) which is obviously fine in the context of this library.
  2. Just noting that i18n.js appears to only support en, fr and es for now. I assume there are plans to add other languages or just generalize on the en, though this is at least slightly contradictory with the "Works for LTR languages" claim stated in README.md.
sbassett moved this task from Backlog / Other to Operational issues on the acl*security board.
sbassett moved this task from In Progress to Waiting on the Application Security Reviews board.
sbassett moved this task from In Progress to Waiting on the user-sbassett board.

Thanks @sbassett for this thorough review.

Security Review Summary - T240010 - 2020-01-24
Overall, this looks fine to me, especially since it lives outside of Wikimedia production (or what is typically considered such) and is goverened by the "AS IS" terms of its license. I did find some vulnerable/outdated (dev) packages and a couple of minor issues/points of discussion below.

Vulnerable Packages
As reported by npm audit and snyk test:

  1. serialize-javascript@1.9.1, Cross-site Scripting [XSS] (high risk)

I've updated serialize-javascript in this commit.

  1. kind-of@6.0.2, Information Disclosure (low risk)

I've updated a bunch of dependencies in this commit but it looks like version 6.0.2 is still present. Not sure how to force it to up to version 6.0.3.

n.b. unfetch, the only non-dev dependency, looked pretty healthy according to its issues and security trackers.

I used unfetch as a shortcut to support IE but I will replace it with good old fashion XMLHTTPRequest so we don't have to worry about this in the future and it's going to bring the size of the bundle down a little. T243774: Replace unfetch with XMLHTTPRequest I expect this task to be done this quarter.

Outdated Packages
As reported via npm outadated:

No explicit vulnerabilities reported (no risk), simply noting for completeness' sake.

PackageCurrentWantedLatest
jsdom15.2.115.2.116.0.1
mocha6.2.26.2.27.0.0

I've updated those packages in this commit.

General Security Issues

  1. The only real concern here - which I'd probably rate as low risk for now - is that the code assumes anything pulled from Wikipedia's page summary endpoint is secure. While that should be a fair assumption, it's fine (and encouraged by the Security-Team) to add additional sanity checks (perhaps in renderPreview or just prior to its return value being written to the DOM) for unexpected/nefarious html or javascript.

I'm open to adding some checks here but I would need some advice to make sure it's actually useful. I have filled T243778: Sanitize API response used in Wikipedia Previews HTML to discuss and track the issue.

Policy Compliance

  1. Has WMF-Legal reviewed this? I'm not really sure what their policy is for external-facing libraries like this. I'd assume the included license is fine and that any connections to the page summary endpoint are governened by Wikimedia's PP.

I will let @AMuigai respond here.

Nits

  1. Just noting that index.html and README.md promote the use of external (non-bundled, non-wmf) resources (e.g. https://cdn.jsdelivr.net/npm/promise-polyfill@8/dist/polyfill.min.js) which is obviously fine in the context of this library.

This will go away with T243774: Replace unfetch with XMLHTTPRequest as well.

  1. Just noting that i18n.js appears to only support en, fr and es for now. I assume there are plans to add other languages or just generalize on the en, though this is a least slightly contradictory with the "Works for LTR languages" claim stated in README.md.

This is more about functionality than security. We'll add other keys in i18n.js as needed. It works to a degree for any LTR languages, as seen by the example in Chinese, but the word "Wikipedia" might not be translated.

@SBisson @AMuigai - some follow-up here:

I've updated serialize-javascript in this commit.

Looks good, thanks!

I've updated a bunch of dependencies in this commit but it looks like version 6.0.2 is still present. Not sure how to force it to up to version 6.0.3.

Interesting. The new versions of webpack look good to me, so I'm not sure why this would be happening. Caching issue? Did you have any luck with it updating properly with a completely fresh npm install?

I used unfetch as a shortcut to support IE but I will replace it with good old fashion XMLHTTPRequest so we don't have to worry about this in the future and it's going to bring the size of the bundle down a little. T243774: Replace unfetch with XMLHTTPRequest I expect this task to be done this quarter.

Ok, good to know. Just to clarify, unfetch seems fine from a security standpoint, but going with a more pure JS implementation would most likely reduce the attack surface a bit, as long as the code follows security best practices (which I'm sure it will).

I've updated those packages in this commit.

Looks good, thanks!

I'm open to adding some checks here but I would need some advice to make sure it's actually useful. I have filled T243778: Sanitize API response used in Wikipedia Previews HTML to discuss and track the issue.

Thanks. Just to clarify (as I commented on the new task), this should be considered low risk (and most likely low priority) given our trust of parser output. It's a security best practice to sanitize data as near to its point of use/output as possible (in this case, when html/JS is rendered or added to the DOM) and it doesn't hurt to add some additional (even if seemingly redundant) hardening in case there is ever an issue with parser output, a different API is ever used, a nefarious user figures out an injection with the page previews library, etc.

I will let @AMuigai respond here.

Ok, I assume everything is fine.

This will go away with T243774: Replace unfetch with XMLHTTPRequest as well.

Great!

This is more about functionality than security. We'll add other keys in i18n.js as needed. It works to a degree for any LTR languages, as seen by the example in Chinese, but the word "Wikipedia" might not be translated.

Ok, thanks.

Hey @SBisson -

Just wanted to follow up on this and see if you'd had any luck:

I've updated a bunch of dependencies in this commit but it looks like version 6.0.2 is still present. Not sure how to force it to up to version 6.0.3.

Interesting. The new versions of webpack look good to me, so I'm not sure why this would be happening. Caching issue? Did you have any luck with it updating properly with a completely fresh npm install?

Otherwise, with T243778 filed, I think this task can probably be resolved. Though I'll plan to keep it private until some solution for T243778 is implemented and deployed.

Adding @hueitan while Stephane is on leave.

@sbassett I'll cross check with legal, thanks for bringing this up.

sbassett changed the task status from Open to Stalled.Feb 25 2020, 6:22 PM
sbassett closed this task as Resolved.EditedApr 15 2020, 3:22 AM

Ping @AMuigai and @SBisson -

Any updates on the comments from T240010#5865646? I think I'll resolve this task, but leave it private for now for the reason noted within that comment.

sbassett changed the visibility from "Custom Policy" to "Public (No Login Required)".Apr 2 2021, 6:45 PM
sbassett changed the edit policy from "Custom Policy" to "All Users".