Security review of mediawiki-services-chromium-render
Closed, Resolved · Public

Description

Project Information

Name of tool/project: mediawiki-services-chromium-render
Project home page: https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/services/chromium-render
Name of team requesting review: Readers Web
Primary contact: @phuedx
Target date for deployment: Q2 FY2017-2018
Link to code repository / patchset: https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/services/chromium-render
Programming Language(s) Used: JavaScript (targeting Node.js)

Description of the tool/project / Description of how the tool will be used at WMF

Services maintain the Electron-based PDF rendering service. Under its current load, the service hangs regularly after consuming a large amount of memory and can also fail to restart gracefully (see T174916, T159922, and T172815 for additional context and discussion).

Electron, a Node.js-based desktop application development platform, is itself built on Chromium. By driving headless Chromium directly, rather than via a high-level binding to it, we believe that we can make the service simpler and easier to maintain.

We (Readers Web and, eventually, Readers Infra) aim to build a POC replacement for the Electron-based render service, using puppeteer to programmatically control a firejailed headless Chromium process for rendering PDFs. We intend to slave the replacement service to the existing service in order to determine whether it's a suitable replacement.

Dependencies

GoogleChrome/puppeteer
- GoogleChrome/puppeteer is a high-level JavaScript (targeting Node.js) binding to the Chromium DevTools protocol. It allows a developer to programmatically control headless (or not!) Chromium.
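
As a rough illustration of what driving Chromium through puppeteer looks like, the sketch below renders one page to a PDF buffer. It is not the service's actual code: the helper name `renderPdf`, the `'networkidle2'` wait condition, and the way the `puppeteer` module is passed in as an argument are all assumptions made for the example.

```javascript
'use strict';

// Hypothetical helper, not taken from the chromium-render repository.
// `puppeteer` is injected so that the caller decides which build to use.
async function renderPdf(puppeteer, articleUrl, format) {
  // Launch a headless Chromium process controlled over the DevTools protocol.
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    // Wait until network activity has mostly settled before printing.
    await page.goto(articleUrl, { waitUntil: 'networkidle2' });
    // Resolves with the PDF bytes; `format` is a paper size such as 'Letter'.
    return await page.pdf({ format });
  } finally {
    // Always shut the browser down, even if navigation or printing failed.
    await browser.close();
  }
}
```

With puppeteer installed, a call such as `renderPdf(require('puppeteer'), 'https://en.wikipedia.org/wiki/Berlin', 'Letter')` would be expected to resolve with the PDF bytes; the article URL here is only an example.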

Has this project been reviewed before?

No.

Note well that Services, who are currently responsible for the Electron-based render service, will also be providing concept review for the project.

Working test environment

http://chromium-pdf.wmflabs.org/

Example URLs

http://chromium-pdf.wmflabs.org/en.wikipedia.org/v1/pdf/Berlin/Letter

Post-deployment

Readers Web will be responsible for the service immediately after its deployment and while it's evaluated. If, after evaluation, the headless Chromium based renderer supersedes the current Electron-based renderer, then Readers Infrastructure will take over responsibility.

Contacts

Team	Contact
Readers Web	@phuedx
Readers Infra	@Jhernandez

Related Objects
Status	Assigned	Task
Resolved	ovasileva	T181079 [GOAL] Provide an expanded reading experience by improving the ways that users can download articles of interest for later consumption
Resolved	None	T181084 [EPIC] Deploy the mediawiki-services-chromium-render service (Proton)
Resolved	Bawolff	T177765 Security review of mediawiki-services-chromium-render

Event Timeline

phuedx created this task. Oct 9 2017, 1:13 PM

Restricted Application added a subscriber: Aklapper. Oct 9 2017, 1:13 PM

phuedx added a parent task: T176627: Trial replacing Electron with headless Chromium in the render service. Oct 9 2017, 1:13 PM

phuedx updated the task description. Oct 9 2017, 4:46 PM

phuedx updated the task description.

I'm marking this as stalled until we (Reading Web) are ready to proceed with the review.

bmansurov renamed this task from Security review of mediawiki-services-headless-chromium to Security review of mediawiki-services-chromium-render. Oct 10 2017, 2:37 PM

bmansurov updated the task description.

phuedx updated the task description. Oct 10 2017, 3:09 PM

phuedx updated the task description.

Per T176627#3673615.

@phuedx, do you mind updating the description to note why Electron needs to be replaced and what problems have been observed? Thanks!

mobrovac added a project: Services (watching). Oct 12 2017, 4:17 PM

phuedx claimed this task. Oct 13 2017, 12:05 PM

@dpatrick: Thanks for the ping! I've added links to the Services team's tickets tracking having to restart the existing service after it's hung. I've also added a little more reasoning to the description.

phuedx removed phuedx as the assignee of this task. Oct 19 2017, 4:58 PM

phuedx mentioned this in T178077: Security review of Beautiful Soup. Oct 31 2017, 4:37 PM

dpatrick mentioned this in E767: Security review of mediawiki-services-chromium-render. Oct 31 2017, 7:19 PM

dpatrick moved this task from Incoming to Scheduled on the deprecated-security-team-reviews board.

dpatrick mentioned this in T173014: Security review of pdfrw. Oct 31 2017, 8:14 PM

phuedx added a parent task: T181084: [EPIC] Deploy the mediawiki-services-chromium-render service (Proton). Nov 22 2017, 5:31 AM

phuedx removed a parent task: T176627: Trial replacing Electron with headless Chromium in the render service. Nov 22 2017, 8:24 AM

dpatrick moved this task from Scheduled to In Progress on the deprecated-security-team-reviews board. Dec 13 2017, 5:34 PM

phuedx updated the task description. Jan 12 2018, 10:02 AM

AFAICT this review has stalled. If so, then should I move this back to Scheduled (hopefully somewhere near the top :] )? /cc @Bawolff

@phuedx @Bawolff is there any progress on this? Has the security review been scheduled? Let me know if I can help/assist somehow.

Hi,

Sorry, I think there was a little mix-up here due to Darian's departure. It was marked as next up for Darian to do, but when he left we forgot to put it back on the list for someone else, and he didn't notice that this was left in limbo.

I'm on vacation right now; please send an email to @JBennett to sort out scheduling for this task.

phuedx mentioned this in T186748: [EPIC] New service request: chromium-render/deploy. Aug 14 2018, 9:16 AM

phuedx mentioned this in T181623: Chromium-render doesn't handle browser connection abort well. Aug 14 2018, 11:07 AM

Bawolff claimed this task. Aug 27 2018, 2:14 PM

@Bawolff: How's this going? Sorry if I've missed an update elsewhere.

phuedx updated the task description. Sep 10 2018, 5:18 PM

phuedx added a subscriber: Jhernandez.

In T177765#4571856, @phuedx wrote:

@Bawolff: How's this going? Sorry if I've missed an update elsewhere.

Sorry for the delay; last week had some urgent stuff with the fixcopyright campaign. I am definitely working on it this week.

pmiazga subscribed. Sep 11 2018, 2:40 PM

ovasileva subscribed. Sep 11 2018, 5:15 PM

Bawolff moved this task from Incoming to In Progress on the deprecated-security-team-reviews board. Sep 11 2018, 7:23 PM

Approved, looks good.

Thanks again for your patience on this.

Reading Infrastructure is about to take over the service, and one thing I'd like to get a clearer picture on (asking here as I'm sure this came up during the security review) is what level of network isolation the service is working at. Let's say an attacker can put malicious content into the wiki page, and Chromium executes that while rendering the page, and that causes it to send a bunch of requests. Will those requests be routed through Varnish etc. as if they came from the internet? What happens with e.g. .wmnet URLs?

Tgr mentioned this in T210652: Handoff Proton service to Reading Infrastructure. Dec 13 2018, 9:55 PM

Jhernandez reopened this task as Open. Dec 17 2018, 7:25 PM

mobrovac added a project: Platform Team Legacy (Watching / External). Dec 20 2018, 12:01 PM

(forgot to ping @Bawolff when asking that)

Mholloway subscribed. Jan 9 2019, 5:10 PM

In T177765#4822198, @Tgr wrote:

Reading Infrastructure is about to take over the service, and one thing I'd like to get a clearer picture on (asking here as I'm sure this came up during the security review) is what level of network isolation the service is working at. Let's say an attacker can put malicious content into the wiki page, and Chromium executes that while rendering the page, and that causes it to send a bunch of requests. Will those requests be routed through Varnish etc. as if they came from the internet? What happens with e.g. .wmnet URLs?

This was a little while ago and I don't remember the review in much detail, but I took another look at the service, and I don't see anything that would stop it from fetching internal resources such as .wmnet URLs.

The main good thing on this front is that JavaScript is not enabled by the service, AFAICT. I think the service should probably be using some sort of proxy server that filters out requests to inappropriate URLs (both internal and external). It should probably also set some sort of CSP header on the pages. (That's coming to MediaWiki in general, but it'd be better if the service explicitly set its own, which could be even more restrictive than MW's and would not rely on the external side always setting one properly.)
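
To make the URL-filtering idea concrete, here is a minimal sketch of an allowlist check of the kind such a proxy (or the service itself) could apply before letting a request through. Everything in it is an assumption for illustration: the function name `isAllowedUrl`, the HTTPS-only rule, and the two host suffixes are hypothetical and not taken from the service's config.

```javascript
'use strict';
const { URL } = require('url');

// Hypothetical allowlist; a real deployment would load this from config.
const ALLOWED_HOST_SUFFIXES = ['.wikipedia.org', '.wikimedia.org'];

// Returns true only for HTTPS URLs whose host is on the allowlist.
// Anything unparsable, non-HTTPS, internal (.wmnet), or IP-literal
// fails the suffix match and is rejected.
function isAllowedUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch (e) {
    return false; // not a parsable absolute URL
  }
  if (url.protocol !== 'https:') {
    return false; // rejects http:, file:, data:, and so on
  }
  const host = url.hostname.toLowerCase();
  return ALLOWED_HOST_SUFFIXES.some(
    // Match the bare domain or any subdomain, but not look-alikes
    // such as "evilwikipedia.org".
    (suffix) => host === suffix.slice(1) || host.endsWith(suffix)
  );
}
```

Inside the service, one place a check like this could be applied is puppeteer's request interception (`page.setRequestInterception(true)` with a `'request'` handler that calls `request.abort()` or `request.continue()`). A filtering proxy at the network level would still be the stronger control, since a string check on the URL says nothing about where the name actually resolves.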

Additionally, in the config I noticed the following:

puppeteer_options:
  ignoreHTTPSErrors: true
  timeout: 30000
  headless: true
  executablePath: /usr/bin/chromium
  args:
    - '--no-sandbox'
    - '--disable-setuid-sandbox'
    - '--font-rendering-hinting=medium'
    - '--enable-font-antialiasing'
    - '--hide-scrollbars'
    - '--disable-gpu'
    - '--no-first-run'

The --disable-setuid-sandbox, --no-sandbox, and ignoreHTTPSErrors: true settings don't exactly sound reassuring at first glance. I'm not that familiar with Chrome sandboxing, so maybe there are legitimate reasons to do this, but if so the config should have some explanatory comments.
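
One lightweight way to keep settings like these from going unexplained would be a small check that lists them, so each one can be matched against a comment or a ticket. This is a minimal sketch, assuming the options have already been parsed into a plain object; the function name `findSettingsNeedingJustification` and the choice of which settings to flag are hypothetical.

```javascript
'use strict';

// Hypothetical list of Chromium flags a reviewer would want justified.
const FLAGGED_ARGS = ['--no-sandbox', '--disable-setuid-sandbox'];

// Returns the security-relevant settings found in a puppeteer options
// object, in the order they are checked.
function findSettingsNeedingJustification(options) {
  const findings = [];
  if (options.ignoreHTTPSErrors === true) {
    findings.push('ignoreHTTPSErrors: true');
  }
  for (const arg of options.args || []) {
    if (FLAGGED_ARGS.includes(arg)) {
      findings.push(arg);
    }
  }
  return findings;
}

// The options quoted above, as a plain object.
const puppeteerOptions = {
  ignoreHTTPSErrors: true,
  timeout: 30000,
  headless: true,
  executablePath: '/usr/bin/chromium',
  args: [
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--font-rendering-hinting=medium',
    '--enable-font-antialiasing',
    '--hide-scrollbars',
    '--disable-gpu',
    '--no-first-run'
  ]
};

console.log(findSettingsNeedingJustification(puppeteerOptions));
```

Run against the quoted config, this reports the same three settings called out above.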

Tgr mentioned this in T213362: Limit what URLs Proton can access. Jan 10 2019, 1:15 AM