Service containment for nodejs-based services with firejail
Closed, Resolved, Public

Description

Evaluate https://l3net.wordpress.com/projects/firejail/ as a solution to sandbox potentially risky services in terms of performance tradeoff, implementation quality and flexibility.

Event Timeline

MoritzMuehlenhoff claimed this task.
MoritzMuehlenhoff raised the priority of this task from to Medium.
MoritzMuehlenhoff updated the task description.

I tried firejail with several applications and overall it works well. It's fairly straightforward to provide an isolated namespace for a risky service (with restricted capabilities and a seccomp filter for harmful syscalls). I tried example exploits for recent privilege escalations in fuse and the Ubuntu kernel, and both are no longer exploitable in a firejailed environment with the seccomp filter.
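
As a rough illustration of the setup described above, the following sketch assembles a firejail command line that drops all capabilities and enables the default seccomp filter. `--caps.drop=all`, `--seccomp` and `--profile=` are firejail options; the helper name `firejail_argv` and the service path are made up for this example, and the code only builds the argument list without starting anything.

```python
from typing import List, Optional


def firejail_argv(service_cmd: List[str], profile: Optional[str] = None) -> List[str]:
    """Prefix a service command with firejail containment options.

    Hypothetical helper: it only returns the argument list and does not
    execute firejail.
    """
    argv = ["firejail", "--caps.drop=all", "--seccomp"]
    if profile is not None:
        # Optional profile file with further restrictions.
        argv.append("--profile=" + profile)
    # firejail treats the first non-option argument as the program to run.
    return argv + list(service_cmd)


if __name__ == "__main__":
    # Illustrative path only, not an actual production location.
    print(" ".join(firejail_argv(["/usr/bin/nodejs", "/srv/example/server.js"])))
```

Because firejail is a plain command wrapper, the same prefix can in principle be put in front of any service start command, whatever init system launches it.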

firejail relies heavily on facilities in the Linux kernel, which is a sound design decision. There are still occasional security issues in the namespace implementations in the Linux kernel, but seccomp especially has proven to be solid, with no bypasses since 2009 (at least on x86; there was a MIPS-specific one last year).

To estimate the performance tradeoff I ran some benchmarks (Iceweasel with the Kraken benchmark, timed compilations, mplayer -benchmark); the performance loss is hardly measurable (3% at most).

firejail has unique features for desktop use. It makes it really simple to run untrusted binaries on your desktop.

For isolating services, systemd offers many options to achieve the same isolation through unit files (e.g. InaccessibleDirectories, ReadOnlyDirectories, CapabilityBoundingSet, SystemCallFilter). In Debian unstable the first services have started to pick up some of these hardening options (e.g. tor recently).
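
To make the unit-file approach a bit more concrete, here is a small sketch that renders a `[Service]` section using the four directives named above. The directive names come from the comment; the values (paths, capability, syscall names) and the helper `render_service_section` are assumptions chosen for illustration, not settings taken from any real service.

```python
from typing import Dict

# Directive names as mentioned above; all values are illustrative assumptions.
EXAMPLE_HARDENING: Dict[str, str] = {
    "InaccessibleDirectories": "/home",
    "ReadOnlyDirectories": "/etc",
    "CapabilityBoundingSet": "CAP_NET_BIND_SERVICE",
    # A leading "~" turns the list into a deny list of syscalls.
    "SystemCallFilter": "~mount ptrace",
}


def render_service_section(directives: Dict[str, str]) -> str:
    """Render hardening directives as the [Service] section of a unit file."""
    lines = ["[Service]"]
    for name, value in directives.items():
        lines.append(f"{name}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_service_section(EXAMPLE_HARDENING), end="")
```

The rendered text could be placed in a unit file or a drop-in; which values are appropriate depends entirely on the service being contained.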

Starting with Debian stretch systemd will likely be the superior solution to contain risky services, but at the moment two important factors are in favour of firejail:

  • We have a significant number of trusty installations (so w/o systemd) and migrating them to Debian will take its time, while we can use firejail on trusty. (We won't be able to use firejail on precise: it needs Linux 3.5 at minimum for seccomp, and namespace support has come a long way since Linux 3.2.)
  • systemd in Debian jessie doesn't have seccomp support enabled, so we'd need to carry a locally patched version in jessie (it was enabled in Debian unstable only last week)
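
The kernel requirement in the first point can be checked mechanically. The sketch below compares a kernel release string against the Linux 3.5 minimum stated above; the helper names and the sample release strings are illustrative assumptions, not output collected from actual hosts.

```python
import re
from typing import Tuple

MIN_KERNEL: Tuple[int, int] = (3, 5)  # minimum for seccomp, as stated above


def kernel_version(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '3.13.0-24-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(release: str, minimum: Tuple[int, int] = MIN_KERNEL) -> bool:
    """Return True if the kernel release is at least the given minimum."""
    return kernel_version(release) >= minimum


if __name__ == "__main__":
    import platform

    # Check the kernel of the machine this script runs on.
    print(meets_minimum(platform.release()))
```

Comparing tuples of integers avoids the pitfall of string comparison, where "3.13" would sort before "3.5".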

Firejail was stuck in the Debian NEW queue (where packages are reviewed before inclusion into the archive) for half a year and was only accepted into the archive today. (As such, the version is outdated by now, but I suppose that'll change soon.)

As the next step I'll prepare a 0.9.26 build for apt.wikimedia.org and firejail an existing service to see how smoothly it works out in production. Any suggestions for a candidate service?

Excellent news!

> As the next step I'll prepare a 0.9.26 build for apt.wikimedia.org and firejail an existing service to see how smooth it works out in production. Any suggestions for a candidate service?

Actually, 4 for now. zotero, citoid, mathoid. The first one is xulrunner-based, the other 3 nodejs services. And I've got more suggestions if all pans out well.

Yeah, zotero/xulrunner is probably the highest risk service right now.

There are the image scalers too, which may be even higher risk. Those are invoked by shelling out from MediaWiki, so I'm not sure how firejail works in that scenario.

Change 219177 had a related patch set uploaded (by Alexandros Kosiaris):
Allow optional firejail containment for nodejs services.

https://gerrit.wikimedia.org/r/219177

Change 219331 had a related patch set uploaded (by Muehlenhoff):
Enable firejail for mathoid

https://gerrit.wikimedia.org/r/219331

MoritzMuehlenhoff renamed this task from Evaluate firejail to Service containment for nodejs-based services with firejail. Jun 19 2015, 1:06 PM
MoritzMuehlenhoff set Security to None.

Tests with mathoid on deployment-mathoid were successful (both the standard settings and the not-yet-enabled-in-prod Java-based PNG generation).

This looks like a good step towards containers, even if it doesn't give us filesystem / userspace isolation and deployment-related functionality yet.

Regarding your points in favor of firejail, I think only the second one really matters for services. I am not aware of any node service that functionally requires trusty on the underlying hardware node; the main reason why some hosts aren't on jessie yet is that we haven't gotten to upgrading them yet. This situation does have some costs (need to still support upstart for some services, inability to start leveraging systemd), so it might be worth considering expediting the upgrade, rather than investing more time into trusty support.

> the main reason why some hosts aren't on jessie yet is that we haven't gotten to upgrading them yet.

For reference, that'd be T96017: Migrate SCA cluster to SCB (Jessie and Node 4.2)

To illustrate why we are interested in eventually having support for systemd containers with images: In the last two days we ran into two (still unresolved) instances where our staging cluster and production behave differently, despite both running jessie.

  1. metrics stop working in production when we upgrade to cassandra 2.1.6, but are working fine with 2.1.6 in staging
  2. restbase memory limiting is seemingly broken in production, but the same code works fine when load testing in staging

Change 219177 merged by Muehlenhoff:
Allow optional firejail containment for nodejs services.

https://gerrit.wikimedia.org/r/219177

Change 219331 merged by Alexandros Kosiaris:
Enable firejail for mathoid

https://gerrit.wikimedia.org/r/219331

Change 226273 had a related patch set uploaded (by Muehlenhoff):
Remove firejail conditional

https://gerrit.wikimedia.org/r/226273

Change 226273 merged by Muehlenhoff:
Remove firejail conditional

https://gerrit.wikimedia.org/r/226273

firejail is now enabled by default for service::node