
(informal) Security Concept Review For LibUp 2.0
Closed, Resolved · Public

Description

I'm mostly requesting an informal review for this since it's an already-existing project, but given the recent issues with Gerrit account security, more eyeballs are better.

The significant change is that I'd like to have an active SSH agent running on a Cloud VPS instance with access to a +2-privileged Gerrit account.

Project Information

  • Name of project: LibUp 2.0
  • Project home page: https://www.mediawiki.org/wiki/Libraryupgrader
  • Name of team which owns the project: n/a
  • Primary contact for the project: @Legoktm
  • Target date for deployment: End of July
  • Link to code repository: https://gerrit.wikimedia.org/g/labs/libraryupgrader
  • Is this a brand-new project: No
  • Has this project ever been reviewed before: (Phab tasks, etc.): ish. T174760 is the closest. @thcipriani walked through some of the security implications with me on IRC.
  • Has any risk assessment (STRIDE, etc.) been performed: No
  • Is there an existing RFC or has this been presented to the community: Kind of. This is just an evolution of the current libraryupgrader.
  • Is this project tied to a team quarterly goal: "Reduce dependence upon Legoktm because he's getting busier" but it's not a real goal :-)
  • Does this project require its own privacy policy: No

Description of the project and how it will be used

See https://www.mediawiki.org/wiki/Libraryupgrader/2.0

Description of any sensitive data to be collected or exposed

Mostly it'll have access to a +2-enabled Gerrit account.

Technologies employed

Python (flask/celery), docker, rabbitmq, systemd, Cloud VPS.

Dependencies and vendor code

see https://gerrit.wikimedia.org/r/plugins/gitiles/labs/libraryupgrader/+/master/setup.py

Working test environment

The libraryupgrader VPS project is mostly set up - I just haven't finished writing all the code. I can give people access as requested.

Event Timeline

Legoktm created this task. · Jul 11 2019, 8:09 PM
Restricted Application added a subscriber: Aklapper. · Jul 11 2019, 8:09 PM
sbassett triaged this task as Normal priority. · Jul 12 2019, 1:43 PM
sbassett claimed this task. · Jul 16 2019, 5:21 PM
sbassett moved this task from Backlog to In Progress on the Security-Team-Reviews board.
sbassett removed a subscriber: sbassett.
sbassett added a project: Restricted Project. · Jul 16 2019, 5:34 PM

Will try to look at this soon and get some comments/questions up. Would also love for @Reedy to do the same if he has a minute, once he's back next week.

sbassett added a comment. · Edited · Jul 24 2019, 10:10 PM

Hey @Legoktm -

Thinking through the security concerns:

  1. I was wondering if there's precedent for the ssh agent piece under Cloud VPS, i.e. if anyone has done or is currently doing anything like this for another project. I assume Cloud-Services folks are fine with this sort of setup if reasonable security best practices are being followed. Also assuming the ssh key passphrase would still be stored within your password manager (or someplace secure) for backup purposes.
  2. "it's up to the person doing the whitelisting to review diffs since the previous version" - will this still be you for now until more libup maintainers are recruited? I obviously trust your spot-checks though I understand that doesn't scale. Wondering if there's any automation which could be explored to make this a bit less painful (vuln dbs [though many of these library upgrades are probably triggered by that], static analysis tools, obvious evil patterns, etc.)
  3. "An attacker would need to inject content into these files OR subtly trick humans into +2'ing changes to other files." - does libup perform any spoofing checks? Like running a composer/npm install/update and ensuring the lock files (maybe hashes of them) are as expected?
  4. Has a means to throttle libup ever been considered? Just thinking through recent gerrit/diffusion mass vandalism attacks - not sure if that's a possible attack surface here or not.

Otherwise I think libup 2.0 sounds pretty sane to me. Not sure if @Reedy has any additional thoughts.

Hey @Legoktm -
Thinking through the security concerns:

  1. I was wondering if there's precedent for the ssh agent piece under Cloud VPS, i.e. if anyone has done or is currently doing anything like this for another project. I assume Cloud-Services folks are fine with this sort of setup if reasonable security best practices are being followed. Also assuming the ssh key passphrase would still be stored within your password manager (or someplace secure) for backup purposes.

To the best of my knowledge, no. The only other +2 bot is l10n-bot, and that runs from the translatewiki server. cc @bd808 for the cloud services opinion. My understanding is that only project members (currently just me, and all new maintainers would also need to be +2'ers themselves) and cloud services roots (all very trusted people / already have +2) have access to the instance. I haven't fully thought out the attack scenarios against cloud services post-Spectre, but I'm not aware of any gaping holes.

  2. "it's up to the person doing the whitelisting to review diffs since the previous version" - will this still be you for now until more libup maintainers are recruited? I obviously trust your spot-checks though I understand that doesn't scale. Wondering if there's any automation which could be explored to make this a bit less painful (vuln dbs [though many of these library upgrades are probably triggered by that], static analysis tools, obvious evil patterns, etc.)

No, and this step can be done by any mediawiki.org administrator (so far James F has added some stuff). I didn't want to implement authentication/authorization myself, so I'm using MediaWiki for it.

I think we definitely could use some more automation/analysis here, but I don't feel like libup is a good place to develop/implement these tools. Rather they should just be part of CI or some other service that does checks against dependencies.

I do think that libup could probably do a basic check to ensure we're not introducing any new vulns when upgrading stuff.

  3. "An attacker would need to inject content into these files OR subtly trick humans into +2'ing changes to other files." - does libup perform any spoofing checks? Like running a composer/npm install/update and ensuring the lock files (maybe hashes of them) are as expected?

Yes, but not in an effective manner. Theoretically npm ci does those kinds of verifications, but it also executes those packages, theoretically allowing malicious code to run after our check. Any such check would have to be purely our trusted code, at which point we're reimplementing parts of composer/npm that I'd rather not.
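
For readers following along, the "hashes of the lock files" idea from the question could look roughly like the sketch below. This is only an illustration under assumptions, not what libup does: the function names and the idea of a stored expected digest are hypothetical. It also only detects that a lock file changed; it does not address the concern above that installing packages executes their code.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def lockfile_matches(path: Path, expected_hex: str) -> bool:
    """Compare a lock file against a digest recorded earlier (hypothetical)."""
    return sha256_of(path) == expected_hex.lower()
```

A caller would record the digest of, say, package-lock.json at review time and refuse to proceed if `lockfile_matches` later returns False.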

  4. Has a means to throttle libup ever been considered? Just thinking through recent gerrit/diffusion mass vandalism attacks - not sure if that's a possible attack surface here or not.

libup is throttled, but the primary intention is to avoid overloading CI. It checks that there are no more than 3 patches going through the test and gate-and-submit queues before pushing another patch to Gerrit. This usually also gives humans priority in CI time. In practice, it takes about 14-18 hours for a full libup run if all 900+ repositories are touched. Weekends will usually run faster because of less human traffic.
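
A minimal sketch of the throttling rule described above, assuming a hypothetical `count_queued` callable that reports how many patches are currently in the test and gate-and-submit queues. How libup actually obtains that number is not stated in this task, and the 30-second poll interval is made up for illustration.

```python
import time
from typing import Callable

MAX_QUEUED = 3  # "no more than 3 patches" per the comment above


def wait_for_capacity(
    count_queued: Callable[[], int],
    limit: int = MAX_QUEUED,
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Block until the queue count is at or below the limit.

    Returns how many times we had to wait before capacity was available.
    """
    waits = 0
    while count_queued() > limit:
        sleep(poll_seconds)
        waits += 1
    return waits
```

The caller would invoke `wait_for_capacity(...)` before each push to Gerrit. Passing `sleep` as a parameter keeps the loop easy to exercise without real delays.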

Otherwise I think libup 2.0 sounds pretty sane to me. Not sure if @Reedy has any additional thoughts.

\o/

No, and this step can be done by any mediawiki.org administrator (so far James F has added some stuff). I didn't want to implement authentication/authorization myself, so I'm using MediaWiki for it.

I'd probably prefer this was a bit more restricted, though I assume mw sysops are active/trusted enough and various automated checks and sets of eyes on gerrit patch sets will ensure a reasonable amount of security here.

I think we definitely could use some more automation/analysis here, but I don't feel like libup is a good place to develop/implement these tools. Rather they should just be part of CI or some other service that does checks against dependencies.
I do think that libup could probably do a basic check to ensure we're not introducing any new vulns when upgrading stuff.

That's fair.

libup is throttled, but the primary intention is to avoid overloading CI. It checks that there are no more than 3 patches going through the test and gate-and-submit queues before pushing another patch to Gerrit. This usually also gives humans priority in CI time. In practice, it takes about 14-18 hours for a full libup run if all 900+ repositories are touched. Weekends will usually run faster because of less human traffic.

Ok, I guess any security concerns here could piggyback on this throttling. Though this doesn't speak well of zuul as a potential attack surface :/

No, and this step can be done by any mediawiki.org administrator (so far James F has added some stuff). I didn't want to implement authentication/authorization myself, so I'm using MediaWiki for it.

I'd probably prefer this was a bit more restricted, though I assume mw sysops are active/trusted enough and various automated checks and sets of eyes on gerrit patch sets will ensure a reasonable amount of security here.

Another option I've been thinking about since yesterday: we could keep the JSON config file in a Gerrit repository that is owned by mediawiki, and have libup auto-pull it before it runs. That lets us piggyback off of the Gerrit authorization/permissions scheme, but we still have the history/attribution that the wiki page gives us. Would that address your concerns?

I assume Cloud-Services folks are fine with this sort of setup if reasonable security best practices are being followed.

Speaking in the most broad terms, Cloud Services does not proscribe the use of ssh private keys from inside Cloud-VPS projects. We do urge anyone forwarding a local agent into the environment or hosting a key directly on a Cloud VPS instance to assume that there are hostile parties with access to the same instance and network and act accordingly.

In this particular case, @Legoktm controls the library-upgrader project, and is currently the only user (without global root powers) who has access to the instances there. Barring compromise of the virtual machine instance via a remote network exploit or a cross-tenant escalation (instance in another project finds a way to tunnel across to instance in this project via a hypervisor breakout to the hosting hardware), any key material should be reasonably secure. "Trusted member group" is a key part of this, as anyone with ssh access to the instance gains a whole new set of possible escalation paths for exfiltrating key material or using the key material directly from the instance. Stated more clearly for folks who may be following along but not used to the pessimism of written risk assessments, I think that a Cloud VPS instance in a project with a trusted member group is no less (or more) secure than any other VPS from any other hosting provider. So if you would be comfortable with AWS, Rackspace, OVH, etc. hosting this project, that trust should translate to Cloud VPS hosting.

Another option I've been thinking about since yesterday: we could keep the JSON config file in a Gerrit repository that is owned by mediawiki, and have libup auto-pull it before it runs. That lets us piggyback off of the Gerrit authorization/permissions scheme, but we still have the history/attribution that the wiki page gives us. Would that address your concerns?

I'd prefer this approach given that mediawiki is a smaller group of trusted contributors and git makes more sense to me as a canonical store of config-related files.
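
The "auto-pull the JSON config from a Gerrit repository before each run" idea could be sketched as follows. Everything named here is an assumption for illustration: the checkout directory, the `config.json` filename, and the `refresh_config` helper are hypothetical, and the real layout was not decided in this task.

```python
import json
import subprocess
from pathlib import Path
from typing import Any, Callable, Dict


def refresh_config(
    repo_dir: Path,
    filename: str = "config.json",  # hypothetical filename
    run: Callable[..., Any] = subprocess.run,
) -> Dict[str, Any]:
    """Update a local checkout of the config repo, then load the JSON file."""
    # --ff-only makes the pull fail loudly if history diverged or was rewritten,
    # rather than silently merging unexpected changes.
    run(["git", "-C", str(repo_dir), "pull", "--ff-only"], check=True)
    with (Path(repo_dir) / filename).open(encoding="utf-8") as handle:
        config = json.load(handle)
    if not isinstance(config, dict):
        raise ValueError("config file must contain a JSON object")
    return config
```

With this shape, who may change the config is decided entirely by the Gerrit repository's permissions, and `git log` on the checkout provides the history and attribution.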

Ok, thanks for the input, @bd808!

@Legoktm - Is there anything else to follow up on here? If not, I'm fine with resolving this task. If you'd like a more formal security (read: code) review down the road, you can definitely submit a new request for that.

Legoktm closed this task as Resolved. · Jul 30 2019, 5:18 PM

I think we're all good. To recap:

Thank you for the review!

sbassett moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board. · Tue, Oct 29, 3:53 PM