RFC: Streamlining Composer usage
Closed, DeclinedPublic

Description

https://www.mediawiki.org/wiki/Requests_for_comment/Streamlining_Composer_usage

Next steps in any order:
T125343: Upgrade integration/composer to 1.6.5 stable
Create a CI job that updates a branch with a build of vendor (T101123) that can be deployed to beta, similar to what is normally done for production when a new wmf branch is cut. See the workflow example in the RFC.

Related Objects

Event Timeline

https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-08-05-21.01.html
Actions:

  1. get csteipp and TimStarling to agree on what would be needed to trust automated downloads
  2. jzerebecki to refine requirements and threat models
  3. TimStarling to schedule another meeting on this RFC in 2 weeks

Re Action 1, the discussion that happened in T101123#1512379 after the meeting might help.

1 & 2 are probably related, so I'll add some comment here. Happy to move to another forum if needed.

The threats I've seen laid out, and my (very rough) evaluation of their risk. Happy to be corrected if it seems like I have assumptions that are wrong, or you disagree with my likelihood/impact score.

  1. MITM between composer and packagist - in my estimation, this has both High likelihood and High impact. It's a trivial technical attack to pull off without composer checking certificates, and allows the attacker to link their own, modified version of any library into the composed codebase.
    1. Risk: (using high = 3, med = 2, low = 1, and risk = likelihood x impact ), then 3 x 3 = 9
    2. Mitigation: Currently none (adding https certificate checking in composer and requiring all connections to be https shouldn't be too difficult to add and upstream)
  2. MITM between composer and packagist with valid packagist certificate - In the event that composer starts validating the https connection's certificate, is that enough? In my estimation, this is a fairly difficult attack to pull off for someone attacking the WMF's infrastructure, so I put the likelihood at Low. For normal developers running composer update on their laptop, this is still moderate, since buying an SSL MITM proxy that contains a code-signing certificate is still fairly expensive, so I'm going to say likelihood is Low to Medium. This has the same impact as #1.
    1. Risk: Low-Medium (1.5) x High (3) = 4.5
    2. Mitigation: Currently none (adding certificate pinning to composer would be moderately expensive, since it doesn't seem that composer would maintain that upstream)
  3. Github is compromised (the entire organization) - in my estimation, compromising github would be difficult. They have a security team that seems to be making reasonable choices currently. So I'd guess the likelihood is Low. The impact, aiui, would be High for normal composer usage (where just the tarball of a repo at a certain commit is downloaded, and does not appear to be integrity checked). If composer does get a checksum that it checks, or composer is set up to clone the repo and check out a sha1-hash (which seems to be hard to forge), then the impact would be reduced.
    1. Risk: Low (1) x High? (3) = 3?
    2. Mitigation: Currently none (adding certificate pinning to composer would be moderately expensive, since it doesn't seem that composer would maintain that upstream)
  4. Packagist is compromised (the entire organization) - Packagist concerns me a little more, since I don't know anything about their operational security. I'll do some quick research on that, or if anyone has a citation, happy to evaluate. The impact would again be High, because an attacker could add any code to any library by pointing to their own version of it.
    1. Risk: ? x High (3) = ?
    2. Mitigation: unknown
  5. Github repo is "compromised" by owner accepting hostile pull request - The likelihood seems Low to me, since if we've determined the library is of sufficient quality to include in our codebase (A MediaWiki developer has decided this is a good library to include, and the library has passed security review by my team), then I (hope) it's unlikely they would accept a malicious pull request. If it did happen, impact would be High, however.
    1. Risk: Low (1) x High (3) = 3
    2. Mitigation: Vetting of libraries included in MediaWiki (developer review, security review), updates to vendor are code reviewed
  6. Github repo is compromised by owner having their password stolen - Github mitigates this with things like mandatory https with HSTS, using OAuth to interact with other services, optional 2-factor authentication for accounts, review and expiration of potentially unsafe ssh keys. However, assuming many library owners are not going to take advantage of the optional security features, we can probably call the likelihood Medium, and the impact High, since it would allow the attacker to add any code to the repo.
    1. Risk: Medium (2) x High (3) = 6
    2. Mitigation: Updates to vendor are code reviewed
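
The scoring scheme used in the list above (high = 3, med = 2, low = 1, risk = likelihood x impact) can be tabulated in a few lines. This is only a sketch of the arithmetic in the comment: the threat labels are shortened, "Low to Medium" is taken as 1.5 as in item 2, and the unknown likelihood for item 4 is left as None rather than guessed.

```python
# Sketch of the risk arithmetic from the list above: risk = likelihood x impact.
LEVELS = {"low": 1, "low-medium": 1.5, "medium": 2, "high": 3}

# (threat, likelihood, impact); None marks the likelihood left open in item 4.
THREATS = [
    ("MITM, no certificate checking", "high", "high"),
    ("MITM with valid packagist certificate", "low-medium", "high"),
    ("Github compromised", "low", "high"),
    ("Packagist compromised", None, "high"),
    ("Owner accepts hostile pull request", "low", "high"),
    ("Owner password stolen", "medium", "high"),
]


def risk(likelihood, impact):
    """Return likelihood x impact, or None when either rating is unknown."""
    if likelihood is None or impact is None:
        return None
    return LEVELS[likelihood] * LEVELS[impact]


def ranked(threats):
    """Score every threat; known risks first, highest to lowest."""
    scored = [(name, risk(lik, imp)) for name, lik, imp in threats]
    known = sorted((s for s in scored if s[1] is not None), key=lambda s: -s[1])
    unknown = [s for s in scored if s[1] is None]
    return known + unknown


if __name__ == "__main__":
    for name, score in ranked(THREATS):
        print(f"{score if score is not None else '?':>4}  {name}")
```

Sorting by score puts the unauthenticated MITM case first and the stolen-password case second, which matches the order of concern in the comment.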

Currently I don't have enough time to catch up to the existing input. So I think we should wait with discussing this RFC again.

Qgil added a subscriber: Qgil.Sep 16 2015, 10:16 AM

A message to all open tasks related to the #Wikimania-Hackathon-2015. What do you need to complete this task? Do you need support from the Wikimedia Foundation to push it forward? Help promoting this project? Finding an intern to work on it? Organizing a developer sprint? Pitching it to WMF teams? Applying for a grant? If you need support, share your request at T107423: Evaluate which projects showcased at the Wikimania Hackathon 2015 should be supported further or contact me personally. Thank you!

My preferred way forward here is adding signature support to composer and packagist.org. Provided the signature implementation is not broken and the people trusted with signing are not compromised, that would prevent all of the threats from T105638#1515362 except 5.

It might, if I have time to write up a sketch/plan how to implement this in composer upstream and if the right people are interested and there.
The right people would be those doing the branch cut before deployment, doing the deployment, creating composer components, and using composer components in MediaWiki.

Qgil added a comment.Sep 16 2015, 9:50 PM

Thank you for proposing a session for the Wikimedia Developer Summit. Please complete your description -- see "Expected fields" at https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit_2016#Call_for_participation

@Qgil as far as I see it is complete, except that I need to update the RFC document.

definition of the problem

See RFC.

expected outcome at the Summit

Progress on the RFC. It would be nice if the WMF would make it a goal, but that is not strictly an expected outcome.

current status of the discussion

See RFC. But needs update.

links to background information

See this task's blocker and the RFC.

related tasks in Phabricator

See this task's blocker and the RFC.

Qgil added a comment.Oct 3 2015, 8:57 PM

Congratulations! This is one of the 52 proposals that made it through the first deadline of the Wikimedia-Developer-Summit-2016 selection process. Please pay attention to the next one: > By 6 Nov 2015, all Summit proposals must have active discussions and a Summit plan documented in the description. Proposals not reaching this critical mass can continue at their own path out of the Summit.

@Qgil I haven't seen any response from you, so I took the liberty to change the column on the work board.

We held E85: RFC Meeting: Streamlining Composer usage (2015-11-04), see Meeting summary. From that:

  • need to check if fingerprint in git verify-tag/commit output depends on gpg settings (jzerebecki)
  • AGREED: signed tag support in composer would be nice to have (TimStarling, 22:52:22)
  • rough consensus that everything sucks and our lives will be horrible regardless of which solution is implemented (TimStarling, 22:56:13)
  • @JanZerebecki will add more detailed full workflow into RFC

Here is an idea for a workflow-based solution that would work for nodejs as well:

  1. Each code project has a corresponding deploy repository. For nodejs, current practice is to have the code as a submodule of the deploy repository (in src/). For MediaWiki, current practice is to have a deploy / dependency repository inside the code repository (vendor/). It might be worth investigating if inverting the relationship could be an option for MediaWiki as well, as this avoids deploy updates polluting the code repository history.
  2. CI automatically updates the deploy repository for each test run by running composer / npm, and commits the result to git on successful test completion. The deploy repository hash is recorded in the test results.
  3. For a deploy, one of the CI-prepared deploy repository commits is reviewed and merged. The diff clearly shows changes in dependencies. A potential issue here is making sure that the submodule patch is actually merged, but it seems that this could be solved with a hook.

This workflow is very close to what we are currently doing for node services, except that step 2) is currently performed manually, using a docker script.
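
Step 2 of this workflow could be sketched roughly as follows. This is a sketch under assumptions, not the existing CI job or the docker script: the function name, the deploy path and the commit message format are hypothetical, and the choice of `composer update --no-dev` is a guess at what "running composer" would mean here. The commands are only assembled as argument lists, so nothing runs until a caller passes them to `subprocess.run`.

```python
# Hypothetical sketch of step 2: refresh a deploy repository from CI.
# Function name, paths and commit message are illustrative assumptions.
from pathlib import Path


def deploy_update_commands(deploy_repo, source_commit, tool="composer"):
    """Build the commands that would refresh and commit the deploy repository."""
    repo = str(Path(deploy_repo))
    if tool == "composer":
        # Resolve dependencies inside the deploy repository, without dev packages.
        install = ["composer", "update", "--no-dev", f"--working-dir={repo}"]
    elif tool == "npm":
        install = ["npm", "install", "--production", "--prefix", repo]
    else:
        raise ValueError(f"unsupported tool: {tool}")
    message = f"Update dependencies for {source_commit}"
    return [
        install,
        ["git", "-C", repo, "add", "--all"],
        ["git", "-C", repo, "commit", "-m", message],
        # The hash printed here is what would be recorded in the test results.
        ["git", "-C", repo, "rev-parse", "HEAD"],
    ]
```

Keeping the command list separate from its execution makes it easy to log or review exactly what CI would do before the resulting deploy commit is proposed for step 3.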

@GWicke: very interesting suggestion. I like it, though I think I need to think it over a bit more before I can say for sure that it would work for mediawiki deployments.

TODO: See if something better than plain gpg1/2 verify exists and if git verify can be enhanced to verify if the key that made the signature is trusted, e.g. to have less of the implementation in composer and make it possible to share this with other package managers.
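
One way to picture the "is the signing key trusted" check is to compare the fingerprint that GnuPG reports against an explicit allowlist. The sketch below assumes the machine-readable status lines that `git verify-tag --raw` passes through from gpg (a `[GNUPG:] VALIDSIG <fingerprint> ...` line); the function names and the allowlist are hypothetical. It does not answer whether the output depends on local gpg settings, which is the open question from the meeting.

```python
# Hypothetical sketch: accept a signed tag only if gpg reports a valid
# signature made by a key whose fingerprint is on an explicit allowlist.
import subprocess

STATUS_PREFIX = "[GNUPG:] VALIDSIG "


def valid_fingerprints(status_text):
    """Extract signing-key fingerprints from VALIDSIG lines of gpg status output."""
    found = set()
    for line in status_text.splitlines():
        if line.startswith(STATUS_PREFIX):
            fields = line[len(STATUS_PREFIX):].split()
            if fields:
                found.add(fields[0].upper())
    return found


def is_trusted(status_text, allowlist):
    """True if at least one valid signature comes from an allowlisted key."""
    allowed = {fpr.replace(" ", "").upper() for fpr in allowlist}
    return bool(valid_fingerprints(status_text) & allowed)


def verify_tag(repo, tag, allowlist):
    """Run git verify-tag and apply the allowlist to its raw gpg status."""
    result = subprocess.run(
        ["git", "-C", repo, "verify-tag", "--raw", tag],
        capture_output=True, text=True,
    )
    # With --raw, git prints the gpg status lines on standard error.
    return result.returncode == 0 and is_trusted(result.stderr, allowlist)
```

Because the parsing is separate from the git call, the same check could in principle be reused by other package managers, which is the sharing goal mentioned in the TODO.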

Tgr added a subscriber: Tgr.Nov 27 2015, 9:45 PM

The librarization project is somewhat blocked on this (or rather on the Composer security problems outlined here), see comments in https://gerrit.wikimedia.org/r/#/c/248661/

Copy of the etherpad:

(missing start of talk, a minute or so)

Step 4: CI job that automatically triggers after changes to core or ext
* run composer update and push changes

Step 5: Update BC from steps 3-4

Are these sufficient examples? Can people imagine difficulties?

* If files are added to vendor, how do we deal with this in code review/security patches?
** Deployed as a security patch
* We would keep this strategy, no automatic composer run

Questions about the RfC in general?

* Bryan: one of the issues we got to when discussing this last was which use case is being optimized

* Relying on GPG tags for repos doesn't cover many vendor things
** System that Jan is proposing is a Debian style web of trust to composer usage where we rely on GPG hashes to authenticate code
** For composer things that aren't in that web of trust there would need to be a legacy system. Maybe run mirrors. Is that at least sane?
** There's an inherent problem with trust if you're downloading vendor packages (?) If we want to make sure we're only using code we intended to, we need to sign it

* Is there an open upstream issue for this?
** Basic sentiment is that composer is not designed to support
** all issues regarding verification/authentication of packages are lumped together
*** a lot of ideas and no direction
*** Some like Jan have given a proof of concept but it went nowhere
** Not a lot of interest upstream

There are components already that are signed
* e.g. phpunit

We would need to sign some vendor components but most are ours already

Can we just run some proxy for upstream components?
* Same problems with verification

What do you do now?
* I don't update vendor.

Main driver use case is WikiData/WikiBase
* double code review problems

Problem with moving to composer components is the overhead of more work
* much easier to develop a new MW ext than to create a component
* making this easier is key to adopting new conventions

Counter-proposal was drop GPG
* From security perspective would be insufficient. Jan wouldn't want to advertise this solution

Was proposal to start doing tag signing? What?
* Automate vendor updates using signed git tags, modifying composer
** control of keys etc.
* Would it be a Jenkins job? Yes, updates to core or anything that has composer.json
** Possibly before BC deployment

There are packages we don't control and fundamental rewrite of composer
* It's not a total rewrite. There are a few places to change to support this (around finishing of download; introduce verification)
** There are currently problems with the implementation of verification in composer (succeeds when it shouldn't)

Getting support into composer is hardest part but signing should be easier
* Signing might be easier for repos we control but not 3rd parties
* We can trust 3rd party keys or make clone (mirror) of repos and add signing following security review
* e.g., phpunit
** on WMF servers we'd maintain clone of source repo
** sign it ourselves (after review?)
** in MW vendors composer.json use stanza for copying it from our mirror

We could package?
** When a Debian packager packages something (e.g. phpunit), where do they get the code from?
*** Disconnect. You can verify at the point of installation once or have to verify it every time. Verifying every time has more potential for man-in-the-middle attacks

We have these same problems for NPM modules in *oids
* You find crazy things like libssl in vendor

Python/pip as well.
* Verification isn't required unless you tell it

In packaging (Debian for example)
* Every package you install has been verified
** Biggest problems are that packages are old, or don't exist

Jan is trying to solve WMF specific problem
* What does he need? 
** Someone to build it.

From security perspective, what are you solving?
* Attack vector?
* Requiring HTTPS and checking validity of cert is a start
* Verifying packages is a step further. Requires more work in signing
* Starting with a patched version of composer that requires HTTPS
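
As an illustration of the first and last bullet points, this is roughly what "require HTTPS and check the validity of the cert" means for a download client. It is a Python sketch, not Composer's code or the proposed patch; the function names are hypothetical.

```python
# Hypothetical sketch: refuse non-HTTPS URLs and verify the server certificate.
import ssl
import urllib.request
from urllib.parse import urlparse


def strict_context():
    """TLS context that verifies the certificate chain and the hostname."""
    context = ssl.create_default_context()
    # These are already the defaults; stated explicitly for clarity.
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context


def fetch_https(url, timeout=30):
    """Download url, rejecting any scheme other than https."""
    if urlparse(url).scheme != "https":
        raise ValueError(f"refusing non-HTTPS URL: {url}")
    with urllib.request.urlopen(url, timeout=timeout, context=strict_context()) as resp:
        # urlopen follows redirects, so check where the response came from.
        if urlparse(resp.geturl()).scheme != "https":
            raise ValueError(f"redirected to non-HTTPS URL: {resp.geturl()}")
        return resp.read()
```

This only covers the transport. Verifying the packages themselves, the next bullet, still requires signatures or checksums on top of it.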

Tags are a terrible thing to trust. Trust only SHA1s

For everything you're going to pull in from composer, fork/mirror

Qgil removed a subscriber: Qgil.Jan 7 2016, 2:13 PM

Wikimedia Developer Summit 2016 ended two weeks ago. This task is still open. If the session in this task took place, please make sure 1) that the session Etherpad notes are linked from this task, 2) that followup tasks for any actions identified have been created and linked from this task, 3) to change the status of this task to "resolved". If this session did not take place, change the task status to "declined". If this task itself has become a well-defined action which is not finished yet, drag and drop this task into the "Work continues after Summit" column on the project workboard. Thank you for your help!

JanZerebecki raised the priority of this task from Normal to High.May 2 2016, 9:33 AM
JanZerebecki added a subscriber: adrianheine.

T133995#2254431 @adrianheine I upgraded composer in our CI. Next, for T125343, we probably need to figure out the target environment.

RobLa-WMF mentioned this in Unknown Object (Event).May 4 2016, 7:33 PM
JanZerebecki updated the task description. (Show Details)Jun 10 2016, 2:38 PM
JanZerebecki lowered the priority of this task from High to Lowest.Jul 8 2016, 2:52 PM

Adjusting priority to what I perceive to be the actual demand. Sorry, overestimated the amount of my free time.

RobLa-WMF removed JanZerebecki as the assignee of this task.Jul 8 2016, 7:28 PM
RobLa-WMF raised the priority of this task from Lowest to Needs Triage.
RobLa-WMF moved this task from Backlog to Inbox on the TechCom-RFC board.
RobLa-WMF added a subscriber: RobLa-WMF.

This seems worth discussing in the ArchCom RFC triage process (discussion: Z425 or wikitech-l)

RobLa-WMF moved this task from Inbox to (unused) on the TechCom-RFC board.Jul 20 2016, 6:14 AM

We discussed this at E227 last week, and @Krinkle suggested a more substantive response to @csteipp's T105638#1515362 review would be helpful.

This also needs a shepherd to move forward. @daniel, is that something you can take on?

Shepherd assigned in E236

daniel moved this task from (unused) to Backlog on the TechCom-RFC board.Sep 7 2016, 8:38 PM
daniel removed daniel as the assignee of this task.Sep 22 2016, 5:58 PM

stalled, dropping shepherdship

Tgr added a comment.Sep 22 2016, 9:25 PM

Is this blocked on something specific or just lack of interest? As I understood the original blocker was proper HTTPS support in Composer and that was fixed upstream some months ago.

Lack of commitment, lack of resources. I still think it would be a good idea to do it, but I'm not sure that counts as "interest".

We should get rid of mediawiki/vendor at some point...

Just going to poke this to see what people consider the state of this?
Personally I think our composer usage is rather streamlined right now (except in core we have super strict version dependencies).

Krinkle closed this task as Declined.EditedOct 11 2017, 8:47 PM
Krinkle triaged this task as Normal priority.

Personally I think our composer usage is rather streamlined right now

Indeed. Our conventions around how Composer should be used in MediaWiki development, release and deployment have settled. I don't think we're in an ideal or perfect situation but it seems to be working well enough right now and meets the original criteria for adopting Composer.

However, the RFC about agreeing how to use Composer (initially) is a separate one: https://www.mediawiki.org/wiki/Requests_for_comment/Composer_managed_libraries_for_use_on_WMF_cluster

Despite its generic name, this RFC (T105638) was not so much about setting up how we use Composer, but changing how we use Composer to be different. Specifically to address complexity with regards to merging composer.json files from multiple sources whilst still making it easy and consistent to test and install separately. This was mainly for the use case of the Wikidata build, which is being changed to not have its own vendor install anymore, thus making this an RFC without a shepherd/stakeholder.

The idea of automating mediawiki/vendor and/or to move the Composer step to be during CI or deployment, instead of in Git, remains interesting for the future. But following a reevaluation from TechCom, I'm closing this RFC given the original use case is becoming obsolete, and there is no longer resourcing from the affected teams for this problem.

Affected teams (e.g. Release Engineering or MediaWiki Platform Team) are free to create a new RFC at any time, possibly borrowing some of the ideas from this RFC.