git/http operations in scap should be secure
Open, Medium, Public

Description

Since our git operations (over http) go cross-dc, we should secure them.

Event Timeline

Isn't all inter-datacenter traffic encrypted via IPsec tunnels?

From the parent task, "We currently have IPsec deployed for some limited use cases. "

Sounds to me like it should be a service provided by the infrastructure. I.e., all of our crap should not have to care about encrypting. That ends up being way simpler, since you can deploy wtf you want and have a guarantee from the underlying layer that encryption is achieved.

From Faidon in a related task (same thing, but for Cirrus/ElasticSearch)

I realize it's a ton more work and hardware, and I honestly don't even know what would be involved. I wonder, though, if the application level is the right level to encrypt at? Would it instead make more sense to encrypt everything at the network level, such that data leaving eqiad gets encrypted, and gets decrypted on entering codfw?

This is a good question — we've deliberated this a number of times, asking that question ourselves (and some of those times, I've been on your side too :).

There are a few compounding factors for not going via such a route:

  • Network-level encryption would mean that instead of distributing the crypto load to hundreds of commodity servers, we'd create a few chokepoints that would need to be quite beefy and thus much more expensive. Realistically we're talking about the so-called multiservices PICs for our Juniper hardware that would do IPsec crypto across all of our 10G network core, amounting to many hundreds of thousands of dollars both initially and in the long run, as these would need to be refreshed every few years, would need expensive service contracts to maintain, and we would need to procure them for each new piece of network gear we'd buy (e.g. for a new PoP).
  • Creating a few chokepoints makes us more vulnerable, as an adversary would need to only attack those and gain access to all encrypted traffic (security SPOFs). Moreover, as these would be proprietary hardware, they'd eventually suffer from poorer crypto, as vendors typically lag behind or would charge extra for newer hardware, poorer automation (PKI/key rollovers etc.) as well as potentially from unaudited, potentially backdoored code (see the recent revelations about multiple backdoors in Juniper ScreenOS firewalls/VPN gateways and the backdoor on Fortinet firewalls, as well as the older “Cisco upgrade factory” Snowden leak).
  • Finally, from a security design perspective, it's dangerous to consider the network safe and only perform crypto (or authentication) at the outer layers; IPsec/MACsec crypto across our datacenter links would protect us from fiber tapping, but would do little to protect us from attacks within the same datacenter (a compromised switch, or a compromised server performing man-in-the-middle between two other servers etc.). Performing end-to-end crypto between trusted endpoints is a much safer way to approach this.

So, all in all, we've come to the conclusion that it would make sense for Wikimedia, on technical, financial and philosophical grounds, to rely on existing FL/OSS solutions (namely: IPsec + multiple TLS implementations) and do this in software, using cheap, existing hardware (AES-NI instructions are standard in current CPUs, and with them the cost of crypto is negligible, as proven by our HTTPS-only switch) and automation/orchestration. We still have some way to go (we still lack a good internal PKI, Varnish doesn't support TLS natively, ElasticSearch/this task etc.) but these are mostly one-off costs that are for sure cheaper than the alternatives.

So we've got two options going forward, neither of which is terribly hard.

  1. We can generate some certs and slap them on the apache instance we use for git operations. Fairly trivial, just some config swaps.
  2. If we go with T116630: Remove apache dependency from scap3 deployment host, it needs to support TLS (and per discussion with @mark, probably only TLS)
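Whichever option is picked, the client side of the fetch also has to refuse plain HTTP and verify the server certificate. A minimal sketch of what that could look like, assuming a hypothetical helper name (`secure_git_fetch_argv`) and a hypothetical CA bundle path; `http.sslVerify` and `http.sslCAInfo` are standard git config keys, but none of this is scap's actual code:

```python
from urllib.parse import urlsplit


def secure_git_fetch_argv(url, ca_bundle):
    """Build a `git fetch` command line that only talks TLS.

    Hypothetical helper, not part of scap. `ca_bundle` is an assumed path
    to the CA certificate that signed the deploy server's certificate.
    """
    scheme = urlsplit(url).scheme.lower()
    if scheme != "https":
        # "Probably only TLS": reject http://, git:// and anything else.
        raise ValueError("refusing non-TLS git URL: %r" % (url,))
    return [
        "git",
        "-c", "http.sslVerify=true",          # fail on an untrusted cert
        "-c", "http.sslCAInfo=" + ca_bundle,  # trust only this CA bundle
        "fetch", url,
    ]
```

The returned list could be handed to `subprocess.check_call`; passing the settings with `-c` keeps them scoped to this one invocation instead of changing the repository config on the target.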

We discussed this in the deployment meeting today. @faidon suggested one option is for scap to deploy ephemeral TLS keys to the proxy nodes and then start up the git-https process.
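A rough sketch of the ephemeral-key idea, under stated assumptions: the function names, file paths and the one-day lifetime are all hypothetical, the certificate is self-signed via the `openssl req` CLI, and how the key actually reaches the proxy nodes (and what the "git-https process" is) is left open, as it is in the discussion above.

```python
import ssl


def ephemeral_cert_argv(common_name, key_path, cert_path, days=1):
    """Return an `openssl req` command that writes a short-lived,
    self-signed certificate and an unencrypted private key.

    Hypothetical helper; the one-day default lifetime is an assumption.
    """
    return [
        "openssl", "req", "-x509",
        "-newkey", "rsa:2048",
        "-nodes",                      # no passphrase on the private key
        "-keyout", key_path,
        "-out", cert_path,
        "-days", str(days),
        "-subj", "/CN=" + common_name,
    ]


def tls_only_server_context(cert_path, key_path):
    """Build a server-side TLS context from the ephemeral key pair."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    context.load_cert_chain(certfile=cert_path, keyfile=key_path)
    return context
```

A deploy run could execute the `openssl` command with `subprocess.check_call`, copy the resulting files to the proxy node, wrap the listening socket with the returned context, and delete the key once the deployment finishes. Clients would then need the generated certificate as their trust anchor for that run.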