Essentially, this task is about making https://appservers.svc.eqiad.wmnet/ and similar internal service URLs work over TLS. Ideally we do this for all internal service endpoints eventually, but MediaWiki is the biggest target to go after for the initial work of sorting this out. Host: headers can still be used to make requests to e.g. en.wikipedia.org over these TLS connections, but in that case the cert should match the hostname the client used to open the TCP connection (the service hostname), not the Host: header, IMHO.
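To make the Host: header point concrete, here is a minimal client-side sketch in Python. It assumes the internal CA bundle is available at a path like the one below (the path is hypothetical) and that the listener is on port 443; `make_context` and `fetch` are illustrative names, not existing tooling. The cert is verified against the service hostname used for the connection, while the Host: header picks the wiki.

```python
import http.client
import ssl

# Hypothetical path: wherever the internal WMF CA bundle lives on the client.
INTERNAL_CA_BUNDLE = "/etc/ssl/certs/internal-ca.pem"


def make_context(cafile=None):
    """Build a TLS context that verifies the server cert against the connect hostname."""
    # With cafile=None the system trust store is used; pass the internal CA
    # bundle to trust only that CA.
    ctx = ssl.create_default_context(cafile=cafile)
    # Both are already the defaults here; spelled out because they are the point.
    ctx.check_hostname = True
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def fetch(service_host, site_host, path="/", cafile=INTERNAL_CA_BUNDLE):
    """GET `path` from `service_host` over TLS while asking for `site_host`.

    SNI and certificate verification use `service_host`;
    only the HTTP Host: header carries `site_host`.
    """
    conn = http.client.HTTPSConnection(
        service_host, 443, context=make_context(cafile), timeout=10
    )
    try:
        # Supplying our own Host header stops http.client from adding one
        # derived from service_host.
        conn.request("GET", path, headers={"Host": site_host})
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()
```

Usage would look something like `fetch("appservers.svc.eqiad.wmnet", "en.wikipedia.org", "/wiki/Main_Page")`, which should fail the handshake unless the cert covers appservers.svc.eqiad.wmnet.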
In certificate terms, we need to create keys and issue certs (signed by the internal WMF CA that we already trust) for the virtual service hostnames. Probably the first step there is to come up with a definitive list of which clusters need which service hostnames. For example, should the same broad pool of mw* machines have a single SAN cert that covers both appservers.svc and api.svc, or should each cluster get its own cert?
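As a rough way to compare the two layouts (one cert per cluster versus a shared SAN cert), something like the following sketch could be used once the list exists. The cluster names and the fully expanded api.svc hostname are assumptions made for illustration, and `san_plan` is a hypothetical helper.

```python
# Hypothetical inventory: which virtual service hostnames each cluster answers for.
# The cluster names and the expanded api.svc hostname are assumptions.
CLUSTER_SERVICES = {
    "appserver": ["appservers.svc.eqiad.wmnet"],
    "api_appserver": ["api.svc.eqiad.wmnet"],
}


def san_plan(cluster_services, shared_groups=()):
    """Return {cert_name: sorted list of SANs}.

    Clusters listed together in `shared_groups` share one SAN cert;
    every other cluster gets a cert of its own.
    """
    plan = {}
    grouped = set()
    for group in shared_groups:
        name = "+".join(sorted(group))
        sans = set()
        for cluster in group:
            sans.update(cluster_services[cluster])
            grouped.add(cluster)
        plan[name] = sorted(sans)
    for cluster, hosts in cluster_services.items():
        if cluster not in grouped:
            plan[cluster] = sorted(hosts)
    return plan
```

Calling `san_plan(CLUSTER_SERVICES)` gives one cert per cluster, while `san_plan(CLUSTER_SERVICES, [("appserver", "api_appserver")])` gives a single cert whose SAN list covers both service hostnames.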
There are two basic obvious approaches to configuring the TLS listener:
- We could deploy the puppetized tlsproxy nginx configuration (with slight amendments) as an inbound TLS proxy that talks to apache. The advantage here is the simplicity of a known-good solution in configuration terms. The downside is that it adds another layer to the overall request-processing stack, which is one more component that can fail and complicates debugging, etc.
- We could configure the TLS listening part in the appservers' current apache instance. This is much better in terms of runtime/debugging complexity, but I'm not yet sure how difficult it would be to implement in apache terms: we'd need TLS with a cert matching the virtual service hostname, applied to all of the (~70?) VirtualHost declarations in the current apache-level configuration.
Eventually we'll want to use TLS Client Auth with this as well, but we could do that as a second step after the initial one-way auth. Even without client auth, we're gaining significant resistance to passive snooping.
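The difference between the two steps is small in TLS terms. The sketch below uses Python's `ssl` module purely as an illustration; the real listener would be nginx or apache, and `server_context` is a hypothetical name. One-way auth leaves client verification off, and the client-auth step turns it on.

```python
import ssl


def server_context(require_client_cert=False):
    """Server-side TLS context: one-way auth by default, client auth on request."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # A real listener would also call ctx.load_cert_chain(certfile, keyfile)
    # with the cert issued for the virtual service hostname.
    if require_client_cert:
        # Second step: refuse clients that do not present a cert. This also
        # needs ctx.load_verify_locations(cafile=...) pointing at the CA that
        # signs client certs.
        ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx
```

With `require_client_cert=False` the server proves its identity and the traffic is encrypted, which is what gives the resistance to passive snooping; flipping the flag adds the client side of the authentication.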