enable https for (ubuntu|apt|mirrors).wikimedia.org
Closed, Resolved, Public

Description

http://mirrors.wikimedia.org/ works

https://mirrors.wikimedia.org/ does not

Ditto for the other two hostnames, and they're all currently hosted on carbon.wikimedia.org

This is one of our last important cases for wikimedia.org hostnames that lack HTTPS, as mentioned on T132521#2202250

Related Objects

22 related tasks: 21 Resolved, 1 Declined. Assignees: BBlack (15), Dzahn (2), Jgreen, Chmarkine, ezachte, Krinkle (the Declined one), and one unassigned.

Event Timeline

BBlack renamed this task from "enable https for mirrors.wikimedia.org" to "enable https for (ubuntu|apt|mirrors).wikimedia.org". (Apr 14 2016, 1:27 PM)
BBlack updated the task description.

Is this just about adding https or also about enforcing it?

And would this be about a certificate on carbon, or could it be behind Varnish even though it's an APT mirror?

It's about enabling HTTPS with valid certificates for all 3, but not about enforcing it with a redirect (or anything else more advanced). We actually can't enforce a redirect at this time, as it would break some package-fetching clients.

I'm not sure yet what the best way is to proceed.

Thinking out loud:

  1. Put it behind cache_misc the simplest way
    • We'd create a new IP for this, assign it to cache_misc, map the 3x service hostnames to that new IP on cache_misc and backend them to carbon.wikimedia.org:80 (actually it doesn't even have to be a new IP in this case...)
    • We could avoid thrashing the caches by having them be pass-only.
    • The primary problem with this is the chicken-and-egg dependency: newly-installed cache_misc nodes would probably try to use themselves (configured on loopback) for the mirror IP when installing the packages they need to actually function for that purpose (like our nginx package). Maybe we can fix puppet dependency ordering to ensure that the lvs::realserver IPs are the dead-last thing configured, or something similar? Also, I don't think LVS servers can actually contact LVS virtual services, so how would they get package updates? Perhaps in both cases the workaround could be that on LVS and cache_misc hosts, an /etc/hosts entry maps the apt|mirrors|ubuntu hostnames to carbon.wm.o's direct IP address, like today (see the sketch after this list).
    • Pros: standardized TLS termination, no buying/maintaining more certs, might eventually actually have it cache some things and speed up installs in remote DCs?
    • Cons: dealing with chicken-and-egg above
  2. Put it behind cache_misc only for public access, change to internal service hostnames for internal HTTP-only use.
    • Some kind of scheme where we give carbon an internal IP too (via vlan trunking or a separate interface), and move all internal access to use (apt|mirrors|ubuntu).eqiad.wmnet or some such internal hostname/IP, which is HTTP-only, and then pass only the public hostnames through cache_misc for TLS termination, routed to those internal service hostnames.
    • Cons: Sounds like a mess to make the switch
  3. LVS splits 80/443 routing between cache_misc/carbon.
    • It's a lot like the first option, but definitely needs a new separate IP, and the LVS service would be configured to send port 443 for the new IP to cache_misc for TLS (which backends to carbon), and port 80 directly to carbon itself.
    • This kills the cache_misc chicken-and-egg so long as we're fetching packages over HTTP like we do today, I think
    • We'd still have the problem of LVS not reaching LVS IPs, to solve with the /etc/hosts hack or similar, but on far fewer hosts.
  4. Buy a cert with the 3x SAN names for this, maintain/install it separately on carbon.
    • Pros: pretty straightforward
    • Cons: cash outlay, long-term maintenance burden of SSL certs
    • This is probably the sanest option
  5. Re-use our big primary production SAN cert on carbon directly
    • Pros: even more straightforward than above, and no new costs
    • Cons: security risk! We've never done this before, exactly because it's an additional sec risk to the private keys.
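
A minimal sketch of the /etc/hosts workaround mentioned in options 1 and 3 (the address is a documentation placeholder, not carbon's real IP):

  # /etc/hosts on LVS and cache_misc hosts: pin the mirror hostnames to
  # carbon's direct IP so package fetches bypass the LVS/cache path.
  192.0.2.10   apt.wikimedia.org mirrors.wikimedia.org ubuntu.wikimedia.org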

I think we should go with option 4 short-term. Then for the mid/long term, maybe we want to make this redundant, with one instance in each DC, which could possibly solve the chicken-and-egg problem so we can get to option 1. Or Let's Encrypt or our own CA for this, whichever comes first.

I suggest we use Let's Encrypt. It can issue SAN certificates.

> Can I get a certificate for multiple domain names (SAN certificates)?
> Yes, the same certificate can apply to several different names using the Subject Alternative Name (SAN) mechanism. The Let's Encrypt client automatically requests certificates for multiple names when requested to do so. The resulting certificates will be accepted by browsers for any of the domain names listed in them.

https://community.letsencrypt.org/t/frequently-asked-questions-faq/26
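
As a rough illustration, one SAN certificate covering all three names could be requested with the official client like this (the shared webroot path is an assumption):

  ./letsencrypt-auto certonly --webroot --webroot-path /srv --agree-tos \
      -d apt.wikimedia.org -d mirrors.wikimedia.org -d ubuntu.wikimedia.org

One caveat: the webroot challenge needs a directory that is served by all of the listed hostnames, which these services don't necessarily share, so separate certificates per name may end up simpler anyway.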

Option 4 sounds like the sanest (and easiest) to me too. apt and mirrors/ubuntu are really different services and might be split in the future (cf. T84817), so honestly I'd even get two different certificates.

BBlack renamed this task from "enable https for (ubuntu|apt|mirrors).wikimedia.org" to "enable https for (carbon|ubuntu|apt|mirrors).wikimedia.org". (Apr 15 2016, 11:16 AM)
BBlack updated the task description.

Added carbon to the list, since that actually is the HTTP hostname we use for some of the access to this service (contents are the same as apt.wm.o though).

Change 283638 had a related patch set uploaded (by BBlack):
refactor install_server web stuff towards SSL config

https://gerrit.wikimedia.org/r/283638

Change 283639 had a related patch set uploaded (by BBlack):
mirrors::serve: split mirrors/ubuntu site configs

https://gerrit.wikimedia.org/r/283639

Change 283638 merged by BBlack:
refactor install_server web stuff towards SSL config

https://gerrit.wikimedia.org/r/283638

Change 283639 merged by BBlack:
mirrors::serve: split mirrors/ubuntu site configs

https://gerrit.wikimedia.org/r/283639

BBlack renamed this task from "enable https for (carbon|ubuntu|apt|mirrors).wikimedia.org" to "enable https for (ubuntu|apt|mirrors).wikimedia.org". (Apr 15 2016, 1:22 PM)
BBlack updated the task description.

Change 283658 had a related patch set uploaded (by BBlack):
add dhparam to install web_server

https://gerrit.wikimedia.org/r/283658

Change 283659 had a related patch set uploaded (by BBlack):
add LE cert nginx config for carbon

https://gerrit.wikimedia.org/r/283659

Change 283658 merged by BBlack:
add dhparam to install web_server

https://gerrit.wikimedia.org/r/283658

Change 283659 merged by BBlack:
add LE cert nginx config for carbon

https://gerrit.wikimedia.org/r/283659

So, carbon now has working certs for apt, mirrors, and ubuntu, from Let's Encrypt. I ran the cert generation manually, and that part's not puppetized or managed properly yet, which needs to happen now. Also, since this machine is on precise, there's no LE client available via package management (ewwww), so I used the auto client from GitHub. Note that carbon is due for a reinstall to jessie (which has an LE client) soon anyway, which will clean up some of this mess.

FTR, the un-puppetized things I've done on carbon are, essentially:

  1. git clone https://github.com/letsencrypt/letsencrypt in my homedir
  2. run the ./letsencrypt-auto client in there and let it install some new ubuntu dependency packages on carbon (ewwww)
  3. commented out 3 lines of the auto client to let the above work correctly (don't try to install "virtualenv", just "python-virtualenv")
  4. Generated my 3x certs with these commands:
    • ./letsencrypt-auto certonly --webroot --webroot-path /srv --agree-tos -q -d apt.wikimedia.org
    • ./letsencrypt-auto certonly --webroot --webroot-path /srv/mirrors --agree-tos -q -d mirrors.wikimedia.org
    • ./letsencrypt-auto certonly --webroot --webroot-path /srv/mirrors --agree-tos -q -d ubuntu.wikimedia.org

The above populates /etc/letsencrypt with a bunch of things, some of which probably need to be backed up securely. In theory, so long as that directory is intact, we can do a daily cronjob for automated cert renewal automagically (the certs have a 90-day lifetime, and the cron will start trying to renew them when there are 30 days left).
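
A minimal sketch of such a cron entry, assuming the client stays in its cloned checkout (the path is invented):

  # /etc/cron.d/letsencrypt-renew (hypothetical): attempt renewal daily;
  # the client only re-issues certs that are within 30 days of expiry.
  17 4 * * * root /root/letsencrypt/letsencrypt-auto renew --quiet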

TODO: puppetizing all of this to the degree possible, and then perhaps spreading this idea to some of our other one-off servers so we don't have to buy individual certs for them.

@faidon noted on IRC that https://github.com/diafygi/acme-tiny might be a better client option, and it's already debianized for stretch+.

I'd like to see it puppetised for T97593#2115226

> I'd like to see it puppetised for T97593#2115226

Yeah, me too for a lot of things, but labs will probably be slightly trickier than carbon is. Probably the main issue will be doing the auth step, since there's no "webroot" to go stuff a file in and have it appear publicly, and the DNS channel would be a nightmare, too. We could maybe do --standalone and stop nginx/varnish when issuing/renewing certs, though.
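
A rough sketch of what that --standalone approach could look like (the hostname and service name are placeholders; the client binds the port itself to answer the challenge):

  # Hypothetical: free up port 80 so the client's built-in server can
  # answer the ACME challenge, then bring the real service back.
  service nginx stop
  ./letsencrypt-auto certonly --standalone -d instance.wmflabs.org
  service nginx start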

In any case, the situation here on carbon is ideal low-hanging fruit - it's about the simplest case anywhere in our infrastructure to puppetize LE for. Once we get over this hurdle, it should get progressively easier to spread the LE infection elsewhere.

> there's no "webroot" to go stuff a file in and have it appear publicly

Varnish should be able to handle that

> and the DNS channel would be a nightmare

I'm vaguely aware that Let's Encrypt allows some sort of auth via DNS entries? I don't think that could really work in labs without exposing the OpenStack APIs to the instances themselves and providing credentials (which seems like a bad idea).
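
(For reference, that DNS-based auth is the dns-01 challenge: the CA asks you to publish a TXT record roughly like the sketch below. The value here is invented; in reality it's derived from the account key plus a fresh token on each issuance.)

  ; Hypothetical dns-01 challenge record (value invented):
  _acme-challenge.mirrors.wikimedia.org. 300 IN TXT "invented-token-value"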

> We could maybe do --standalone and stop nginx/varnish when issuing/renewing certs, though.

Ewww :(

> In any case, the situation here on carbon is ideal low-hanging fruit - it's about the simplest case anywhere in our infrastructure to puppetize LE for.

great :)

>> there's no "webroot" to go stuff a file in and have it appear publicly
> Varnish should be able to handle that

But possibly only with the help of an extra http server or nginx?

I have a setup where I tell Varnish to redirect all /.well-known/acme-challenge traffic to one backend server (practically any server with a webserver should work), so I can use this (acme-tiny).

The official Let's Encrypt client has been a pain in the ass for me, and acme-tiny does exactly what I need (verification without stopping the webserver), in just one Python script of less than 200 lines.

If you can tell Varnish to pass LE traffic to a server with acme-tiny on it, then this shouldn't be too hard to do I guess?
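
A minimal VCL sketch of that routing (Varnish 3 syntax; the backend host is invented for illustration):

  # Hypothetical: send ACME HTTP-01 challenge requests to the one host
  # running acme-tiny, and pass so responses are never cached.
  backend acme_challenge {
      .host = "acme-host.eqiad.wmnet";
      .port = "80";
  }

  sub vcl_recv {
      if (req.url ~ "^/\.well-known/acme-challenge/") {
          set req.backend = acme_challenge;
          return (pass);
      }
  }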

akosiaris triaged this task as Medium priority. (Apr 20 2016, 11:17 AM)

Change 285196 had a related patch set uploaded (by BBlack):
apt|mirrors|ubuntu: puppetized LE certs

https://gerrit.wikimedia.org/r/285196

Change 285196 merged by BBlack:
apt|mirrors|ubuntu: puppetized LE certs

https://gerrit.wikimedia.org/r/285196

BBlack claimed this task.

@mark As (ubuntu|mirrors).wikimedia.org now supports HTTPS, could we update Wikimedia's Ubuntu mirror link to https://ubuntu.wikimedia.org/ubuntu/ on Ubuntu's website?

Let's also update the description while we're at it :-) "Wikimedia's Ubuntu Archive mirror in Tampa, Florida"

I'm not convinced https for that is a good idea. apt doesn't support it by default — apt-transport-https isn't installed out of the box even in Ubuntu AFAIK.

(In any case, ubuntu.wikimedia.org is deprecated — we should use http://mirrors.wikimedia.org/ubuntu/)

> I'm not convinced https for that is a good idea. apt doesn't support it by default — apt-transport-https isn't installed out of the box even in Ubuntu AFAIK.

Citation needed, because what I find is:

> apt-get (and other package manipulation commands, which are a front-end to the same APT libraries) can use HTTP, HTTPS and FTP (and mounted filesystems).

https://askubuntu.com/questions/146108/how-to-use-https-with-apt-get
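
(For context, pointing a client at the HTTPS mirror would just mean a sources.list line like the following; the release and components here are picked arbitrarily:)

  # Hypothetical sources.list entry; older apt needs the
  # apt-transport-https package installed for the https scheme to work.
  deb https://mirrors.wikimedia.org/ubuntu/ trusty main universe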

I just tried on a new install of Ubuntu 12.04.5 Desktop, and apt-transport-https is installed out of the box:

apt-transport-https:
  Installed: 0.8.16~exp12ubuntu10.17
  Candidate: 0.8.16~exp12ubuntu10.26

For what it's worth, when trying to switch our mirror on Launchpad to https:

> The URI scheme "https" is not allowed. Only URIs with the following schemes may be used: http

I don't have the bandwidth to raise this with Ubuntu, but if anyone (@Chmarkine?) does, feel free to. LP #1464064 looks like a good place to start.