
Setting up a mirror serv{er,ice}
Closed, Resolved (Public)

Description

Right now we have the server "carbon" in eqiad to serve as what we call an install server: a DHCP server, a TFTP server, a Squid (for webproxy), our own apt repository (reprepro) and an Ubuntu mirror.

tftpboot consumes 144M, reprepro consumes 6G, squid's spool 1.9G and the Ubuntu mirror 761G. Bandwidth-wise, we were pushing 200-250Mbps, but we're down to less than 10Mbps since the switch from lighttpd to nginx, presumably due to If-Modified-Since support. I/O-wise, the workload is very read-intensive and has a lot of hot files, so the page cache keeps IOPS really low.
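To illustrate why If-Modified-Since support could cut bandwidth this much, here is a minimal Python sketch of the decision a server makes on a conditional GET. It is not nginx's implementation; the function name `conditional_get_status` and its parameters are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get_status(if_modified_since, file_mtime):
    """Return 304 if the client's cached copy is still current, else 200.

    if_modified_since: raw header value sent by the client, or None.
    file_mtime: timezone-aware datetime of the file's last modification.
    (Hypothetical helper, not part of any real server.)
    """
    if if_modified_since is None:
        return 200  # no cached copy advertised: send the full file
    try:
        cached_at = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: ignore it and send the full file
    if cached_at.tzinfo is None:
        return 200  # no timezone in the header: treat it as unusable
    # HTTP dates have one-second resolution, so drop sub-second precision.
    if file_mtime.replace(microsecond=0) <= cached_at:
        return 304  # headers only, no body transferred
    return 200
```

An apt client that already holds an unchanged index file would get the 304 branch, so only headers cross the wire instead of the whole file.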
If we add Debian into the mix (we don't strictly need to), this will be another ~400G in size for i386/amd64 (https://www.debian.org/mirror/size). carbon has the capacity for this -- it currently has 1.1T free.
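The capacity figures above can be checked with a few lines of Python. This is a rough sketch using the numbers quoted in this description; decimal units are assumed (144M taken as 0.144G, 1.1T as 1100G), and the variable and function names are illustrative only.

```python
# Sizes from the task description, in GB (decimal units assumed).
usage_gb = {
    "tftpboot": 0.144,
    "reprepro": 6.0,
    "squid spool": 1.9,
    "ubuntu mirror": 761.0,
}
debian_estimate_gb = 400.0  # approximate i386/amd64 Debian mirror size
free_gb = 1100.0            # "1.1T free" on carbon


def headroom_after(extra_gb, free=free_gb):
    """Free space left after adding a mirror of the given size."""
    return free - extra_gb


current_total_gb = sum(usage_gb.values())
mirror_total_gb = usage_gb["ubuntu mirror"] + debian_estimate_gb

print(f"in use today:        {current_total_gb:.1f}G")
print(f"Ubuntu + Debian:     {mirror_total_gb:.1f}G")
print(f"free after Debian:   {headroom_after(debian_estimate_gb):.1f}G")
```

Under these assumptions, about 769G is in use today, the two mirrors together would take roughly 1.16T, and about 700G would remain free after adding Debian.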

However, I think it's a bit of a pity for us to spend 1-1.5T in disk space and have them sitting idle. Therefore, I propose that we a) split the mirror server from the install server(s) and b) publicly advertise the install server as an official Debian/Ubuntu mirror. (a) is not strictly a prerequisite for (b), but it's my view that it would be better to split those two roles, as i) we frequently tinker with carbon (disable puppet and do manual hacks while troubleshooting), and ii) a mirror would require more software to run, like FTP and rsync and possibly SSH (for push mirroring), and we shouldn't expose our apt repository/install server to a larger attack surface.

Since carbon has the necessary space, this would actually entail procuring a new *install* server. The requirements for this should be really tiny; a small misc server would probably do it. Note that we already have install2001 in codfw, and it seems overprovisioned for an install server and underprovisioned for a mirror (it doesn't even fit Ubuntu).

Moreover, this would also entail "paying" for the additional resources an official mirror would need compared to an unofficial one like we have now. For Debian, I asked around and even big country mirrors have < 100Mbps in traffic, which I think wouldn't be a problem for us. Ubuntu may be more popular but primary mirrors are Canonical's, so we'd only be a secondary mirror and one out of many, so I don't expect much usage there either.

Details

Reference
rt9108

Event Timeline

rtimport raised the priority of this task from to Medium. Dec 18 2014, 2:20 AM
rtimport added a project: ops-core.
rtimport set Reference to rt9108.
faidon set Security to None.

I think it is a good idea to split distribution mirroring from the install server. Keeping carbon as a public mirror for Debian/Ubuntu would also be a good service, so I'm in favor.

Our install server is already a public Ubuntu mirror, or at least it was before it got merged into the current server. ubuntu.wikimedia.org was listed in Launchpad in the mirror list.

faidon lowered the priority of this task from Medium to Low. Dec 18 2014, 4:59 PM
faidon updated the task description.
faidon changed the visibility from "WMF-NDA (Project)" to "Public (No Login Required)".
faidon changed the edit policy from "WMF-NDA (Project)" to "All Users".

I +1 both ideas, namely splitting the install-server from the mirror and setting up a debian mirror.

I moved the splitting stuff to a different task, T132757, and set it as a blocker of this task. As far as mirrors go:

  • As @mark mentioned, we are already an official Ubuntu mirror. That's good to know and great to see that we're aligned :) We should probably figure out (and document :) the process for updating that hostname to mirrors.wikimedia.org/ubuntu and sunset ubuntu.wikimedia.org soon. I'm not sure if it's possible to become a push mirror (I guess so); we should explore that too.
  • We already mirror the Debian archive for i386/amd64 and have done so for quite some time. We are not an official mirror though, and we are not a push mirror either (this needs rsync etc., so it's better if we wait until we ditch everything sensitive out of carbon). We should also explore whether mirroring all architectures is a requirement (and how much space that would take).
RobH mentioned this in Unknown Object (Task). Jun 6 2016, 6:24 PM

The traditional "installserver" role that did everything is gone as of today. I split it into "dhcp", "http", "preseed", "proxy" and "tftp", all in modules/role/manifests/installserver/. They should all be freely movable between nodes because each one includes 'standard' and 'base::firewall' individually. The changes are linked in the subtask.

node 'carbon.wikimedia.org' {
    role(installserver::tftp,
        installserver::dhcp,
        installserver::http,
        installserver::proxy,
        installserver::preseed,
        aptrepo::wikimedia)
}

Meanwhile, we have sodium.wikimedia.org, which is mirrors.wikimedia.org and has all the files, while carbon no longer fills that role (and has 14T instead of 21T or something in data left in mirrors).

Did that resolve this ticket? I am now wondering.

@faidon Is this ticket resolved since we have sodium assigned as mirrors.wm.org or should carbon still become a mirror server now that it's not an install server anymore?

faidon claimed this task.

Nope, done for a while now :)