New package builder host
Closed, ResolvedPublic

Description

Site/Location: currently eqiad, but it could just as well be in codfw

copper currently has 8 GB of RAM; together with its one GB of swap, this is no longer enough to build HHVM for stretch (it still worked for jessie, but the linker/g++ likely require more memory with GCC 6.3). We could work around this with more swap space, but that's not really an ideal solution...
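For the record, the swap workaround mentioned above would just be a temporary swap file; a generic sketch (path and size are arbitrary examples, not a recommendation for copper specifically):

```shell
# Sketch of the swap workaround (path and size are arbitrary examples):
fallocate -l 8G /srv/buildswap   # reserve the space
chmod 600 /srv/buildswap         # swap files must not be world-readable
mkswap /srv/buildswap            # format it as swap
swapon /srv/buildswap            # enable it (needs root)
swapon --show                    # confirm it is active
```

Swapping through the build's link phase would be very slow, though, which is why more RAM is the preferred fix.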

If we have spare RAM modules for that type of server, extending the existing RAM is good enough; the system is otherwise okay. Ideally 32 GB of RAM to have some headroom, but 16 GB is fine as well.

If we don't have spare RAM, we could use a spare system (this seems fine for a spare, since the service is ops-only and not critical) or procure a new one.

There are no particular CPU requirements; even the current 8 cores on copper are good enough, since any decent build system allows good parallelisation. Like copper, though, the system should continue to use SSDs (or we simply move the copper disks to the replacement host).
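(For reference, parallelisation on a Debian build host is usually just a matter of passing the core count through; a package-agnostic sketch:)

```shell
# Use all available cores for a Debian package build; nproc reports
# the core count (8 on copper).
export DEB_BUILD_OPTIONS="parallel=$(nproc)"
dpkg-buildpackage -us -uc -j"$(nproc)"
```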

Event Timeline

This seems like it doesn't need much disk space. Looking at the smallest spare eqiad systems I have that meet the other requirements (32 GB RAM), we have a few options.

We have an older spare, WMF4727, purchased on 2015-12-05. Its warranty expires on 2018-12-05, so it would be nice to actually put it into use! It is a Dell PowerEdge R430, dual Intel Xeon E5-2623 v3 (3 GHz/4-core), 32 GB RAM and (4) 4 TB SATA HDDs.

If that one doesn't work, we have a newer spare, wmf4749, hw warranty expiry 2019-03-24: Dell PowerEdge R430, dual Intel Xeon E5-2640 v3 (2.6 GHz/8-core), 64 GB RAM and (2) 1 TB SATA.

In codfw, we have WMF6469, warranty expiry 2019-11-03: Dell PowerEdge R430, dual Intel Xeon E5-2623 (2.6 GHz/4-core), 32 GB RAM, (4) 4 TB SATA.

And wmf6407, warranty expiry 2019-03-24: Dell PowerEdge R430, dual Intel Xeon E5-2640 (2.6 GHz/4-core), 64 GB RAM, (2) 1 TB SATA.

Ideally, we try to use our OLDER spares first.

@MoritzMuehlenhoff: Do you have a preference for any single host listed above? Once you select one, we'll still need to get either @mark or @faidon to approve the allocation.

WMF4727 sounds like a pretty good fit, if we can swap copper's SSDs in there (it currently has SATA drives)?

Copper is a very old R310, which has cabled HDDs with LFF bays. The SFF SSDs fit in, since it is a non-hot-swap chassis. If we want to move the old SSDs from copper into the new host, it will need to be a host with SFF drive bays.
The other eqiad spare listed has SFF bays, so adding the SSDs becomes feasible again: spare wmf4749, hw warranty expiry 2019-03-24, Dell PowerEdge R430, dual Intel Xeon E5-2640 v3 (2.6 GHz/8-core), 64 GB RAM and (2) 1 TB SATA.

So I'd recommend wmf4749 if we want to move the SSDs (which could be near end of life if copper was heavily used) into the new system. (If they fail, we can always fall back to the new system's SATA disks.)

Or maybe let's go ahead with the SATA disks currently in WMF4727 (which is still a much faster system than copper)? Package building isn't the most I/O-bound task we're running, and if it turns out to be much slower we can still buy SSDs for WMF4727 later on. Adding @akosiaris and @fgiunchedi for comments, since they participated in the discussion in T130759.

IIRC we opened T130759 because slow I/O had indeed caused some minor suffering on our part. If we can easily avoid migrating back to SATA disks, I think we should. There's one more option on the table, btw: a Ganeti VM. We currently have the CPUs, the space, the IOPS and the memory in eqiad to support this, and the spiky nature of package building fits rather well with the virtualisation idea.

My (mild) concern against a Ganeti VM is that some packages might build differently if they detect virtualisation (via systemd-detect-virt or whatever). Not sure if that's an issue in practice, though.
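(For reference, this is the kind of check such packages key off; a sketch:)

```shell
# systemd-detect-virt prints the detected hypervisor (e.g. "kvm" on a
# Ganeti/KVM guest) and exits 0; on bare metal it prints "none" and
# exits non-zero, which is what build-time detection usually tests.
if systemd-detect-virt --quiet; then
    echo "building inside a VM: $(systemd-detect-virt)"
else
    echo "building on bare metal"
fi
```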

Agreed on not moving away from SSDs! It'd make a single build slow and concurrent builds very slow again.
Re: virtualization, I don't feel strongly either way; worth a try though, as having a Ganeti VM would be quick and easy.

Let's try a Ganeti VM, then. Any objections? If that turns out to be non-ideal, we can still revisit WMF4727 (and buy SSDs on top).

akosiaris claimed this task.

Agreed. See T176607

I've resolved this one for now; we can always reopen.