
Two test hosts for SREs
Closed, ResolvedPublic

Description

Until a while ago, multatuli.wikimedia.org was an SRE test host for a number of tests/tasks that can't be done well in Cloud VPS, e.g. testing custom kernels or testing lower-level changes before they get applied to production. multatuli got repurposed as an auth DNS server, and generally speaking it would be good to have not just one server for these kinds of tests, but two (one running Debian stable and the other running Debian oldstable).

We don't need to buy new machines for these; spare servers should be perfectly fine. There are also no hardware requirements; the smallest setup available is fine. They can be in any DC; it's also fine to have, e.g., one server in eqiad and the other in codfw.

(These hosts are not sensible candidates for Ganeti instances; e.g. the microcode tests for Spectre/Meltdown would not work on a KVM instance.)

Event Timeline

CDanis triaged this task as Medium priority. Jan 17 2019, 4:50 PM
RobH moved this task from Backlog to Pending Approval on the hardware-requests board.
RobH added subscribers: faidon, RobH.

So we are down to just one single cpu spare misc host. I'm creating a task to order more spare servers, but for now I can only allocate 1 system for this.

wmf7622 in eqiad: Single Intel Xeon Silver 4110 (2.10GHz/8C), 32GB, (2) Intel S4600 240GB

@faidon: Can you approve this allocation of our last single cpu spare pool system in eqiad? I'll also create a procurement task for buying more single cpu misc systems (we're already doing this for dual cpu misc, so this is easy to do as well.)

Once this is approved, assign back to me and I'll get it allocated and spun up, then stall this task until a second single cpu misc system arrives for approval for the second of the two systems.

Please note T216269 tracks the order of new single cpu spare pool systems. Once we have those ordered, a second system can be allocated via this task.

RobH added a subtask: Unknown Object (Task). Feb 15 2019, 6:43 PM
RobH changed the status of subtask Unknown Object (Task) from Stalled to Open. Apr 2 2019, 3:48 PM
Cmjohnson closed subtask Unknown Object (Task) as Resolved. Apr 24 2019, 3:15 PM

I don't know what the status of this is; it seems it's been a while. I see it was pending my approval, which I missed. Apologies! Approved now.

Ok, wmf5175 was ordered and can be allocated as the dual cpu spare pool system currently available in eqiad.

Current proposal:

Allocate these two single cpu misc hosts:

wmf7622
wmf5175

I expect this approval to be trouble-free, but please let me know if more info is needed! (Once approved, I'll make a setup task for each system.)

I had missed the followup, sorry. These two spare hosts would be fine as test hosts!

This is still pending mgmt approval for allocation of these two spares: wmf7622 and wmf5175.

FYI, I'll also be piggybacking some k8s tests on these hosts, as my local env doesn't have enough memory anymore.

OK, it sounds like @akosiaris and @MoritzMuehlenhoff have coordinated with each other and they can share those two hosts as SRE test hosts.

This allocation is approved. @RobH, please proceed with the rest of the steps for this. Thanks!

These will be set up via T245754. Resolving this allocation task.