Detail codfw snapshot/dataset requirements
Closed, Duplicate · Public

Description

This hardware-request will detail the requirements for the dataset and snapshot cluster in codfw. This is a migration of ticket https://rt.wikimedia.org/Ticket/Display.html?id=7612. That old ticket has outdated specification details (which I'll paste in below).

Once we determine what kind of hardware is needed for each system, we'll update this task summary with those details.

Event Timeline

RobH claimed this task.
RobH raised the priority of this task from to Medium.
RobH updated the task description.
RobH added a project: hardware-requests.
RobH added subscribers: RobH, ArielGlenn.
Restricted Application added a subscriber: Aklapper.

Below is a copy from my old email entry in the RT ticket:

We need to replace the snapshot and dataset infrastructure from Tampa for codfw. All the hardware was out of warranty and was thus either sold off or decommissioned.

I've listed the following servers so Ariel can comment on what each is used for and on any relevant upgrades that would help. I've listed the names of the ones in eqiad, but we'll just be using the same names in codfw with the 2000 range of numerals.
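As a rough illustration of that naming scheme (same base name, numerals moved into the 2000 range), here is a minimal sketch. The function name `to_2000_range` and the assumption that every existing name is a lowercase prefix followed by a four-digit number starting with 1 are mine, not something taken from the ticket.

```python
import re


def to_2000_range(hostname: str) -> str:
    """Map a 1000-range hostname to its 2000-range counterpart.

    Hypothetical helper: assumes names look like 'dataset1001',
    i.e. lowercase letters followed by '1' and three more digits.
    """
    match = re.fullmatch(r"([a-z]+)1(\d{3})", hostname)
    if match is None:
        raise ValueError(f"unexpected hostname format: {hostname!r}")
    prefix, suffix = match.groups()
    return f"{prefix}2{suffix}"


# Hosts listed in this ticket.
existing_hosts = [
    "dataset1001",
    "snapshot1001",
    "snapshot1002",
    "snapshot1003",
    "snapshot1004",
]
new_hosts = [to_2000_range(name) for name in existing_hosts]
```

Raising on an unexpected name, rather than passing it through unchanged, is a deliberate choice so that a host outside the assumed pattern is noticed instead of silently kept.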

dataset1001
single server
dual Intel Xeon(R) CPU X5650 @ 2.67GHz
164GB
12 2TB disks
md1200 array
12 2TB disks
Upgrade note: I'd have our system vendor update the disks, possibly to 3TB. Also upgrade the CPU and memory to the highest spec available at a good price point.

snapshot1001
R815
quad AMD Opteron 6134 @ 2.3GHz
64GB memory
Upgrade note: move to a newer platform, keep 4 CPUs, and increase memory per CPU if the price point allows.

snapshot1002
R410
dual Intel(R) Xeon(R) CPU E5645 @ 2.40GHz
12GB memory
Upgrade note: this should just change to our high-performance misc server spec: dual CPU with 8 cores each, 64GB RAM, and dual SATA disks for the OS.

snapshot1003
R410
same as snapshot1002

snapshot1004
R410
same as snapshot1002

Ariel: If this seems correct to you, please reply with such. I'll get the different specs quoted out.

Assigning to Ariel for overall input and detailing of the requirements for a snapshot/dataset cluster host in codfw.

Keep in mind we'll have to order new, and we'll update the specification above. (Since those specs are old, memory will be cheaper now, so we'll likely bump it upwards, etc.)

@ArielGlenn: Please detail how many snapshot and dataset hosts are needed for codfw to replicate eqiad. Also note whether snapshot/dataset hosts are planned to be duplicated in codfw. (If they are not, this task can be declined.)

Once I have basic details on what is needed, I'll link sub-tasks off this in procurement for quoting.

It's not going to be straight-up duplication. There are a few things at play here:

  1. I want to get rid of NFS when we deploy in codfw. If this seems too hard to do soon, or if it should wait for Dumps 2.0 (tm), which obviously won't happen overnight, then we can skip this for now but keep it in the back of our minds.
  2. We need a little more capacity in eqiad (see ticket: T118154), so I'd like to duplicate the setup once we have the added capacity.
  3. Oh, and I just remembered: we have a spare dataset (ms1001) in eqiad; obviously we do not need two datasets in codfw.
ArielGlenn set Security to None.
ArielGlenn moved this task from Backlog to Active on the Dumps-Generation board.

So @RobH, can you have a look at the eqiad hw ticket so we can hash that out first? Then we can use that as the basis for hw in codfw, with whatever adjustments are needed.