
eqiad: (2) servers request for ORES
Closed, ResolvedPublic

Description

Labs Project Tested: n/a
Site/Location: EQIAD
Number of systems: 2
Service: ores
Networking Requirements: internal
Processor Requirements: https://tools.wmflabs.org/nagf/?project=ores#h_ores-redis-02_cpu suggests that we are probably OK with an X3450.
Memory: 8G
Disks: > 20GB
NIC(s): 1x1Gbps
Partitioning Scheme: unimportant
Other Requirements:

Event Timeline

akosiaris raised the priority of this task to Medium.
akosiaris updated the task description. (Show Details)
akosiaris added projects: hardware-requests, SRE.
akosiaris subscribed.

Looking into the server spares I see:

WMF3149 and WMF3300: 2 Dell PowerEdge R310, Single Intel Xeon X3450, 8GB Memory, (2) 500GB 3.5" SATA. They look like they fit the bill pretty well. Unless someone objects, I will acquire them for this task and start setting them up.

RobH added a subscriber: mark.

@akosiaris: So using the old out of warranty systems still needs @mark to approve. Right now we are kind of in the process of killing off older systems that are out of warranty to make room for new ones, so we should clear it with him.

If @mark approves, that is fine. We'll need to do a few things if approved:

  • create an onsite task in ops-eqiad to add the new hostname labels to each system.
  • remove them from the spares page on wikitech, noting this task # in the edit summary.
  • all the other stuff for installing, but everyone tends to forget the first one since it cannot be done by anyone but the on-site.

@mark: Please review and approve/deny the allocation of two out of warranty spares WMF3149 and WMF3300. These expired back in 2014, but otherwise should function normally (we also have 7 of these spare so if one doesn't function another can.)

If approved, you can assign back to me for the above steps or Alex can steal if he gets to it first (and does the above tasks.)

Thanks!

RobH renamed this task from Site: 2 hardware access request for ORES to eqiad: (2) spare servers request for ORES. Dec 2 2015, 10:34 PM
RobH set Security to None.
RobH moved this task from Backlog to Pending Approval on the hardware-requests board.

I've talked about this with @mark. He's against using those server spares, and with good reason.

It was suggested to wait for a batch of new misc servers to arrive in eqiad and use a couple of those instead. These are tracked in a procurement ticket already so it will hopefully not add a lot of extra delay.

akosiaris renamed this task from eqiad: (2) spare servers request for ORES to eqiad: (2) servers request for ORES. Dec 3 2015, 12:22 PM
akosiaris added a subtask: Unknown Object (Task).
RobH edited subtasks, added: Unknown Object (Task); removed: Unknown Object (Task). Dec 3 2015, 5:46 PM
Cmjohnson mentioned this in Unknown Object (Task). Dec 15 2015, 4:56 PM

Ok, the cleanup of spares (via another task, they were migrated into a sheet for tracking and re-audited) has resulted in my finding a few potential systems for this. @akosiaris already stated that @mark has approved the allocation of newly acquired spares, as we cannot use out-of-warranty spares.

I've found some in-warranty spares that are a LOT closer to the specification than the brand new dual-CPU, 4-disk systems. The new systems have far more memory, CPU, and disk than requested on this task. I'm not sure if @akosiaris stating that @mark approved something counts as approval (so far the only person I know allowed to do that is @faidon ;). I could be mistaken; in either case I'll list off the potential spares, Alex can pick the one that best fits the need (rather than overprovision), and then @mark can approve. (This is not a question of trust, as I trust Alex as much as I trust my other team members, which is quite a lot!)

Once @mark approves, a few things need to happen:

  • This task needs to note which system was selected
  • The row for the allocated spare system needs to be removed from the EQIAD Spare Servers tab of the spares tracking sheet.
  • This task needs to have a sub-task created in SRE (not hardware-requests) for the setup of the system.
    • We went with ores100X, but Alex should approve and may want to use something different.
    • That task needs to have a sub-task allocated to @Cmjohnson so he can apply the new hostname labels to the allocated systems.

I've listed all of that so if @akosiaris doesn't want to wait for me to triage this on Thursday, he can do so immediately upon getting @mark's approval.

The options for this are as follows:

Existing spares:
WMF4577 & WMF4578: Identical systems, Dell PowerEdge R420, Single Intel® Xeon® Processor E5-2450 (your spec requests a CPU of 4 cores @ 2.66, this is 8 cores @ 2.1GHz), 16GB RAM Dual 500GB SATA. The warranty on these systems is through 2017-04-30, so over a year.

Brand new dual cpu 4*4tb SATA systems:

WMF4721 & WMF4723: These are part of the order of 8 brand new misc systems. They are Dell PowerEdge R430, Dual Intel® Xeon® Processor E5-2623 v3 (so that is 4 cores @ 3.0GHz, dual CPU), 32GB RAM, 4*4TB SATA. While we could use these, they seem largely overprovisioned.
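The comparison between the two options can be sketched in a few lines of Python. This is only an illustration, not a tool used on this task: the `Spec` class and the `meets` and `overprovision` helpers are hypothetical names, the figures are the ones quoted in this task, 4TB is taken as 4000GB, and clock speed is deliberately left out because GHz is not comparable across CPU generations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """Hardware figures quoted in this task; the class itself is illustrative."""
    cores: int
    memory_gb: int
    disk_gb: int


def meets(requested: Spec, candidate: Spec) -> bool:
    """True if the candidate covers the request (disk was requested as '> 20GB')."""
    return (
        candidate.cores >= requested.cores
        and candidate.memory_gb >= requested.memory_gb
        and candidate.disk_gb > requested.disk_gb
    )


def overprovision(requested: Spec, candidate: Spec) -> dict[str, float]:
    """How many times larger the candidate is than the request, per resource."""
    return {
        "cores": candidate.cores / requested.cores,
        "memory_gb": candidate.memory_gb / requested.memory_gb,
        "disk_gb": candidate.disk_gb / requested.disk_gb,
    }


# Request: X3450-class CPU (4 cores), 8G memory, more than 20GB of disk.
REQUESTED = Spec(cores=4, memory_gb=8, disk_gb=20)
# WMF4577 / WMF4578: R420, single 8-core CPU, 16GB RAM, dual 500GB SATA.
R420_SPARE = Spec(cores=8, memory_gb=16, disk_gb=2 * 500)
# WMF4721 / WMF4723: R430, dual 4-core CPUs, 32GB RAM, 4*4TB SATA (assumed 4000GB each).
R430_NEW = Spec(cores=2 * 4, memory_gb=32, disk_gb=4 * 4000)

if __name__ == "__main__":
    for name, box in (("R420 spare", R420_SPARE), ("R430 new", R430_NEW)):
        print(name, meets(REQUESTED, box), overprovision(REQUESTED, box))
```

Both options cover the request under these assumptions; the ratios only put numbers on how much further the new R430 systems overshoot it on memory and disk.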

However, I don't want to make the decision for you, and I had incorrectly advised we had no in warranty spares for this previously. Please review the two options above and ensure @mark approves on this task. Then you can assign to either @Cmjohnson or myself for implementation of the above, or move ahead on your own. (I've only listed all the options/steps so I don't block you on this; I have no problem if you and mark comment and assign back to me for implementation.)

Ok, the cleanup of spares (via another task, they were migrated into a sheet for tracking and re-audited) has resulted in my finding a few potential systems for this. @akosiaris already stated that @mark has approved the allocation of newly acquired spares, as we cannot use out-of-warranty spares.

Actually no, I haven't said explicitly that @mark has approved the allocation of newly acquired spares. I've clearly said it was suggested, which is a different thing.

I've found some in-warranty spares that are a LOT closer to the specification than the brand new dual-CPU, 4-disk systems. The new systems have far more memory, CPU, and disk than requested on this task. I'm not sure if @akosiaris stating that @mark approved something counts as approval (so far the only person I know allowed to do that is @faidon ;). I could be mistaken; in either case I'll list off the potential spares, Alex can pick the one that best fits the need (rather than overprovision), and then @mark can approve. (This is not a question of trust, as I trust Alex as much as I trust my other team members, which is quite a lot!)

As I already said above, I did not state that @mark approved anything. Nor will I, ever.

And obviously whatever decision we take on this subject will need to be approved by @mark.

[snip]

The options for this are as follows:

Existing spares:
WMF4577 & WMF4578: Identical systems, Dell PowerEdge R420, Single Intel® Xeon® Processor E5-2450 (your spec requests a CPU of 4 cores @ 2.66, this is 8 cores @ 2.1GHz), 16GB RAM Dual 500GB SATA. The warranty on these systems is through 2017-04-30, so over a year.

These look great. I think we can move on with those.

However, I don't want to make the decision for you, and I had incorrectly advised we had no in warranty spares for this previously. Please review the two options above and ensure @mark approves on this task. Then you can assign to either @Cmjohnson or myself for implementation of the above, or move ahead on your own. (I've only listed all the options/steps so I don't block you on this; I have no problem if you and mark comment and assign back to me for implementation.)

OK, no worries. @mark can you please approve moving on with WMF457{7,8} ? Thanks!

  • We went with ores100X, but Alex should approve and may want to use something different.

These aren't machines for running ORES itself but for the ORES Redis, so should be oresrdb1xxx?

Just to summarize, this task is now assigned to @mark and awaits his approval for allocation of:

WMF4577 & WMF4578: Identical systems, Dell PowerEdge R420, Single Intel® Xeon® Processor E5-2450 (your spec requests a CPU of 4 cores @ 2.66, this is 8 cores @ 2.1GHz), 16GB RAM Dual 500GB SATA. The warranty on these systems is through 2017-04-30, so over a year.

Once approved, I'll get these up and working for Alex. (So please assign this task back to me post-approval.)

@yuvipanda: Your reasoning seems sound to me, we'll call these oresdb1xxx.

I would suggest oresredis1xxx since they'll be used both as queue and cache machines, no 'db' type things there (we can flush them whenever)


Sounds (mostly) good to me, I was only suggesting oresdb since it was your previous suggestion. We call other redis boxes rdb, so perhaps oresrdb1xxx?

Edit update: I misread Yuvi's initial suggestion of oresrdb. I read oresdb...

So oresrdb1xxx

T125562 now exists for the setup of these systems. This hardware-requests task is completed.

RobH closed subtask Unknown Object (Task) as Resolved. Jul 11 2016, 5:29 PM