
EQIAD: stat1003 replacement
Closed, Resolved, Public

Description

Site/Location: EQIAD
Number of systems: 1
Service: stat*
Networking Requirements: internal IP, in Analytics VLAN
Processor Requirements: 16+ cores
Memory: 64G
Disks: 4 x 3+TB software RAID 5, total 4T+ capacity.
Partitioning Scheme: 30G /, 100G /tmp, rest of space for /srv (I plan on putting /home in the /srv partition, possibly via a symlink)

This ticket tracks the replacement of stat1003, which is out of warranty (OOW).

Event Timeline

Ottomata triaged this task as Medium priority. Mar 7 2017, 3:40 PM
Ottomata created this task.
Ottomata updated the task description. (Show Details)
Ottomata added subscribers: elukey, RobH.
Nuria raised the priority of this task from Medium to High. Mar 13 2017, 3:51 PM
Nuria moved this task from Incoming to Wikistats on the Analytics board.

Raid5 is being used in production on these boxes? That seems non-ideal.

I'll start pulling together quotes for this shortly. This system has the same requirements as T159838, except that it lacks the GPU requirement, so it can be met with a 1U chassis for much less than the GPU-enabled system on T159838.

We want a RAID that gives us the most space with a little bit of redundancy. Is RAID 5 not the best choice?

Raid5 has very slow writes (same as raid6), due to the parity calculations for the redundancy striped across the disks. The fastest raid for writes (not counting raid0) is raid10, which is really raid0+raid1.

I'd advise a full review of whether raid5 is being used and whether it is fast enough for the use case. For the most part, we have completely eliminated raid5 in production, so I wanted to point this out.

The only raid level I feel comfortable recommending for production use is raid10.
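As a rough illustration of the write-speed point above, the commonly cited "write penalty" rule of thumb counts how many disk I/Os one small random write costs at each raid level. The sketch below is only a back-of-the-envelope model: the penalty factors are the usual textbook values, and the function name and the 100 IOPS per disk figure are assumptions made for the example, not measurements from these hosts.

```python
# Textbook write-penalty factors: disk I/Os needed per small random write.
# raid5 and raid6 pay extra for reading and rewriting parity.
WRITE_PENALTY = {"raid0": 1, "raid10": 2, "raid5": 4, "raid6": 6}


def random_write_iops(level, disks, per_disk_iops):
    """Estimate array-wide random write IOPS under the write-penalty model.

    `per_disk_iops` is a hypothetical per-disk figure supplied by the caller.
    """
    if level not in WRITE_PENALTY:
        raise ValueError(f"unsupported raid level: {level}")
    return disks * per_disk_iops / WRITE_PENALTY[level]


if __name__ == "__main__":
    # Four disks, assuming (for illustration only) 100 random IOPS per disk.
    for level in ("raid10", "raid5", "raid6"):
        print(level, round(random_write_iops(level, 4, 100), 1))
```

Under this simplified model raid6 comes out somewhat slower than raid5, and both trail raid10. Sequential writes and controller caching behave differently, so the model says little about a workload that is mostly bulk local storage.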

There aren’t any ‘services’ hosted on these nodes. The drives are only
used for local data storage, so we aren’t really concerned with write io
performance. We’re more interested in having enough space. If we can get
the space we need with RAID 10, then either is fine with me :)

This comment was removed by Halfak.

@Ottomata: We've discussed the raid level, but I realize now that I never got the overall capacity requirement: not the disk layout, but the minimum usable storage space needed after raid.

Assigning back to you for feedback; please provide it and assign back to me for quotation.

stat1003 is using about 3.2T of space right now, and I don't expect it to grow much. If we can get something with at least 4T of storage capacity (6T would be better), we'd be good.

Ok, we can do 4 * 4TB to hit 8TB in raid10. That comes out to more like 7.4TB usable.

I'll also ask for quotes for 4 * 6TB to see what the price difference is.
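The capacity figures discussed in this thread can be checked with a small calculation. This is a minimal sketch using the standard capacity rules for each raid level; the function names are made up for the example. Converting vendor terabytes (10^12 bytes) to binary tebibytes (2^40 bytes) accounts for most of the gap between the nominal 8TB and the usable figure quoted above. The exact number a host reports also depends on partitioning and filesystem overhead, which this sketch does not model.

```python
def usable_capacity_tb(level, disks, disk_tb):
    """Return nominal usable capacity in TB for equally sized disks."""
    if level == "raid0":
        return disks * disk_tb
    if level == "raid5":
        if disks < 3:
            raise ValueError("raid5 needs at least 3 disks")
        return (disks - 1) * disk_tb  # one disk's worth of parity
    if level == "raid6":
        if disks < 4:
            raise ValueError("raid6 needs at least 4 disks")
        return (disks - 2) * disk_tb  # two disks' worth of parity
    if level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("raid10 needs an even number of disks, at least 4")
        return (disks // 2) * disk_tb  # every disk is mirrored
    raise ValueError(f"unsupported raid level: {level}")


def tb_to_tib(tb):
    """Convert decimal terabytes (10**12 bytes) to tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40


if __name__ == "__main__":
    layouts = [("raid5", 4, 3), ("raid10", 4, 4), ("raid10", 4, 6)]
    for level, disks, size in layouts:
        tb = usable_capacity_tb(level, disks, size)
        print(f"{disks} x {size}TB {level}: {tb} TB nominal, {tb_to_tib(tb):.2f} TiB")
```

All three layouts clear the 4T minimum stated earlier; the 4 x 4TB raid10 option also clears the preferred 6T.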

RobH created subtask Unknown Object (Task). Mar 24 2017, 5:08 PM
Cmjohnson closed subtask Unknown Object (Task) as Resolved. May 9 2017, 2:57 PM

This has been ordered and is being received on the linked procurement task, and setup is tracked on task T165366. As such, this hardware request is resolved.