add ms-be1019 / 1020 / 1021 to swift
Closed, Resolved · Public

Description

new 4TB HP machines have been installed; the next step is to add them to the swift ring and expand capacity

Event Timeline

fgiunchedi claimed this task.
fgiunchedi raised the priority of this task to Normal.
fgiunchedi updated the task description. (Show Details)
fgiunchedi added a subscriber: fgiunchedi.
Restricted Application added a subscriber: Aklapper. · Nov 9 2015, 3:13 PM

see also the final allocation plan originally at https://phabricator.wikimedia.org/T114711#1705505

| row | allocation | zones | total |
| --- | --- | --- | --- |
| A | 3x2TB + 3x2TB | 2 | 12TB |
| B | 3x3TB | 1 | 9TB |
| C | 3x2TB + 3x2TB | 2 | 12TB |
| D | 3x3TB | 1 | 9TB |

In that case we would allocate 1x in each of B and D, plus 1x in e.g. A (all 4TB machines). Later, move 1x 2TB machine from A to C and consolidate the existing machines into a single zone per row

| row | allocation | zones | total |
| --- | --- | --- | --- |
| A | 3x2TB + 1x4TB + 2x2TB | 1 | 14TB |
| B | 3x3TB + 1x4TB | 1 | 13TB |
| C | 3x2TB + 3x2TB + 1x2TB | 1 | 14TB |
| D | 3x3TB + 1x4TB | 1 | 13TB |
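
The per-row totals in the proposed plan can be cross-checked with a few lines of Python. This is only a sketch: the helper name `allocation_total_tb` is made up for illustration, and the totals are in the plan's own shorthand units (machine count times the per-machine size label), not raw disk capacity.

```python
import re

def allocation_total_tb(allocation: str) -> int:
    """Sum every 'NxMTB' term in an allocation string (hypothetical helper)."""
    terms = re.findall(r"(\d+)x(\d+)TB", allocation)
    return sum(int(count) * int(size) for count, size in terms)

# Proposed allocation per row, copied from the plan above.
proposed = {
    "A": "3x2TB + 1x4TB + 2x2TB",
    "B": "3x3TB + 1x4TB",
    "C": "3x2TB + 3x2TB + 1x2TB",
    "D": "3x3TB + 1x4TB",
}

totals = {row: allocation_total_tb(alloc) for row, alloc in proposed.items()}
for row, total in totals.items():
    print(f"row {row}: {total}TB")
```

The result is rows A and C at 14TB and rows B and D at 13TB, which is the point of moving one 2TB machine from A to C: the rows end up within 1TB of each other.
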
fgiunchedi added a comment. · Edited · Nov 20 2015, 4:34 PM

ms-be1019 / 1020 / 1021 have been set to weight 3000; once the ring is fully rebalanced, the following needs to happen:

  • move 1003 / 1004 from zone 2 to zone 1
  • move 1012 from zone 2 to zone 3 (also move from row A to row C)
  • move 1009 / 1010 / 1011 from zone 4 to zone 3

this will leave zone 2 and zone 4 empty, yielding

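| row | zone | machines | specs | raw capacity | spindles |
| --- | --- | --- | --- | --- | --- |
| A | 1 | 100[12348] / 1019 | 5x2TB + 1x4TB | 120TB + 48TB | 72 |
| B | 6 | 1016 / 1017 / 1018 / 1020 | 3x3TB + 1x4TB | 108TB + 48TB | 48 |
| C | 3 | 100[5679] / 101[012] | 7x2TB | 168TB | 84 |
| D | 5 | 101[345] / 1021 | 3x3TB + 1x4TB | 108TB + 48TB | 48 |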
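
The moves and the resulting numbers can be replayed as a small Python sketch. Several things here are assumptions rather than facts stated in the task: the zone membership before the moves is reconstructed from the move list and the resulting table, the figure of 12 disks per machine is inferred from the spindle column (72 spindles across 6 machines), and the disk size per host range follows the specs column.

```python
DISKS_PER_MACHINE = 12  # assumption: inferred from 72 spindles / 6 machines

def disk_tb(host: int) -> int:
    """Per-disk size in TB by ms-be host number (assumed from the specs column)."""
    if host <= 1012:
        return 2
    if host <= 1018:
        return 3
    return 4

# Assumed zone membership before the moves (reconstructed, not from the task).
zones = {
    1: {1001, 1002, 1008, 1019},
    2: {1003, 1004, 1012},
    3: {1005, 1006, 1007},
    4: {1009, 1010, 1011},
    5: {1013, 1014, 1015, 1021},
    6: {1016, 1017, 1018, 1020},
}

# (hosts, source zone, destination zone), as listed in the comment.
moves = [
    ((1003, 1004), 2, 1),
    ((1012,), 2, 3),
    ((1009, 1010, 1011), 4, 3),
]

for hosts, src, dst in moves:
    for host in hosts:
        zones[src].remove(host)
        zones[dst].add(host)

summary = {}
for zone, hosts in sorted(zones.items()):
    summary[zone] = {
        "machines": len(hosts),
        "raw_tb": sum(disk_tb(h) * DISKS_PER_MACHINE for h in hosts),
        "spindles": len(hosts) * DISKS_PER_MACHINE,
    }
    print(zone, summary[zone])
```

Under these assumptions zones 2 and 4 end up empty, zones 1 and 3 both hold 168TB raw (on 72 and 84 spindles respectively), and zones 5 and 6 hold 156TB raw on 48 spindles each.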

note the different spindle count in zones with 2TB machines (ms-be1001 -> ms-be1012); those are also the oldest and go out of warranty in two/three weeks' time, so eventually we want to phase them out

I've been testing ms-be1019 with weight 4000 for the last week or so; the rebalance is almost complete, so things should be settling down

average wait, top 10

https://graphite.wikimedia.org/render/?width=578&height=379&_salt=1454928864.587&from=-30days&target=highestAverage(servers.ms-be1019.iostat.sd*.await%2C10)

util %, top 5

https://graphite.wikimedia.org/render/?width=586&height=308&_salt=1454515814.303&target=highestAverage(servers.ms-be1019.iostat.sd*.util_percentage%2C5)&from=-30days
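
Both graph links follow the same pattern, so equivalent URLs for another host or metric could be built along these lines. This is a sketch only: `iostat_graph_url` is a hypothetical helper, the default width and height are simply taken from the second link, and the `_salt` cache-busting parameter is left out.

```python
from urllib.parse import urlencode, urlparse, parse_qs

GRAPHITE_RENDER = "https://graphite.wikimedia.org/render/"

def iostat_graph_url(host, metric, top_n, days=30, width=586, height=308):
    """Build a render URL for the top-N disks of one host by average metric value."""
    target = f"highestAverage(servers.{host}.iostat.sd*.{metric},{top_n})"
    query = urlencode({
        "width": width,
        "height": height,
        "from": f"-{days}days",
        "target": target,
    })
    return f"{GRAPHITE_RENDER}?{query}"

url = iostat_graph_url("ms-be1019", "await", 10)
print(url)

# Decode the query string again to confirm the target survived URL encoding.
decoded_target = parse_qs(urlparse(url).query)["target"][0]
print(decoded_target)
```

`urlencode` percent-encodes the parentheses, comma and `*` in the target; Graphite decodes them on the server side, so the expression is unchanged.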

also note that the spikes visible even before the weight bump are swift auditing objects on disk

fgiunchedi closed this task as Resolved. · Mar 15 2016, 2:19 PM

machines are in service; for weight / rack-zone allocation / etc. see T130012: expand swift hardware in codfw/eqiad