Put ms-be2057 (Dell R740xd2) in service
Open, High · Public

Description

We got ms-be2057 installed in codfw as a try-and-buy host in T252216, sporting 24x 8TB disks. This task tracks putting the host in service in swift.

  • Run benchmarks / stress tests
  • Instruct puppet and swift about all of the host's disks
  • Gradually put weight onto the disks in swift (see the ring sketch below)
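
For the ring side of the last two steps: the rings themselves are managed via the operations/software/swift-ring repo, but the underlying swift-ring-builder operations amount to roughly the following. A minimal sketch, not our exact tooling; the IP, port, device id, and weights are placeholders:

# add one device per disk at a conservative initial weight
swift-ring-builder object.builder add r1z1-10.192.0.1:6000/sdc1 100
# ...repeated for sdd1 through sdz1...

# then ratchet weights up in steps, rebalancing in between
swift-ring-builder object.builder set_weight d0 500
swift-ring-builder object.builder rebalance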

Event Timeline

Restricted Application added a subscriber: Aklapper. · Mon, Aug 31, 12:24 PM
fgiunchedi triaged this task as High priority. · Mon, Aug 31, 12:55 PM

In terms of disk benchmarks, I've run an initial ~1h stress test with fio, covering a mix of random reads/writes plus sequential reads and writes. The idea is to gauge performance under normal load (i.e. 200 reads/s + 10 writes/s on average, with both sites active). See the fio configuration and initial results below; so far so good, I think!

[global]
# one mounted partition per data disk, 24 disks total
directory=/srv/swift-storage/sdc1:/srv/swift-storage/sdd1:/srv/swift-storage/sde1:/srv/swift-storage/sdf1:/srv/swift-storage/sdg1:/srv/swift-storage/sdh1:/srv/swift-storage/sdi1:/srv/swift-storage/sdj1:/srv/swift-storage/sdk1:/srv/swift-storage/sdl1:/srv/swift-storage/sdm1:/srv/swift-storage/sdn1:/srv/swift-storage/sdo1:/srv/swift-storage/sdp1:/srv/swift-storage/sdq1:/srv/swift-storage/sdr1:/srv/swift-storage/sds1:/srv/swift-storage/sdt1:/srv/swift-storage/sdu1:/srv/swift-storage/sdv1:/srv/swift-storage/sdw1:/srv/swift-storage/sdx1:/srv/swift-storage/sdy1:/srv/swift-storage/sdz1
numjobs=24

runtime=20m
time_based

# bypass the page cache (O_DIRECT) and sync each write (O_SYNC)
direct=1
buffered=0
invalidate=1
sync=1
gtod_reduce=1
ioengine=libaio

group_reporting

[mixed]
wait_for_previous
readwrite=rw
rwmixread=90
rwmixwrite=10
blocksize_range=2k-24k
# production-like rate is avg 200 reads/s, 10 writes/s; uncomment to cap:
#rate_iops=200,10
nrfiles=1024
size=1g

[write_seq]
wait_for_previous
fill_fs=1
readwrite=write
blocksize_range=2k-24k

[read_seq]
wait_for_previous
readwrite=read
blocksize_range=2k-24k
nrfiles=1024
size=1g

mixed: (groupid=0, jobs=24): err= 0: pid=75527: Mon Aug 31 13:53:37 2020
  read : io=51813MB, bw=44195KB/s, iops=4015, runt=1200513msec
  write: io=5828.3MB, bw=4971.4KB/s, iops=445, runt=1200513msec
  cpu          : usr=0.21%, sys=0.81%, ctx=5367423, majf=0, minf=4216
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=4820778/w=534694/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1
write_seq: (groupid=1, jobs=24): err= 0: pid=79155: Mon Aug 31 13:53:37 2020
  write: io=481513MB, bw=410884KB/s, iops=36790, runt=1200021msec
  cpu          : usr=0.70%, sys=6.34%, ctx=44388217, majf=0, minf=235
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=44149136/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1
read_seq: (groupid=2, jobs=24): err= 0: pid=80860: Mon Aug 31 13:53:37 2020
  read : io=65431MB, bw=55820KB/s, iops=5063, runt=1200304msec
  cpu          : usr=0.23%, sys=0.85%, ctx=6088121, majf=0, minf=4433
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=6077328/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=51813MB, aggrb=44194KB/s, minb=44194KB/s, maxb=44194KB/s, mint=1200513msec, maxt=1200513msec
  WRITE: io=5828.3MB, aggrb=4971KB/s, minb=4971KB/s, maxb=4971KB/s, mint=1200513msec, maxt=1200513msec

Run status group 1 (all jobs):
  WRITE: io=481513MB, aggrb=410883KB/s, minb=410883KB/s, maxb=410883KB/s, mint=1200021msec, maxt=1200021msec

Run status group 2 (all jobs):
   READ: io=65431MB, aggrb=55819KB/s, minb=55819KB/s, maxb=55819KB/s, mint=1200304msec, maxt=1200304msec

Disk stats (read/write):
  sdc: ios=453596/3723862, merge=0/646, ticks=2372768/1054456, in_queue=3425900, util=95.16%
  sdz: ios=458736/3728730, merge=0/641, ticks=2374200/1051812, in_queue=3424384, util=95.12%
  sdy: ios=450430/3726065, merge=0/648, ticks=2374188/1053976, in_queue=3426748, util=95.21%
  sdx: ios=452418/3728557, merge=0/645, ticks=2373332/1055516, in_queue=3427352, util=95.18%
  sdw: ios=453801/3728103, merge=0/643, ticks=2373460/1054064, in_queue=3425876, util=95.15%
  sdv: ios=460087/3729783, merge=0/642, ticks=2374996/1056028, in_queue=3429548, util=95.26%
  sdu: ios=448971/3721425, merge=0/643, ticks=2373708/1052504, in_queue=3424760, util=95.15%
  sdt: ios=455204/3721410, merge=0/645, ticks=2373304/1054228, in_queue=3426060, util=95.18%
  sds: ios=452556/3729685, merge=0/641, ticks=2373832/1055596, in_queue=3427916, util=95.24%
  sdr: ios=450861/3725071, merge=0/642, ticks=2372416/1054104, in_queue=3425088, util=95.17%
  sdq: ios=454720/3726790, merge=0/644, ticks=2373764/1055224, in_queue=3427600, util=95.24%
  sdp: ios=449628/3722767, merge=0/647, ticks=2373920/1052464, in_queue=3424788, util=95.18%
  sdo: ios=449816/3722164, merge=0/642, ticks=2373940/1054988, in_queue=3427456, util=95.25%
  sdn: ios=451874/3726767, merge=0/645, ticks=2372936/1055400, in_queue=3426792, util=95.23%
  sdm: ios=456908/3726871, merge=0/642, ticks=2374276/1054764, in_queue=3427588, util=95.27%
  sdl: ios=460253/3724302, merge=0/644, ticks=2373144/1052376, in_queue=3424156, util=95.22%
  sdk: ios=450482/3723451, merge=0/649, ticks=2373276/1053468, in_queue=3425408, util=95.25%
  sdj: ios=456316/3721363, merge=0/643, ticks=2373944/1053420, in_queue=3425788, util=95.27%
  sdi: ios=448797/3723548, merge=0/647, ticks=2374812/1051884, in_queue=3425148, util=95.24%
  sdh: ios=451348/3721887, merge=0/645, ticks=2373420/1054340, in_queue=3426188, util=95.26%
  sdg: ios=449863/3719943, merge=0/1160, ticks=2372480/1052492, in_queue=3423452, util=95.21%
  sdf: ios=449910/3720302, merge=0/1183, ticks=2374452/1052820, in_queue=3425764, util=95.27%
  sde: ios=455392/3718203, merge=0/1190, ticks=2374476/1051612, in_queue=3424540, util=95.22%
  sdd: ios=475679/3722121, merge=0/1183, ticks=2373172/1052504, in_queue=3424064, util=95.21%

I then reran the two sequential jobs with a fixed blocksize=64k for both reads and writes.
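
The job-file change amounts to swapping the blocksize_range lines for a fixed block size; a minimal sketch, assuming the [global] and [mixed] sections stay as above:

[write_seq]
wait_for_previous
fill_fs=1
readwrite=write
blocksize=64k

[read_seq]
wait_for_previous
readwrite=read
blocksize=64k
nrfiles=1024
size=1g

Results: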

write_seq: (groupid=1, jobs=24): err= 0: pid=227769: Tue Sep  1 08:40:17 2020
  write: io=1359.8GB, bw=1160.3MB/s, iops=18564, runt=1200044msec
  cpu          : usr=0.79%, sys=5.88%, ctx=22587725, majf=0, minf=4058
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=22277840/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1
read_seq: (groupid=2, jobs=24): err= 0: pid=231326: Tue Sep  1 08:40:17 2020
  read : io=407741MB, bw=347878KB/s, iops=5435, runt=1200209msec
  cpu          : usr=0.22%, sys=0.97%, ctx=6535021, majf=0, minf=4646
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=6523856/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Run status group 1 (all jobs):
  WRITE: io=1359.8GB, aggrb=1160.3MB/s, minb=1160.3MB/s, maxb=1160.3MB/s, mint=1200044msec, maxt=1200044msec

Run status group 2 (all jobs):
   READ: io=407741MB, aggrb=347878KB/s, minb=347878KB/s, maxb=347878KB/s, mint=1200209msec, maxt=1200209msec

IOW, sequential writes at 64k reach ~18k IOPS and ~1.1 GB/s (roughly 48 MB/s per disk across the 24 spindles); reads, ~5.4k IOPS and ~350 MB/s.
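
As an aside, job files like the above run straight from the fio CLI, either whole or one section at a time; the file name here is hypothetical:

fio --output=ms-be2057-bench.log ms-be2057.fio
fio --section=read_seq ms-be2057.fio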

fgiunchedi moved this task from Backlog to Doing on the User-fgiunchedi board. · Tue, Sep 1, 12:47 PM
fgiunchedi updated the task description. · Wed, Sep 2, 10:02 AM

Change 623769 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] swift: extend ferm rules to cover more ports

https://gerrit.wikimedia.org/r/623769

Change 623779 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] Add ms-be2057 to swift firewall

https://gerrit.wikimedia.org/r/623779

Change 623769 merged by Filippo Giunchedi:
[operations/puppet@production] swift: extend ferm rules to cover more ports

https://gerrit.wikimedia.org/r/623769

Change 623779 merged by Filippo Giunchedi:
[operations/puppet@production] Add ms-be2057 to swift firewall

https://gerrit.wikimedia.org/r/623779

Change 623966 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] statsd_exporter: stop tracking local statsd connections

https://gerrit.wikimedia.org/r/623966

fgiunchedi updated the task description. · Thu, Sep 3, 12:39 PM

Script wmf-auto-reimage was launched by filippo on cumin1001.eqiad.wmnet for hosts:

ms-be2057.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009031240_filippo_20027_ms-be2057_codfw_wmnet.log.

Completed auto-reimage of hosts:

['ms-be2057.codfw.wmnet']

Of which those FAILED:

['ms-be2057.codfw.wmnet']

Change 625604 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/software/swift-ring@master] codfw-prod: add ms-be2057 at object weight 100

https://gerrit.wikimedia.org/r/625604

Change 623966 merged by Filippo Giunchedi:
[operations/puppet@production] statsd_exporter: stop tracking local statsd connections

https://gerrit.wikimedia.org/r/623966

Change 627237 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] ferm: fix keyword in NO_TRACK_R_CLIENT

https://gerrit.wikimedia.org/r/627237

Change 627237 merged by Filippo Giunchedi:
[operations/puppet@production] ferm: fix keyword in NO_TRACK_R_CLIENT

https://gerrit.wikimedia.org/r/627237

Change 627246 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] statsd_exporter: stop tracking local statsd connections

https://gerrit.wikimedia.org/r/627246

Change 627246 merged by Filippo Giunchedi:
[operations/puppet@production] statsd_exporter: stop tracking local statsd connections

https://gerrit.wikimedia.org/r/627246

Change 625604 merged by Filippo Giunchedi:
[operations/software/swift-ring@master] codfw-prod: add ms-be2057 at object weight 100

https://gerrit.wikimedia.org/r/625604

Mentioned in SAL (#wikimedia-operations) [2020-09-15T07:24:23Z] <godog> swift codfw add ms-be2057 at object weight 100 - T261633

Mentioned in SAL (#wikimedia-operations) [2020-09-16T06:28:41Z] <godog> codfw-prod: bump weight for ms-be2057 - T261633

Change 628095 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] hieradata: bump swift object replicator concurrency

https://gerrit.wikimedia.org/r/628095

Change 628095 merged by Filippo Giunchedi:
[operations/puppet@production] hieradata: bump swift object replicator concurrency

https://gerrit.wikimedia.org/r/628095
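
For context, the object replicator concurrency maps to the concurrency setting in the [object-replicator] section of Swift's object-server.conf; the value below is only a placeholder, the real one lives in our hieradata:

[object-replicator]
concurrency = 4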

Mentioned in SAL (#wikimedia-operations) [2020-09-21T08:21:19Z] <godog> swift codfw-prod: bump weight for ms-be2057 - T261633

Change 629082 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] hieradata: bump Swift object replicator concurrency

https://gerrit.wikimedia.org/r/629082

Status update: the rebalancing is going well and, as far as I can tell, the host is behaving as expected. With the current capacity we'll essentially be replacing four 12x4TB hosts with a single 24x8TB host (4 × 12 × 4 TB = 192 TB, matching 24 × 8 TB = 192 TB). The only concern I have so far is that we should adjust the RAM as well, perhaps doubling it, to account for the bigger dataset to be cached per host.
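
As a side note, rebalance progress can be sanity-checked with the stock Swift tooling, independent of our dashboards; a minimal sketch:

# per-ring summary, including current balance
swift-ring-builder object.builder

# replication stats as reported by the storage backends
swift-recon --replication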