
Degraded RAID on logstash2027
Closed, Resolved, Public

Description

TASK AUTO-GENERATED by Nagios/Icinga RAID event handler

A degraded RAID (md) was detected on host logstash2027. An automatic snapshot of the current RAID status is attached below.

Please sync with the service owner to find the appropriate time window before actually replacing any failed hardware.

CRITICAL: State: degraded, Active: 22, Working: 22, Failed: 2, Spare: 0

$ sudo /usr/local/lib/nagios/plugins/get-raid-status-md
Personalities : [raid0] [raid1] [linear] [multipath] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid1 sdb2[1] sdd2[3] sdh2[7] sdf2[5] sde2[4] sdg2[6] sdc2[2] sda2[0](F)
      78058496 blocks super 1.2 [8/7] [_UUUUUUU]
      
md2 : active raid0 sde4[4] sdb4[1] sdd4[3] sda4[0] sdc4[2] sdf4[5] sdh4[7] sdg4[6]
      6865133568 blocks super 1.2 512k chunks
      
md1 : active raid1 sde3[4] sdf3[5] sdb3[1] sdc3[2] sdh3[7] sda3[0](F) sdd3[3] sdg3[6]
      999424 blocks super 1.2 [8/7] [_UUUUUUU]
      
unused devices: <none>
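
For readers unfamiliar with the mdstat format, the snapshot above can be read mechanically: each `mdN` line lists member partitions as `device[slot]`, and a trailing `(F)` marks a failed member. The following is a minimal sketch of that reading, not the actual `get-raid-status-md` plugin or the Icinga check, whose logic is not shown in this task. The names `summarize`, `MEMBER` and `MDSTAT` are hypothetical, and only the `(F)` and `(S)` flags are handled.

```python
import re

# Sample text copied from the snapshot in this task.
MDSTAT = """\
Personalities : [raid0] [raid1] [linear] [multipath] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdb2[1] sdd2[3] sdh2[7] sdf2[5] sde2[4] sdg2[6] sdc2[2] sda2[0](F)
      78058496 blocks super 1.2 [8/7] [_UUUUUUU]

md2 : active raid0 sde4[4] sdb4[1] sdd4[3] sda4[0] sdc4[2] sdf4[5] sdh4[7] sdg4[6]
      6865133568 blocks super 1.2 512k chunks

md1 : active raid1 sde3[4] sdf3[5] sdb3[1] sdc3[2] sdh3[7] sda3[0](F) sdd3[3] sdg3[6]
      999424 blocks super 1.2 [8/7] [_UUUUUUU]

unused devices: <none>
"""

# One member entry, e.g. "sda2[0](F)": device name, slot number, optional flag.
MEMBER = re.compile(r"(\w+)\[(\d+)\](\([A-Z]\))?")


def summarize(mdstat):
    """Return per-array lists of working, failed and spare members.

    Assumption: array lines look like "mdN : <state> <level> <members...>".
    Other layouts (for example inactive arrays) are simply skipped.
    """
    arrays = {}
    for line in mdstat.splitlines():
        match = re.match(r"^(md\d+) : (\w+) (\w+) (.*)$", line)
        if not match:
            continue
        name, state, level, members = match.groups()
        working, failed, spare = [], [], []
        for device, _slot, flag in MEMBER.findall(members):
            if flag == "(F)":
                failed.append(device)
            elif flag == "(S)":
                spare.append(device)
            else:
                working.append(device)
        arrays[name] = {
            "level": level,
            "working": working,
            "failed": failed,
            "spare": spare,
            "degraded": bool(failed),
        }
    return arrays


if __name__ == "__main__":
    for name, info in sorted(summarize(MDSTAT).items()):
        print(name, info["level"], "failed:", info["failed"] or "none")
```

Summed over the three arrays, this reading gives 22 working and 2 failed members, which lines up with the `Working: 22, Failed: 2` figures in the alert line. Both failed members (`sda2`, `sda3`) sit on `sda`; `sda4` in the raid0 array `md2` carries no `(F)` flag in this snapshot.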

Event Timeline

colewhite subscribed.

The cluster will remain in a degraded state until replacements are installed. Please replace the failed disks as soon as possible. Thanks!

Mentioned in SAL (#wikimedia-operations) [2022-09-06T18:25:47Z] <cwhite> reduce codfw replicas 2 to 1 for logstash-(webrequest|k8s) partitions. Make space for failed logstash2027 - T316996

Create Dispatch: Success
You have successfully submitted request SR151072219.

Mentioned in SAL (#wikimedia-operations) [2022-09-12T17:08:53Z] <cwhite> rebuilt raid on logstash2027 T316996