Degraded RAID on ms-be1020
Closed, Duplicate (Public)

Description

TASK AUTO-GENERATED by Nagios/Icinga RAID event handler

A degraded RAID (hpssacli) was detected on host ms-be1020. An automatic snapshot of the current RAID status is attached below.

Please sync with the service owner to find the appropriate time window before actually replacing any failed hardware.

CRITICAL: Slot 3: OK: 2I:4:1, 2I:4:2, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 2I:2:1, 2I:2:2, 2I:2:3, 2I:2:4 - Controller: OK - Cache: Permanently Disabled - Cable Error - Battery/Capacitor: Recharging

Smart Array P840 in Slot 3

   array A

      Logical Drive: 1
         Size: 279.4 GB
         Fault Tolerance: 0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: OK
         Caching:  Disabled
         Unique Identifier: 600508B1001C379646C9E3B9AAA80583
         Disk Name: /dev/sda 
         Mount Points: /srv/swift-storage/sda3 93.1 GB Partition Number 4
         OS Status: LOCKED
         Logical Drive Label: 027D0040PDNNF0ARH9H0GC2A98
         Drive Type: Data
         LD Acceleration Method: HP SSD Smart Path

   array B

      Logical Drive: 2
         Size: 279.4 GB
         Fault Tolerance: 0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: OK
         Caching:  Disabled
         Unique Identifier: 600508B1001C6D4F4B837E0FA5112B8C
         Disk Name: /dev/sdb 
         Mount Points: /srv/swift-storage/sdb3 93.1 GB Partition Number 4
         OS Status: LOCKED
         Logical Drive Label: 067D0043PDNNF0ARH9H0GC26E4
         Drive Type: Data
         LD Acceleration Method: HP SSD Smart Path

   array C

      Logical Drive: 3
         Size: 3.6 TB
         Fault Tolerance: 0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: OK
         Caching:  Enabled
         Unique Identifier: 600508B1001CDAC613DE49BA1CD25B98
         Disk Name: /dev/sdc 
         Mount Points: /srv/swift-storage/sdc1 3.6 TB Partition Number 2
         OS Status: LOCKED
         Logical Drive Label: 0A7D0055PDNNF0ARH9H0GC1054
         Drive Type: Data
         LD Acceleration Method: Controller Cache

   array D

      Logical Drive: 4
         Size: 3.6 TB
         Fault Tolerance: 0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: OK
         Caching:  Enabled
         Unique Identifier: 600508B1001CDB0A656800A9BA97FE7D
         Disk Name: /dev/sdd 
         Mount Points: /srv/swift-storage/sdd1 3.6 TB Partition Number 2
         OS Status: LOCKED
         Logical Drive Label: 0E7D0057PDNNF0ARH9H0GC69BD
         Drive Type: Data
         LD Acceleration Method: Controller Cache

   array E

      Logical Drive: 5
         Size: 3.6 TB
         Fault Tolerance: 0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: OK
         Caching:  Enabled
         Unique Identifier: 600508B1001C175BA79906DB412A077C
         Disk Name: /dev/sde 
         Mount Points: /srv/swift-storage/sde1 3.6 TB Partition Number 2
         OS Status: LOCKED
         Logical Drive Label: 127D0059PDNNF0ARH9H0GC01D5
         Drive Type: Data
         LD Acceleration Method: Controller Cache

   array F

      Logical Drive: 6
         Size: 3.6 TB
         Fault Tolerance: 0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: OK
         Caching:  Enabled
         Unique Identifier: 600508B1001C92FFC2856342E26B22F4
         Disk Name: /dev/sdf 
         Mount Points: /srv/swift-storage/sdf1 3.6 TB Partition Number 2
         OS Status: LOCKED
         Logical Drive Label: 167D005CPDNNF0ARH9H0GCAFF9
         Drive Type: Data
         LD Acceleration Method: Controller Cache

   array G

      Logical Drive: 7
         Size: 3.6 TB
         Fault Tolerance: 0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: OK
         Caching:  Enabled
         Unique Identifier: 600508B1001CFCF0A2ACA78C2E098793
         Disk Name: /dev/sdg 
         Mount Points: /srv/swift-storage/sdg1 3.6 TB Partition Number 2
         OS Status: LOCKED
         Logical Drive Label: 1A7D005EPDNNF0ARH9H0GC5D2C
         Drive Type: Data
         LD Acceleration Method: Controller Cache

   array H

      Logical Drive: 8
         Size: 3.6 TB
         Fault Tolerance: 0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: OK
         Caching:  Enabled
         Unique Identifier: 600508B1001CA9FC5F0ED99D98C6DADA
         Disk Name: /dev/sdh 
         Mount Points: /srv/swift-storage/sdh1 3.6 TB Partition Number 2
         OS Status: LOCKED
         Logical Drive Label: 1E7D0061PDNNF0ARH9H0GC300D
         Drive Type: Data
         LD Acceleration Method: Controller Cache

   array I

      Logical Drive: 9
         Size: 3.6 TB
         Fault Tolerance: 0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: OK
         Caching:  Enabled
         Unique Identifier: 600508B1001C4B96A227F0B4910BAC84
         Disk Name: /dev/sdi 
         Mount Points: /srv/swift-storage/sdi1 3.6 TB Partition Number 2
         OS Status: LOCKED
         Logical Drive Label: 227D0063PDNNF0ARH9H0GCFF59
         Drive Type: Data
         LD Acceleration Method: Controller Cache

   array J

      Logical Drive: 10
         Size: 3.6 TB
         Fault Tolerance: 0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: OK
         Caching:  Enabled
         Unique Identifier: 600508B1001C53403494AFD30521D5BE
         Disk Name: /dev/sdj 
         Mount Points: /srv/swift-storage/sdj1 3.6 TB Partition Number 2
         OS Status: LOCKED
         Logical Drive Label: 267D0066PDNNF0ARH9H0GCCB86
         Drive Type: Data
         LD Acceleration Method: Controller Cache

   array K

      Logical Drive: 11
         Size: 3.6 TB
         Fault Tolerance: 0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: OK
         Caching:  Enabled
         Unique Identifier: 600508B1001C214D1E01970742F34219
         Disk Name: /dev/sdk 
         Mount Points: /srv/swift-storage/sdk1 3.6 TB Partition Number 2
         OS Status: LOCKED
         Logical Drive Label: 2A7D0068PDNNF0ARH9H0GCBEB0
         Drive Type: Data
         LD Acceleration Method: Controller Cache

   array L

      Logical Drive: 12
         Size: 3.6 TB
         Fault Tolerance: 0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: OK
         Caching:  Enabled
         Unique Identifier: 600508B1001C4C3C8559D46CC17BE01E
         Disk Name: /dev/sdl 
         Mount Points: /srv/swift-storage/sdl1 3.6 TB Partition Number 2
         OS Status: LOCKED
         Logical Drive Label: 2E7D006BPDNNF0ARH9H0GC79D4
         Drive Type: Data
         LD Acceleration Method: Controller Cache

   array M

      Logical Drive: 13
         Size: 3.6 TB
         Fault Tolerance: 0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: OK
         Caching:  Enabled
         Unique Identifier: 600508B1001C06937B2D2B314B21CE5D
         Disk Name: /dev/sdm 
         Mount Points: /srv/swift-storage/sdm1 3.6 TB Partition Number 2
         OS Status: LOCKED
         Logical Drive Label: 327D006EPDNNF0ARH9H0GC8893
         Drive Type: Data
         LD Acceleration Method: Controller Cache

   array N

      Logical Drive: 14
         Size: 3.6 TB
         Fault Tolerance: 0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: OK
         Caching:  Enabled
         Unique Identifier: 600508B1001C73E447150F2D9547F72F
         Disk Name: /dev/sdn 
         Mount Points: /srv/swift-storage/sdn1 3.6 TB Partition Number 2
         OS Status: LOCKED
         Logical Drive Label: 367D0071PDNNF0ARH9H0GC9227
         Drive Type: Data
         LD Acceleration Method: Controller Cache

Event Timeline

@Cmjohnson @Papaul FYI: since the RAID alarm in Icinga can now also be triggered by a faulty BBU or a wrong WritePolicy, I've added the Icinga error on top of the get raid output.
If the error reports a problem with the BBU or the WritePolicy, the disk status output will most likely report everything as OK and not be very helpful.
This is a temporary solution until we have time to refactor and improve the RAID checks as a whole.

In this case the failing part is:

Cache: Permanently Disabled - Cable Error - Battery/Capacitor: Recharging
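
To illustrate how the failing part can be picked out of the Icinga status line, here is a minimal Python sketch. It is not part of the actual RAID check or event handler; the function name `failing_components` and the `COMPONENTS` tuple are hypothetical, and the component labels are assumed from the single status line in this task, so other controllers or check versions may print different ones.

```python
import re

# Labels assumed from the status line in this task (hypothetical list,
# not taken from the real check's source).
COMPONENTS = ("Controller", "Cache", "Battery/Capacitor")


def failing_components(status_line):
    """Return {label: status} for every known component that is not 'OK'."""
    pattern = r"(%s): " % "|".join(re.escape(c) for c in COMPONENTS)
    # With a capturing group, re.split yields:
    # [prefix, label1, value1, label2, value2, ...]
    parts = re.split(pattern, status_line)
    failing = {}
    for label, value in zip(parts[1::2], parts[2::2]):
        # Drop the " - " separator left at the end of each value.
        value = value.strip().rstrip("-").strip()
        if value != "OK":
            failing[label] = value
    return failing


status = (
    "CRITICAL: Slot 3: OK: 2I:4:1, 2I:4:2 - Controller: OK - "
    "Cache: Permanently Disabled - Cable Error - "
    "Battery/Capacitor: Recharging"
)
print(failing_components(status))
```

Splitting on the known labels rather than on " - " keeps "Permanently Disabled - Cable Error" together as one cache status, since that value itself contains the separator.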