Degraded RAID on db2060
Closed, Declined · Public

Description

TASK AUTO-GENERATED by Nagios/Icinga RAID event handler

A degraded RAID (hpssacli) was detected on host db2060. An automatic snapshot of the current RAID status is attached below.

Please sync with the service owner to find the appropriate time window before actually replacing any failed hardware.

CRITICAL: Slot 0: OK: 1I:1:1, 1I:1:10, 1I:1:11, 1I:1:12, 1I:1:2, 1I:1:3, 1I:1:4, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:9 - Failed: 1I:1:5 - Controller: OK - Battery/Capacitor: OK

$ sudo /usr/local/lib/nagios/plugins/get-raid-status-hpssacli

Smart Array P420i in Slot 0 (Embedded)

   array A

      Logical Drive: 1
         Size: 3.3 TB
         Fault Tolerance: 1+0
         Strip Size: 256 KB
         Full Stripe Size: 1536 KB
         Status: Interim Recovery Mode
         Caching:  Enabled
         Disk Name: /dev/sda 
         Mount Points: / 37.3 GB Partition Number 2
         OS Status: LOCKED
         Mirror Group 1:
            physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 600 GB, OK)
            physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 600 GB, OK)
            physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SAS, 600 GB, OK)
            physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SAS, 600 GB, OK)
            physicaldrive 1I:1:5 (port 1I:box 1:bay 5, SAS, 600 GB, Failed)
            physicaldrive 1I:1:6 (port 1I:box 1:bay 6, SAS, 600 GB, OK)
         Mirror Group 2:
            physicaldrive 1I:1:7 (port 1I:box 1:bay 7, SAS, 600 GB, OK)
            physicaldrive 1I:1:8 (port 1I:box 1:bay 8, SAS, 600 GB, OK)
            physicaldrive 1I:1:9 (port 1I:box 1:bay 9, SAS, 600 GB, OK)
            physicaldrive 1I:1:10 (port 1I:box 1:bay 10, SAS, 600 GB, OK)
            physicaldrive 1I:1:11 (port 1I:box 1:bay 11, SAS, 600 GB, OK)
            physicaldrive 1I:1:12 (port 1I:box 1:bay 12, SAS, 600 GB, OK)
         Drive Type: Data
         LD Acceleration Method: Controller Cache
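
For anyone triaging a report like the one above by script rather than by eye, the following is a minimal sketch of how the `physicaldrive` lines could be parsed to list the drives that are not OK. The regular expression is inferred only from the lines in this snapshot, and the names `DRIVE_RE`, `parse_drives`, `failed_drives`, and `SAMPLE` are hypothetical; this is not part of the Icinga plugin or of hpssacli, and other controller or firmware versions may format the lines differently.

```python
import re

# Pattern inferred from the snapshot above (an assumption, not an official format), e.g.:
#   physicaldrive 1I:1:5 (port 1I:box 1:bay 5, SAS, 600 GB, Failed)
DRIVE_RE = re.compile(
    r"physicaldrive\s+(?P<id>\S+)\s+"
    r"\(port\s+(?P<port>[^:]+):box\s+(?P<box>\d+):bay\s+(?P<bay>\d+),\s*"
    r"(?P<interface>[^,]+),\s*(?P<size>[^,]+),\s*(?P<status>[^)]+)\)"
)


def parse_drives(report):
    """Return one dict per physicaldrive line found in the report text."""
    return [match.groupdict() for match in DRIVE_RE.finditer(report)]


def failed_drives(report):
    """Return the IDs of all drives whose status is anything other than OK."""
    return [d["id"] for d in parse_drives(report) if d["status"].strip() != "OK"]


# Three lines copied from the snapshot above, used only as sample input.
SAMPLE = """
            physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SAS, 600 GB, OK)
            physicaldrive 1I:1:5 (port 1I:box 1:bay 5, SAS, 600 GB, Failed)
            physicaldrive 1I:1:6 (port 1I:box 1:bay 6, SAS, 600 GB, OK)
"""

if __name__ == "__main__":
    for drive_id in failed_drives(SAMPLE):
        print("Drive needing attention:", drive_id)
```

Treating every status other than `OK` as a problem is a deliberate choice in this sketch, so that states not seen in this snapshot (for example a rebuilding drive) would still be surfaced for review.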

Event Timeline

Restricted Application added subscribers: Marostegui, Aklapper. · Sep 10 2019, 4:56 PM
wiki_willy reassigned this task from Cmjohnson to Papaul. · Sep 10 2019, 6:21 PM
wiki_willy added a subscriber: Cmjohnson.
wiki_willy added a subscriber: wiki_willy.

Looks like the warranty expired on Jan. 14, 2018. @Papaul - let me know if you have any spares lying around or if we need to purchase a new disk. Thanks, Willy

Marostegui closed this task as Declined. · Sep 10 2019, 6:55 PM

There is no need to replace this disk. This host is pending DC-Ops steps for decommissioning (T231625).

Papaul reopened this task as Open. · Sep 10 2019, 6:58 PM

Resolving this. Host will be decommissioned in T231625.

Marostegui closed this task as Declined. · Sep 10 2019, 7:11 PM

Resolving this. Host will be decommissioned in T231625.

You just reopened! :P