
Degraded RAID on db1154
Closed, Resolved (Public)

Description

TASK AUTO-GENERATED by Nagios/Icinga RAID event handler

A degraded RAID (megacli) was detected on host db1154. An automatic snapshot of the current RAID status is attached below.

Please sync with the service owner to find the appropriate time window before actually replacing any failed hardware.

CRITICAL: 1 failed LD(s) (Degraded)

$ sudo /usr/local/lib/nagios/plugins/get-raid-status-megacli
Failed to execute '['/usr/lib/nagios/plugins/check_nrpe', '-4', '-H', 'db1154', '-c', 'get_raid_status_megacli']': 'utf-8' codec can't decode byte 0x9c in position 1: invalid start byte
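A note on the failure above: byte 0x9c at position 1 is consistent with the two-byte header of a zlib stream at the default compression level (0x78 0x9c), so one plausible reading is that the check output arrived compressed and was decoded as UTF-8 without being inflated first. This is an inference from the error text, not something the task confirms. The sketch below shows one tolerant way to handle such a payload; `decode_plugin_output` is a hypothetical helper and is not part of the actual handler.

```python
import zlib


def decode_plugin_output(raw: bytes) -> str:
    """Return text from raw check output, inflating it first if it is zlib data.

    Hypothetical helper: the real RAID handler may work differently.
    """
    try:
        # zlib.decompress raises zlib.error when the input is not a zlib stream.
        raw = zlib.decompress(raw)
    except zlib.error:
        pass  # Not compressed: treat the bytes as plain text.
    # errors="replace" substitutes U+FFFD for undecodable bytes instead of raising.
    return raw.decode("utf-8", errors="replace")
```

With this approach a compressed payload is inflated before decoding, and any stray bytes show up as replacement characters rather than aborting the status snapshot.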

Event Timeline

Marostegui triaged this task as Medium priority. Mar 21 2023, 6:02 AM
Marostegui added a project: DBA.
Marostegui added a subscriber: wiki_willy.

@wiki_willy I don't think this host is still under warranty, but it is an important host for us. Is there any chance we can get a new (or spare) disk for it? I think it is due to be refreshed next FY.

Cmjohnson subscribed.

A new SSD has been requested from Dell.

You have successfully submitted request SR164648098.

@Marostegui the disk has been replaced, but I did not add it back to the RAID configuration. Please do so at your convenience.

RAID being rebuilt:

root@db1154:~# megacli -PDRbld -ShowProg -physdrv[32:9] -aALL

Rebuild Progress on Device at Enclosure 32, Slot 9 Completed 4% in 7 Minutes.

Exit Code: 0x00
root@db1154:~#
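
For a rough idea of how long the rebuild will take, the progress line can be parsed and extrapolated. The sketch below assumes the rebuild rate stays constant, which is an assumption: the actual rate depends on controller settings and host load. `estimate_remaining_minutes` and `PROGRESS_RE` are hypothetical names, and the pattern is written against the single progress line shown above.

```python
import re
from typing import Optional

# Matches the megacli progress line shown above, e.g.
# "... Enclosure 32, Slot 9 Completed 4% in 7 Minutes."
PROGRESS_RE = re.compile(
    r"Enclosure (\d+), Slot (\d+) Completed (\d+)% in (\d+) Minutes"
)


def estimate_remaining_minutes(line: str) -> Optional[float]:
    """Linearly extrapolate the remaining rebuild time from one progress line."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None  # Not a progress line.
    percent = int(match.group(3))
    elapsed = int(match.group(4))
    if percent == 0:
        return None  # No progress yet, so no rate to extrapolate from.
    # Remaining time = elapsed time per percent * percent still to do.
    return elapsed * (100 - percent) / percent


line = "Rebuild Progress on Device at Enclosure 32, Slot 9 Completed 4% in 7 Minutes."
print(estimate_remaining_minutes(line))  # 168.0
```

At 4% in 7 minutes, a constant rate would put the remaining rebuild time at about 168 minutes.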