Degraded RAID on sulfur
Closed, Resolved (Public)

Description

TASK AUTO-GENERATED by Nagios/Icinga RAID event handler

A degraded RAID (md) was detected on host sulfur. An automatic snapshot of the current RAID status is attached below.

Please sync with the service owner to find the appropriate time window before actually replacing any failed hardware.

connect to address 208.80.154.87 port 5666: No route to host

$ sudo /usr/local/lib/nagios/plugins/get-raid-status-md
Failed to execute '['/usr/lib/nagios/plugins/check_nrpe', '-4', '-H', 'sulfur', '-c', 'get_raid_status_md']': RETCODE: 2
STDOUT:

STDERR:
None

Event Timeline

This doesn't really tell me anything about the bad disk, and I am not able to ssh into the host for more details. I will create a ticket and hope that there is something Dell can use in their TSR.

@Volans - hey Riccardo, not sure if you're the right person for this, but thought I'd try asking you. Is there a different output we can get for this alert, to help us isolate the disk issue a bit more?

Thanks,
Willy

@wiki_willy The data gathering failed because the host was unreachable, but is this still a commissioned host? I cannot see its records in the DNS repo; only the management ones are there. Also see its decom task: T224475.
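
For illustration only, here is a minimal Python sketch of how a failed status fetch like the one in the description could be told apart from a real RAID check failure. It is not the actual event handler code: the function names `classify_fetch_failure` and `resolves`, the category strings, and the list of error phrases are all hypothetical.

```python
import socket

# Hypothetical list of phrases that suggest a network problem rather than
# a failing check on the host itself.
UNREACHABLE_HINTS = ("no route to host", "connection refused", "timed out")


def classify_fetch_failure(retcode: int, stdout: str, stderr: str) -> str:
    """Return a coarse reason for the outcome of a remote status fetch."""
    if retcode == 0:
        return "ok"
    text = f"{stdout}\n{stderr}".lower()
    if any(hint in text for hint in UNREACHABLE_HINTS):
        return "host-unreachable"
    return "check-failed"


def resolves(hostname: str) -> bool:
    """Return True if the hostname currently resolves to an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False
```

With the message from this task, `classify_fetch_failure(2, "connect to address 208.80.154.87 port 5666: No route to host", "")` falls into the `host-unreachable` category, which points at the host's network or commissioning state rather than at a disk.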

@Volans - ah, that makes sense. Thanks, let's just resolve this task then.