- - Provide FQDN of system.
- - If this is anything other than a hard drive issue, please depool the machine (and confirm that it’s been depooled) so we can work on it. If it cannot be depooled yet, please provide a time frame in which we can take the machine down.
- - Put system into a failed state in Netbox.
- - Provide the urgency of the request, along with justification (redundancy, dependencies, etc.).
- - Describe issue and/or attach hardware failure log. (Refer to https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook if you need help)
- - Assign correct project tag and appropriate owner (based on above). Also, please ensure the service owners of the host(s) are added as subscribers to provide any additional input.
FQDN: wikikube-worker1256.eqiad.wmnet
Urgency: Medium, one of many wikikube nodes
TASK AUTO-GENERATED by Nagios/Icinga RAID event handler
A degraded RAID (md) was detected on host wikikube-worker1256. An automatic snapshot of the current RAID status is attached below.
Please sync with the service owner to find the appropriate time window before actually replacing any failed hardware.
CRITICAL: State: degraded, Active: 1, Working: 1, Failed: 0, Spare: 0
$ sudo /usr/local/lib/nagios/plugins/get-raid-status-md
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdb2[1]
      937267200 blocks super 1.2 [2/1] [_U]
      bitmap: 4/7 pages [16KB], 65536KB chunk

unused devices: <none>
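For anyone triaging this, the key part of the snapshot above is `[2/1] [_U]`: md0 expects 2 members, only 1 is active, and the first slot is empty. Only sdb2 is listed as a member; the output does not name the missing device. The following is a minimal sketch of how that status could be read programmatically. It is not part of the Icinga handler, the function name `parse_mdstat` and the regular expression are hypothetical, and it assumes mdstat-style text for active arrays only.

```python
import re

# Snapshot copied from this task (mdstat-style text).
SAMPLE = """Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdb2[1]
      937267200 blocks super 1.2 [2/1] [_U]
      bitmap: 4/7 pages [16KB], 65536KB chunk

unused devices: <none>
"""

# Hypothetical pattern: array name, state, level, member list, then the
# "[expected/active] [U_...]" summary. Whitespace may be spaces or newlines,
# so it also works on text that was flattened onto one line.
ARRAY_RE = re.compile(
    r"(md\d+) : (\w+) (\w+) ((?:\S+\[\d+\](?:\(\w\))?\s+)+)"
    r"\d+ blocks.*?\[(\d+)/(\d+)\] \[([U_]+)\]",
    re.S,
)


def parse_mdstat(text):
    """Return one dict per active md array found in mdstat-style text."""
    arrays = []
    for m in ARRAY_RE.finditer(text):
        name, state, level, members, expected, active, flags = m.groups()
        arrays.append({
            "name": name,
            "state": state,
            "level": level,
            "members": members.split(),
            "expected": int(expected),
            "active": int(active),
            # Positions marked "_" are slots with no working member.
            "missing_slots": [i for i, c in enumerate(flags) if c == "_"],
            "degraded": int(active) < int(expected),
        })
    return arrays


if __name__ == "__main__":
    for array in parse_mdstat(SAMPLE):
        print(array)
```

On the host itself, `sudo mdadm --detail /dev/md0` gives a more complete view of the array, including which slot was removed.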