
Disk failure on labsdb1005
Closed, Resolved (Public)


There's a disk marked critical in Icinga, and I cannot find an existing ticket for it. Presuming there isn't one, here it is.
This is a Dell system. The output from megacli is:

Enclosure Device ID: 32
Slot Number: 8
Drive's position: DiskGroup: 0, Span: 0, Arm: 7
Enclosure position: N/A
Device Id: 8
WWN: 5000C5004125A53C
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 2
Last Predictive Failure Event Seq Number: 2513893
PD Type: SAS

Raw Size: 1.819 TB [0xe8e088b0 Sectors]
Non Coerced Size: 1.818 TB [0xe8d088b0 Sectors]
Coerced Size: 1.818 TB [0xe8d00000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: PS04
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c5004125a53d
SAS Address(1): 0x0
Connected Port Number: 0(path0)
Inquiry Data: SEAGATE ST2000NM0001    PS04Z1P1J3LQ
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature :35C (95.00 F)
PI Eligibility:  No
Drive is formatted for PI information:  No
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : Yes
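
For anyone scanning output like the above across many drives, here is a minimal sketch of how the two fields that matter here (Predictive Failure Count and the S.M.A.R.T alert flag) could be picked out. It assumes the text has already been captured from megacli and that each drive's record starts with an "Enclosure Device ID" line, as it does above. The function names `parse_pd_list` and `is_suspect` are hypothetical, not part of any existing tooling.

```python
# Excerpt of the drive record from this ticket, used as sample input.
SAMPLE = """Enclosure Device ID: 32
Slot Number: 8
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 2
Firmware state: Online, Spun Up
Drive has flagged a S.M.A.R.T alert : Yes
"""


def parse_pd_list(text):
    """Split physical-drive output into one dict of fields per drive.

    Assumption: every drive record begins with an 'Enclosure Device ID' line.
    Repeated keys within a record (e.g. 'Port status') keep the last value.
    """
    drives, current = [], {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank lines and anything that is not 'key: value'
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Enclosure Device ID" and current:
            drives.append(current)
            current = {}
        current[key] = value
    if current:
        drives.append(current)
    return drives


def is_suspect(drive):
    """True if the drive reports predictive failures or a S.M.A.R.T alert."""
    predictive = int(drive.get("Predictive Failure Count", "0"))
    smart = drive.get("Drive has flagged a S.M.A.R.T alert", "No")
    return predictive > 0 or smart == "Yes"


if __name__ == "__main__":
    for drive in parse_pd_list(SAMPLE):
        if is_suspect(drive):
            print("Suspect drive in slot", drive.get("Slot Number", "unknown"))
```

Note that the drive above still reports "Online, Spun Up" with zero media errors, so a check on firmware state alone would not have caught it.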

Not pasting the events log unless you really want it:
Adapter: 0 - Number of Events : 59388

According to Netbox, this system is no longer under warranty.

Event Timeline

Manually dropped the disk from the RAID because the server is behaving very badly: the database keeps locking up.

colewhite triaged this task as Medium priority. Feb 15 2019, 7:34 PM

@Bstorm I swapped the disk with a used spare, but this server really needs to be decommissioned. The warranty expired in 2015.

Bstorm claimed this task.

T216749: working on it as soon as we no longer need the server for restoring some tables! Soon.

Thanks, looks good.