es2010 failed disk (degraded RAID)
Closed, Duplicate (Public)

Description

                Device Present
                ================
Virtual Drives    : 1 
  Degraded        : 1 
  Offline         : 0 
Physical Devices  : 14 
  Disks           : 12 
  Critical Disks  : 3 
  Failed Disks    : 1
Enclosure Device ID: 32
Slot Number: 6
Drive's position: DiskGroup: 0, Span: 3, Arm: 0
Enclosure position: N/A
Device Id: 6
WWN: 5000C50054EA0FE0
Sequence Number: 3
Media Error Count: 488
Other Error Count: 6
Predictive Failure Count: 49
Last Predictive Failure Event Seq Number: 10855
PD Type: SAS

Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Sector Size:  0
Firmware state: Failed
Device Firmware Level: ES65
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c50054ea0fe1
SAS Address(1): 0x0
Connected Port Number: 0(path0) 
Inquiry Data: SEAGATE ST3600057SS     ES656SL4GHL0            
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive Temperature :40C (104.00 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : Yes

There are also other disks with errors (but not fully failed):

Enclosure Device ID: 32
Slot Number: 5
Drive's position: DiskGroup: 0, Span: 2, Arm: 1
Enclosure position: N/A
Device Id: 5
WWN: 5000C50054E9EE28
Sequence Number: 2
Media Error Count: 53
Other Error Count: 0
Predictive Failure Count: 125
Last Predictive Failure Event Seq Number: 10847
PD Type: SAS

Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: ES65
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c50054e9ee29
SAS Address(1): 0x0
Connected Port Number: 0(path0) 
Inquiry Data: SEAGATE ST3600057SS     ES656SL4FTQA            
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive Temperature :42C (107.60 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : Yes
Enclosure Device ID: 32
Slot Number: 3
Drive's position: DiskGroup: 0, Span: 1, Arm: 1
Enclosure position: N/A
Device Id: 3
WWN: 5000C50054EAA2F4
Sequence Number: 2
Media Error Count: 552
Other Error Count: 0
Predictive Failure Count: 118
Last Predictive Failure Event Seq Number: 10846
PD Type: SAS

Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: ES65
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c50054eaa2f5
SAS Address(1): 0x0
Connected Port Number: 0(path0) 
Inquiry Data: SEAGATE ST3600057SS     ES656SL4H4GM            
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive Temperature :42C (107.60 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : Yes
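
The per-drive listings above can be condensed into a short health summary. The following is a minimal sketch that assumes the text is plain `Key: Value` controller output in which each drive record starts with an `Enclosure Device ID` line, as in the paste above. The function names `parse_physical_drives` and `summarize` are hypothetical, and `SAMPLE` is only an excerpt of the values already shown in this task.

```python
# Excerpt of the fields pasted in the task description (three drives).
SAMPLE = """\
Enclosure Device ID: 32
Slot Number: 6
Media Error Count: 488
Predictive Failure Count: 49
Firmware state: Failed
Drive has flagged a S.M.A.R.T alert : Yes
Enclosure Device ID: 32
Slot Number: 5
Media Error Count: 53
Predictive Failure Count: 125
Firmware state: Online, Spun Up
Drive has flagged a S.M.A.R.T alert : Yes
Enclosure Device ID: 32
Slot Number: 3
Media Error Count: 552
Predictive Failure Count: 118
Firmware state: Online, Spun Up
Drive has flagged a S.M.A.R.T alert : Yes
"""


def parse_physical_drives(text):
    """Split the output into one dict per drive (hypothetical helper)."""
    drives = []
    current = None
    for line in text.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only, so values containing colons survive.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Enclosure Device ID":
            # Assumption: this field marks the start of a new drive record.
            current = {}
            drives.append(current)
        # Keep the first occurrence of repeated keys such as "Port status".
        if current is not None and key not in current:
            current[key] = value
    return drives


def summarize(drives):
    """Reduce each drive record to the fields relevant for triage."""
    return [
        {
            "slot": int(d["Slot Number"]),
            "state": d["Firmware state"],
            "media_errors": int(d["Media Error Count"]),
            "predictive_failures": int(d["Predictive Failure Count"]),
            "smart_alert": d.get("Drive has flagged a S.M.A.R.T alert") == "Yes",
        }
        for d in drives
    ]


rows = summarize(parse_physical_drives(SAMPLE))
failed_slots = [r["slot"] for r in rows if r["state"] == "Failed"]
alert_slots = [r["slot"] for r in rows if r["smart_alert"]]

for r in rows:
    print(r)
```

Running it over the full paste instead of `SAMPLE` should give the same summary, since the extra fields are parsed but ignored by `summarize`.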

Event Timeline

jcrespo raised the priority of this task to Needs Triage.
jcrespo updated the task description.
jcrespo added a subscriber: jcrespo.

This server is out of warranty. Is there a plan to replace these in the near term? I can send spare disks from eqiad to codfw if needed.

Replacement disks from Newegg cost approximately $244.00 each. I have 8 decommissioned ES hosts in eqiad that have those disks. I can send a dozen or so disks to codfw for the cost of shipping, which I estimate will be no more than $50 depending on speed of delivery.

If the plan is to swap out the older ES servers next fiscal year, I don't think buying new disks makes sense, considering we have plenty of used spares to keep the disks spinning until we eventually power the hosts off for good.
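
As a rough illustration of the comparison above, here is a small sketch using only the figures quoted in this comment. The count of 12 is an assumption standing in for "a dozen or so", and the variable names are illustrative.

```python
NEW_DISK_PRICE_USD = 244.00     # approximate Newegg price quoted above
SHIPPING_ESTIMATE_USD = 50.00   # upper estimate for shipping quoted above
DISK_COUNT = 12                 # assumption: "a dozen or so" taken as 12

# Cost of buying the same number of disks new versus shipping used spares.
cost_new = NEW_DISK_PRICE_USD * DISK_COUNT
difference = cost_new - SHIPPING_ESTIMATE_USD

print(f"new: ${cost_new:.2f}, shipping spares: ${SHIPPING_ESTIMATE_USD:.2f}, "
      f"difference: ${difference:.2f}")
```

The estimate ignores the remaining lifetime of the used disks, which the comment does not quantify.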

If we do have spare disks that are not needed elsewhere then sure, let's send them. But make sure eqiad keeps sufficient spares itself. :)

fgiunchedi triaged this task as Medium priority. Dec 1 2015, 12:26 PM
fgiunchedi added a subscriber: fgiunchedi.

@Cmjohnson We have Mark's approval and Papaul is back. Do you need anything I can help with to organize the shipping?

Aside from Mark's advice, I would add securely wiping the disks before shipping.

@Cmjohnson, do you need anything to get this done? We now have 4 failed disks at the other datacenter (if this cannot happen, we will just buy them).

Papaul, I am going to take this until I send you the disks later this week.

Re-assigning back to Papaul. I did not have the disks here.