dbstore1002 disk errors
Closed, Resolved, Public

Description

                Device Present
                ================
Virtual Drives    : 1 
  Degraded        : 0 
  Offline         : 0 
Physical Devices  : 14 
  Disks           : 12 
  Critical Disks  : 1 
  Failed Disks    : 0 


Enclosure Device ID: 32
Slot Number: 6
Drive's position: DiskGroup: 0, Span: 3, Arm: 0
Enclosure position: 1
Device Id: 6
WWN: 5000C500710D4C20
Sequence Number: 2
Media Error Count: 777
Other Error Count: 2313
Predictive Failure Count: 49
Last Predictive Failure Event Seq Number: 6028
PD Type: SAS

Raw Size: 1.090 TB [0x8bba0cb0 Sectors]
Non Coerced Size: 1.090 TB [0x8baa0cb0 Sectors]
Coerced Size: 1.090 TB [0x8ba80000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: IS04
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c500710d4c21
SAS Address(1): 0x0
Connected Port Number: 0(path0) 
Inquiry Data: SEAGATE ST1200MM0007    IS04S3L03SAK            
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive Temperature :31C (87.80 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : Yes
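
The listing above can be checked mechanically. The following is a minimal sketch, assuming the `Key: Value` layout shown in this task; the function names `parse_drive` and `is_suspect` are hypothetical helpers, not part of MegaCli, and the choice of which fields to flag is an assumption.

```python
# Counter names as they appear in the physical-drive listing in this task.
COUNTERS = ("Media Error Count", "Other Error Count", "Predictive Failure Count")
SMART_KEY = "Drive has flagged a S.M.A.R.T alert"


def parse_drive(text):
    """Return a dict of 'Key: Value' fields from one physical-drive block."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values that contain colons
        # (for example "DiskGroup: 0, Span: 3, Arm: 0") stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def is_suspect(fields):
    """Flag a drive with any nonzero error counter or a S.M.A.R.T alert."""
    nonzero = any(int(fields.get(name, "0")) > 0 for name in COUNTERS)
    smart = fields.get(SMART_KEY, "No") == "Yes"
    return nonzero or smart


# Values copied from the description of this task.
sample = """\
Slot Number: 6
Media Error Count: 777
Other Error Count: 2313
Predictive Failure Count: 49
Firmware state: Online, Spun Up
Drive has flagged a S.M.A.R.T alert : Yes
"""

drive = parse_drive(sample)
print(drive["Slot Number"], is_suspect(drive))  # 6 True
```

Note that the firmware state is still "Online, Spun Up" here, so the controller state alone would not have flagged this drive.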

Event Timeline

Mentioned in SAL [2016-07-14T08:18:51Z] <jynus> running "megacli -PDOffline -PhysDrv '[32:6]' -aALL" on dbstore1002 to debug issue T140337

It is not the disk; I am going to rebuild it into the RAID.

During the rebuild I am getting more and more media/other/predictive errors. I think the drive should still be replaced, but @Cmjohnson has the last word on this. I will deal with the lag in a separate ticket.

jcrespo renamed this task from "dbstore1002 disk failure causing lag" to "dbstore1002 disk errors". Jul 14 2016, 9:01 AM

I think I killed the drive for good:

Rebuild Progress on Device at Enclosure 32, Slot 6 Completed 0% in 38 Minutes.

Media Error Count: 777
Other Error Count: 2313
Predictive Failure Count: 50
Last Predictive Failure Event Seq Number: 6188
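
To see which counters actually moved between the description and this snapshot, the two sets of values can be diffed. This is a small sketch using only the numbers posted in this task; `counter_deltas` is a hypothetical helper name.

```python
def counter_deltas(before, after):
    """Return only the counters whose value changed, with the difference."""
    return {
        name: after[name] - before[name]
        for name in before
        if after[name] != before[name]
    }


# Snapshot from the task description.
before = {
    "Media Error Count": 777,
    "Other Error Count": 2313,
    "Predictive Failure Count": 49,
    "Last Predictive Failure Event Seq Number": 6028,
}

# Snapshot taken during the rebuild attempt.
after = {
    "Media Error Count": 777,
    "Other Error Count": 2313,
    "Predictive Failure Count": 50,
    "Last Predictive Failure Event Seq Number": 6188,
}

print(counter_deltas(before, after))
# {'Predictive Failure Count': 1, 'Last Predictive Failure Event Seq Number': 160}
```

Between these two snapshots only the predictive failure fields changed; the media and other error counts are identical.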

I submitted a work order for a new disk.

Congratulations: Work Order SR932776781 was successfully submitted.

elukey triaged this task as High priority. Jul 19 2016, 12:42 PM

Replaced the disk; it is rebuilding.

Enclosure Device ID: 32
Slot Number: 6
Drive's position: DiskGroup: 0, Span: 3, Arm: 0
Enclosure position: 1
Device Id: 6
WWN: 5000C50096BF9FDC
Sequence Number: 16
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 1.090 TB [0x8bba0cb0 Sectors]
Non Coerced Size: 1.090 TB [0x8baa0cb0 Sectors]
Coerced Size: 1.090 TB [0x8ba80000 Sectors]
Sector Size: 0
Firmware state: Rebuild
Device Firmware Level: TS

Return shipping information

USPS: 9202 3946 5301 2421 2675 30
FEDEX: 9611918 2393026 70517551

Rebuild is extremely slow...

Rebuild Progress on Device at Enclosure 32, Slot 6 Completed 45% in 391 Minutes.
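
A rough completion estimate can be extrapolated from a progress line like the one above. This sketch assumes the rebuild proceeds at a constant rate, which may not hold under changing I/O load; `estimate_rebuild` is a hypothetical helper, and the pattern only matches the wording seen in this task.

```python
import re

# Matches the "Completed N% in M Minutes" part of the progress line.
PROGRESS_RE = re.compile(r"Completed (\d+)% in (\d+) Minutes")


def estimate_rebuild(line):
    """Return (estimated total minutes, estimated remaining minutes).

    Returns None when progress is 0%, since there is nothing to extrapolate.
    """
    match = PROGRESS_RE.search(line)
    if match is None:
        raise ValueError("not a rebuild progress line")
    percent = int(match.group(1))
    elapsed = int(match.group(2))
    if percent == 0:
        return None
    total = elapsed * 100 / percent  # linear extrapolation (assumption)
    return total, total - elapsed


line = "Rebuild Progress on Device at Enclosure 32, Slot 6 Completed 45% in 391 Minutes."
total, remaining = estimate_rebuild(line)
print(round(total), round(remaining))  # 869 478
```

At this rate the rebuild would need roughly 869 minutes in total, so about 478 more. The earlier "Completed 0% in 38 Minutes" line from the old disk yields no estimate at all.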

I had that very same problem with the old disk, but I assumed it was because it had failed. :-( Let me see if I see anything else bad.

megacli -PDRbld -ShowProg -PhysDrv '[32:6]' -a0

Rebuild Progress on Device at Enclosure 32, Slot 6 Completed 98% in 908 Minutes.

It finished, no media errors.