
db1021 degraded RAID
Closed, Resolved · Public

Description

                Device Present
                ================
Virtual Drives    : 1 
  Degraded        : 1 
  Offline         : 0 
Physical Devices  : 14 
  Disks           : 12 
  Critical Disks  : 2 
  Failed Disks    : 1
Enclosure Device ID: 32
Slot Number: 8
Drive's position: DiskGroup: 0, Span: 4, Arm: 0
Enclosure position: N/A
Device Id: 8
WWN: 5000C50032410470
Sequence Number: 3
Media Error Count: 32
Other Error Count: 10
Predictive Failure Count: 13
Last Predictive Failure Event Seq Number: 38697
PD Type: SAS

Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.875 GB [0x22dc0000 Sectors]
Sector Size:  0
Firmware state: Failed
Device Firmware Level: ES64
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c50032410471
SAS Address(1): 0x0
Connected Port Number: 0(path0) 
Inquiry Data: SEAGATE ST3300657SS     ES646SJ0FREV            
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive Temperature :36C (96.80 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : Yes

Hopefully there are some spares left.

Event Timeline

jcrespo raised the priority of this task to Needs Triage.
jcrespo updated the task description.
jcrespo added a project: ops-eqiad.
jcrespo subscribed.
Restricted Application added subscribers: StudiesWorld, Aklapper.

There is also the previous disk with a SMART alert:

Enclosure Device ID: 32
Slot Number: 7
Drive's position: DiskGroup: 0, Span: 3, Arm: 1
Enclosure position: N/A
Device Id: 7
WWN: 5000C5003240F2AC
Sequence Number: 2
Media Error Count: 2
Other Error Count: 0
Predictive Failure Count: 9
Last Predictive Failure Event Seq Number: 38635
PD Type: SAS

Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.875 GB [0x22dc0000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: ES64
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c5003240f2ad
SAS Address(1): 0x0
Connected Port Number: 0(path0) 
Inquiry Data: SEAGATE ST3300657SS     ES646SJ0GGB3            
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive Temperature :35C (95.00 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : Yes

This is fixed, but shortly afterwards it crashed with corrupted InnoDB pages. Related?

It is a possibility... I used a "used disk". Do you want me to replace it with another?

Not really; I reimaged. I would understand the disk breaking, but the controller should have handled it, not caused corruption at the application level.

Let me check the RAID status, to see if there is any media error, and I would close the ticket otherwise.

These servers will be the first to be decommissioned once the new ones arrive.

Could you "replace" these 2 disks, which are marked as critical, and throw away the "bad" ones? It is OK to use two old disks for this (the server will be decommissioned soon and its replacement is on its way), but I need to check that the data there is consistent:

Enclosure Device ID: 32
Slot Number: 7
Drive's position: DiskGroup: 0, Span: 3, Arm: 1
Enclosure position: N/A
Device Id: 7
WWN: 5000C5003240F2AC
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 7
Last Predictive Failure Event Seq Number: 45526
PD Type: SAS

Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.875 GB [0x22dc0000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: ES64
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c5003240f2ad
SAS Address(1): 0x0
Connected Port Number: 0(path0) 
Inquiry Data: SEAGATE ST3300657SS     ES646SJ0GGB3            
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive Temperature :34C (93.20 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : Yes



Enclosure Device ID: 32
Slot Number: 8
Drive's position: DiskGroup: 0, Span: 4, Arm: 0
Enclosure position: N/A
Device Id: 8
WWN: 5000C50028EA079C
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 7
Last Predictive Failure Event Seq Number: 45527
PD Type: SAS

Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.875 GB [0x22dc0000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: ES64
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c50028ea079d
SAS Address(1): 0x0
Connected Port Number: 0(path0) 
Inquiry Data: SEAGATE ST3300657SS     ES643SJ3EVDY            
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive Temperature :33C (91.40 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : Yes

They do not show media errors, but they do show SMART alerts.
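
For reference, this check can be scripted instead of read by eye. The following is only a minimal sketch: it assumes the physical-drive listing has been saved as plain text in the same "Key: value" layout as the dumps above, and the function names (`parse_pd_list`, `needs_attention`) are made up for illustration and are not part of any existing tool. The sample text is an abbreviated copy of the two drives pasted above.

```python
# Abbreviated copy of the two drive records from this task (slots 7 and 8).
SAMPLE = """\
Enclosure Device ID: 32
Slot Number: 7
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 7
Firmware state: Online, Spun Up
Drive has flagged a S.M.A.R.T alert : Yes

Enclosure Device ID: 32
Slot Number: 8
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 7
Firmware state: Online, Spun Up
Drive has flagged a S.M.A.R.T alert : Yes
"""


def parse_pd_list(text):
    """Split a physical-drive listing into one dict per drive.

    Assumption: every drive record starts with an "Enclosure Device ID" line,
    as in the output pasted in this task.
    """
    drives = []
    current = None
    for line in text.splitlines():
        if ":" not in line:
            continue  # blank lines and separators carry no key/value pair
        key, _, value = line.partition(":")  # split on the first colon only
        key, value = key.strip(), value.strip()
        if key == "Enclosure Device ID":
            current = {}
            drives.append(current)
        if current is not None:
            current[key] = value
    return drives


def needs_attention(drive):
    """Return True if a drive record shows any of the warning signs."""
    smart = drive.get("Drive has flagged a S.M.A.R.T alert", "No") == "Yes"
    predictive = int(drive.get("Predictive Failure Count", "0")) > 0
    media = int(drive.get("Media Error Count", "0")) > 0
    failed = drive.get("Firmware state", "").startswith("Failed")
    return smart or predictive or media or failed


for drive in parse_pd_list(SAMPLE):
    if needs_attention(drive):
        print(
            "slot {}: media_errors={} predictive_failures={} smart_alert={}".format(
                drive.get("Slot Number"),
                drive.get("Media Error Count"),
                drive.get("Predictive Failure Count"),
                drive.get("Drive has flagged a S.M.A.R.T alert"),
            )
        )
```

Treating any non-zero predictive failure count as a warning is a deliberately cautious choice for this sketch; a real check might use different thresholds.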

No hurry, at least not until the replacement arrives.

Replaced disk 7 and it's back online.
Disk 8 is rebuilding now.

Both disks replaced and back to normal.