Degraded RAID on db1049
Closed, Resolved, Public

Description

TASK AUTO-GENERATED by Nagios/Icinga RAID event handler

A degraded RAID (megacli) was detected on host db1049. An automatic snapshot of the current RAID status is attached below.

Please sync with the service owner to find the appropriate time window before actually replacing any failed hardware.

=== RaidStatus (does not include components in optimal state)
name: Adapter #0

	Virtual Drive: 0 (Target Id: 0)
	RAID Level: Primary-1, Secondary-0, RAID Level Qualifier-0
	State: =====> Degraded <=====
	Number Of Drives per span: 2
	Number of Spans: 6
	Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU

		Span: 2 - Number of PDs: 2

			PD: 0 Information
			Enclosure Device ID: 32
			Slot Number: 4
			Drive's position: DiskGroup: 0, Span: 2, Arm: 0
			Media Error Count: 50851
			Other Error Count: 11
			Predictive Failure Count: =====> 382 <=====
			Last Predictive Failure Event Seq Number: 3912

				Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
				Firmware state: =====> Failed <=====
				Media Type: Hard Disk Device
				Drive Temperature: 43C (109.40 F)

=== RaidStatus completed
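
For anyone post-processing these auto-generated snapshots, the highlighted fields can be pulled out mechanically. The following is a minimal sketch, not part of the event handler itself: `flagged_fields` and `SNAPSHOT` are hypothetical names, and the only assumption about the format is the `=====> value <=====` emphasis visible in the snapshot above.

```python
import re

# Matches lines such as "State: =====> Degraded <=====".
# The arrow markers are taken from the snapshot above; everything else
# in this block is illustrative.
FLAG_RE = re.compile(
    r"^\s*(?P<key>[^:=]+?):?\s*=====>\s*(?P<value>.*?)\s*<=====\s*$"
)


def flagged_fields(snapshot):
    """Return (field, value) pairs for every highlighted line."""
    found = []
    for line in snapshot.splitlines():
        match = FLAG_RE.match(line)
        if match:
            found.append((match.group("key").strip(), match.group("value")))
    return found


# A shortened copy of the snapshot, re-indented with spaces.
SNAPSHOT = """\
    State: =====> Degraded <=====
    Number Of Drives per span: 2
            Slot Number: 4
            Predictive Failure Count: =====> 382 <=====
                Firmware state: =====> Failed <=====
"""

if __name__ == "__main__":
    for field, value in flagged_fields(SNAPSHOT):
        print(f"{field}: {value}")
```

Lines without the arrow markers (for example `Slot Number: 4`) are ignored, so the result lists only what the handler considered abnormal.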

Event Timeline

Marostegui triaged this task as High priority.

This is correct; that disk is broken:

Enclosure Device ID: 32
Slot Number: 4
Drive's position: DiskGroup: 0, Span: 2, Arm: 0
Enclosure position: N/A
Device Id: 4
WWN: 5000C500479430C0
Sequence Number: 10
Media Error Count: 50851
Other Error Count: 11
Predictive Failure Count: 382
Last Predictive Failure Event Seq Number: 3912
PD Type: SAS

Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Sector Size:  0
Firmware state: Failed
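
As a sanity check on the size fields above, the hexadecimal sector counts can be converted back into the reported sizes. This is a rough sketch under one stated assumption: the physical drive output prints `Sector Size:  0`, so a 512-byte sector is assumed here, and "GB" is read as 2^30 bytes. `sectors_to_gib` is a hypothetical helper name.

```python
# Assumption: 512-byte sectors. The PD listing above reports
# "Sector Size: 0", so this value is not confirmed by that output.
SECTOR_BYTES = 512


def sectors_to_gib(hex_sectors, sector_bytes=SECTOR_BYTES):
    """Convert a hex sector count such as '0x45cc0000' to GiB (2**30 bytes)."""
    return int(hex_sectors, 16) * sector_bytes / 2**30


# Sector counts copied from the drive listing above.
SIZES = [
    ("Raw Size", "0x45dd2fb0"),
    ("Non Coerced Size", "0x45cd2fb0"),
    ("Coerced Size", "0x45cc0000"),
]

if __name__ == "__main__":
    for label, hex_sectors in SIZES:
        print(f"{label}: {sectors_to_gib(hex_sectors):.3f}")
```

Under that assumption the coerced size comes out at exactly 558.375, and the raw and non-coerced figures land within 0.002 of the values megacli prints.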

And the RAID is degraded:

root@db1049:~# megacli -LDInfo -L0 -a0


Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 3.271 TB
Sector Size         : 512
Mirror Data         : 3.271 TB
State               : Degraded
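
The 3.271 TB figure can be cross-checked against the array layout. The sketch below assumes the array is RAID 10 built from the 6 spans of 2 mirrored drives reported in the snapshot, that each span contributes one drive's coerced size (0x45cc0000 sectors), that sectors are 512 bytes as the virtual drive reports, and that "TB" means 2^40 bytes. `raid10_usable_bytes` is a hypothetical helper name.

```python
SPANS = 6                      # "Number of Spans: 6" from the snapshot
COERCED_SECTORS = 0x45CC0000   # coerced size of one member drive
SECTOR_BYTES = 512             # "Sector Size: 512" of the virtual drive


def raid10_usable_bytes(spans, member_sectors, sector_bytes):
    """Usable capacity when each span is a mirror the size of one member."""
    return spans * member_sectors * sector_bytes


USABLE_TIB = raid10_usable_bytes(SPANS, COERCED_SECTORS, SECTOR_BYTES) / 2**40

if __name__ == "__main__":
    print(f"Usable capacity: {USABLE_TIB:.3f}")
```

With these inputs the result agrees with the reported size to within 0.001, which supports reading the layout as six mirrored pairs striped together.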

@Cmjohnson, this host is a master. Can we get the disk replaced as soon as possible?

The disk has been swapped and is rebuilding:

Enclosure Device ID: 32
Slot Number: 4
Drive's position: DiskGroup: 0, Span: 2, Arm: 0
Enclosure position: N/A
Device Id: 4
WWN: 5000C5005E8529B0
Sequence Number: 18
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Sector Size: 0
Firmware state: Rebuild
Device Firmware Level: 0008

Thanks Chris, I will keep an eye on it and close the ticket once it is finished!

All good now!
Thanks!

root@db1049:~# megacli -PDRbld -ShowProg -PhysDrv [32:4] -aALL

Device(Encl-32 Slot-4) is not in rebuild process

Exit Code: 0x00
root@db1049:~# megacli -LDInfo -L0 -a0


Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 3.271 TB
Sector Size         : 512
Mirror Data         : 3.271 TB
State               : Optimal
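
The final check above (confirming `State` is back to `Optimal`) could be scripted along these lines. This is only a sketch that parses text shaped like the `-LDInfo` output pasted above; `parse_ld_info` and `LD_INFO` are hypothetical names, and nothing here calls megacli itself.

```python
def parse_ld_info(output):
    """Split 'Key : Value' lines into a dict, using the first colon only."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip blank lines and anything without a colon
            fields[key.strip()] = value.strip()
    return fields


# Copied from the output above.
LD_INFO = """\
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 3.271 TB
Sector Size         : 512
Mirror Data         : 3.271 TB
State               : Optimal
"""

if __name__ == "__main__":
    info = parse_ld_info(LD_INFO)
    if info.get("State") == "Optimal":
        print("Virtual drive is healthy")
    else:
        print(f"Virtual drive needs attention: {info.get('State')}")
```

Splitting on the first colon keeps values such as `0 (Target Id: 0)` intact, at the cost of treating the banner line as a key with an empty value.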