
Degraded RAID on db1046
Closed, Resolved, Public

Description

TASK AUTO-GENERATED by Nagios/Icinga RAID event handler

A degraded RAID was detected on host db1046. An automatic snapshot of the current RAID status is attached below.

Please sync with the service owner to find the appropriate time window before actually replacing any failed hardware.

=== RaidStatus (does not include components in optimal state)
name: Adapter #0

	Virtual Drive: 0 (Target Id: 0)
	RAID Level: Primary-1, Secondary-0, RAID Level Qualifier-0
	State: =====> Degraded <=====
	Number Of Drives per span: 2
	Number of Spans: 6
	Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU

		Span: 0 - Number of PDs: 2

			PD: 0 Information
			Enclosure Device ID: 32
			Slot Number: 0
			Drive's position: DiskGroup: 0, Span: 0, Arm: 0
			Media Error Count: =====> 149 <=====
			Other Error Count: =====> 20 <=====
			Predictive Failure Count: =====> 138 <=====
			Last Predictive Failure Event Seq Number: =====> 38927 <=====

				Raw Size: 279.396 GB [0x22ecb25c Sectors]
				Firmware state: =====> Failed <=====
				Media Type: Hard Disk Device
				Drive Temperature: 44C (111.20 F)

		Span: 1 - Number of PDs: 2

			PD: 1 Information
			Enclosure Device ID: 32
			Slot Number: 3
			Drive's position: DiskGroup: 0, Span: 1, Arm: 1
			Media Error Count: =====> 37 <=====
			Other Error Count: 0
			Predictive Failure Count: =====> 180 <=====
			Last Predictive Failure Event Seq Number: =====> 38928 <=====

				Raw Size: 279.396 GB [0x22ecb25c Sectors]
				Firmware state: Online, Spun Up
				Media Type: Hard Disk Device
				Drive Temperature: 45C (113.00 F)

		Span: 4 - Number of PDs: 2

			PD: 1 Information
			Enclosure Device ID: 32
			Slot Number: 9
			Drive's position: DiskGroup: 0, Span: 4, Arm: 1
			Media Error Count: 0
			Other Error Count: =====> 1 <=====
			Predictive Failure Count: 0
			Last Predictive Failure Event Seq Number: 0

				Raw Size: 279.396 GB [0x22ecb25c Sectors]
				Firmware state: Online, Spun Up
				Media Type: Hard Disk Device
				Drive Temperature: 39C (102.20 F)

		Span: 5 - Number of PDs: 2

			PD: 1 Information
			Enclosure Device ID: 32
			Slot Number: 11
			Drive's position: DiskGroup: 0, Span: 5, Arm: 1
			Media Error Count: =====> 4 <=====
			Other Error Count: 0
			Predictive Failure Count: 0
			Last Predictive Failure Event Seq Number: 0

				Raw Size: 279.396 GB [0x22ecb25c Sectors]
				Firmware state: Online, Spun Up
				Media Type: Hard Disk Device
				Drive Temperature: 38C (100.40 F)

=== RaidStatus completed
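
For readers who want to triage a snapshot like the one above programmatically, the following is a minimal sketch, not part of the Icinga handler. It assumes the snapshot text is available as a string, that each drive block starts with a `PD: N Information` line, and that highlighted values are wrapped in `=====> ... <=====` as shown. The helper names (`parse_drives`, `needs_attention`) and the rule "failed firmware state or non-zero predictive failure count" are illustrative assumptions.

```python
import re
from typing import Dict, List

# Values the handler highlights look like "=====> 149 <=====".
MARKER = re.compile(r"=====>\s*(.*?)\s*<=====")
PD_HEADER = re.compile(r"PD: \d+ Information")


def clean(value: str) -> str:
    """Strip the highlight arrows, if present, and surrounding whitespace."""
    match = MARKER.search(value)
    return match.group(1) if match else value.strip()


def parse_drives(snapshot: str) -> List[Dict[str, str]]:
    """Return one dict of 'key: value' fields per physical drive block."""
    drives: List[Dict[str, str]] = []
    current = None
    for raw in snapshot.splitlines():
        line = raw.strip()
        if PD_HEADER.match(line):
            current = {}
            drives.append(current)
            continue
        if line.startswith("Span:"):
            # A span header ends the previous drive block.
            current = None
            continue
        if current is None or ":" not in line:
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = clean(value)
    return drives


def needs_attention(drive: Dict[str, str]) -> bool:
    """Assumed rule: firmware state Failed, or any predictive failures."""
    failed = drive.get("Firmware state", "").startswith("Failed")
    predictive = int(drive.get("Predictive Failure Count", "0")) > 0
    return failed or predictive


def flagged_slots(snapshot: str) -> List[str]:
    """Slot numbers of drives matching the assumed rule, in snapshot order."""
    return [d.get("Slot Number", "?") for d in parse_drives(snapshot) if needs_attention(d)]
```

Under these assumptions, the snapshot above would flag slot 0 (firmware state Failed) and slot 3 (still online, but with a predictive failure count of 180). Slots 9 and 11 report only media or other errors and would not be flagged by this rule.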

Event Timeline

elukey triaged this task as High priority. Oct 20 2016, 1:29 PM
Cmjohnson claimed this task.
Cmjohnson subscribed.

The disk was replaced yesterday. All systems go.

root@db1046:~# megacli -PDList -aALL |grep "Firmware state"
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
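
The manual check above (twelve `Online, Spun Up` lines, matching 6 spans of 2 drives) could be automated along these lines. This is a minimal sketch that works on already captured output of `megacli -PDList -aALL` passed in as a string; the function names and the `expected_drives` parameter are illustrative assumptions, and the code does not invoke `megacli` itself.

```python
from collections import Counter


def firmware_state_counts(pdlist_output: str) -> Counter:
    """Count drives per firmware state in captured PDList text."""
    counts: Counter = Counter()
    for line in pdlist_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Firmware state":
            counts[value.strip()] += 1
    return counts


def all_online(pdlist_output: str, expected_drives: int) -> bool:
    """True only if every expected drive reports 'Online, Spun Up'."""
    counts = firmware_state_counts(pdlist_output)
    return (
        sum(counts.values()) == expected_drives
        and set(counts) == {"Online, Spun Up"}
    )
```

For this host, `all_online(output, expected_drives=12)` would express the same check as counting the lines by eye. Passing the expected drive count explicitly also catches a drive that has dropped out of the listing entirely.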