
Degraded RAID on db1067
Closed, Resolved, Public

Description

TASK AUTO-GENERATED by Nagios/Icinga RAID event handler

A degraded RAID (megacli) was detected on host db1067. An automatic snapshot of the current RAID status is attached below.

Please sync with the service owner to find the appropriate time window before actually replacing any failed hardware.

CRITICAL: 1 failed LD(s) (Degraded)

$ sudo /usr/local/lib/nagios/plugins/get_raid_status_megacli
=== RaidStatus (does not include components in optimal state)
name: Adapter #0

	Virtual Drive: 0 (Target Id: 0)
	RAID Level: Primary-1, Secondary-0, RAID Level Qualifier-0
	State: =====> Degraded <=====
	Number Of Drives per span: 6
	Number of Spans: 2
	Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU

		Span: 1 - Number of PDs: 6

			PD: 1 Information
			Enclosure Device ID: 32
			Slot Number: 7
			Drive's position: DiskGroup: 0, Span: 1, Arm: 1
			Media Error Count: 913
			Other Error Count: 18
			Predictive Failure Count: 0
			Last Predictive Failure Event Seq Number: 0

				Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
				Firmware state: =====> Failed <=====
				Media Type: Hard Disk Device
				Drive Temperature: 34C (93.20 F)

=== RaidStatus completed
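For anyone who wants to pull the failed drive's slot out of a snapshot like the one above without reading it by eye, here is a minimal sketch. It only assumes the `key: value` layout shown in this task; `find_failed_drives` and `SAMPLE` are hypothetical names, not part of the Nagios plugin.

```python
import re

# Excerpt of the snapshot from this task, used as sample input.
SAMPLE = """\
=== RaidStatus (does not include components in optimal state)
name: Adapter #0
    Virtual Drive: 0 (Target Id: 0)
    State: =====> Degraded <=====
            PD: 1 Information
            Enclosure Device ID: 32
            Slot Number: 7
            Drive's position: DiskGroup: 0, Span: 1, Arm: 1
            Media Error Count: 913
                Firmware state: =====> Failed <=====
=== RaidStatus completed
"""


def find_failed_drives(report):
    """Return one dict per 'PD: N Information' block whose firmware state is Failed."""
    drives = []
    current = None
    for raw in report.splitlines():
        line = raw.strip()
        if re.match(r"PD: \d+ Information$", line):
            # A new physical-drive block starts here.
            current = {}
            drives.append(current)
            continue
        if current is None or ":" not in line:
            continue
        # Split on the first colon only, so values such as
        # "DiskGroup: 0, Span: 1, Arm: 1" stay intact.
        key, _, value = line.partition(":")
        # Drop the "=====> ... <=====" highlighting added to non-optimal states.
        value = value.replace("=====>", "").replace("<=====", "").strip()
        current[key.strip()] = value
    return [d for d in drives if "Failed" in d.get("Firmware state", "")]


failed = find_failed_drives(SAMPLE)
for drive in failed:
    print(drive["Enclosure Device ID"], drive["Slot Number"], drive["Firmware state"])
```

The enclosure ID and slot number are the two values needed to identify the physical disk to pull.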

Event Timeline

Marostegui triaged this task as High priority.
Marostegui added a project: DBA.
Marostegui added a subscriber: Cmjohnson.

@Cmjohnson can we replace this as soon as possible? This is the enwiki primary master.

Failed:

PD: 1 Information
Enclosure Device ID: 32
Slot Number: 7
Drive's position: DiskGroup: 0, Span: 1, Arm: 1
Enclosure position: 1
Device Id: 7
WWN: 5000C50070CACB6C
Sequence Number: 22
Media Error Count: 1
Other Error Count: 3
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Sector Size:  0
Firmware state: Failed
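As a sanity check on the size fields above, the hex sector counts can be converted back to gigabytes. This is only a sketch under two assumptions that the output does not state for this disk: 512-byte sectors (the failed disk reports `Sector Size:  0`, so 512 is borrowed from the virtual drive) and "GB" meaning 2^30 bytes. `sectors_to_gib` is a hypothetical helper name.

```python
SECTOR_BYTES = 512  # assumption: the failed disk itself reports a sector size of 0


def sectors_to_gib(hex_sectors, sector_bytes=SECTOR_BYTES):
    """Convert a hex sector count, as printed by megacli, to units of 2**30 bytes."""
    return int(hex_sectors, 16) * sector_bytes / 2**30


raw_gib = sectors_to_gib("0x45dd2fb0")          # "Raw Size" field
non_coerced_gib = sectors_to_gib("0x45cd2fb0")  # "Non Coerced Size" field
coerced_gib = sectors_to_gib("0x45cc0000")      # "Coerced Size" field

print(raw_gib, non_coerced_gib, coerced_gib)
```

Under those assumptions the coerced size comes out at exactly 558.375, and the other two agree with the reported 558.911 and 558.411 to within the last printed digit.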

Can you pull the disk out, wait a minute, and then put it back in again?

Reseated the disk... let's see what happens.

After replacing the disk three times yesterday evening, we finally got this fixed!
Thanks a lot Chris!

Number of Virtual Disks: 1
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 3.271 TB
Sector Size         : 512
Mirror Data         : 3.271 TB
State               : Optimal
Strip Size          : 256 KB
Number Of Drives per span:6
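The reported virtual disk size can be cross-checked against the layout above. A rough sketch, assuming all 12 drives share the 558.375 GB coerced size reported earlier for the failed disk, that mirroring (Primary-1) halves the usable capacity, and that megacli's "GB" and "TB" are binary units; `mirrored_usable_tib` is a hypothetical helper name.

```python
def mirrored_usable_tib(spans, drives_per_span, coerced_gib):
    """Usable capacity of a mirrored, spanned array, in units of 2**40 bytes."""
    total_gib = spans * drives_per_span * coerced_gib
    usable_gib = total_gib / 2  # assumption: every drive is mirrored once
    return usable_gib / 1024


# 2 spans of 6 drives each, as shown in the output above.
usable = mirrored_usable_tib(spans=2, drives_per_span=6, coerced_gib=558.375)
print(usable)
```

Under those assumptions the result is about 3.2717, in line with the `3.271 TB` shown for both `Size` and `Mirror Data`.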