Predictive failures on disk S.M.A.R.T. status
Open, Low, Public

Description

We have a number of predictive failures that should be taken care of; however, it is not worth replacing those disks until they actually fail.
I will keep this list updated.

  • db2049 s2
  • db2050 s3
  • db2051 s4
  • db2053 s6
  • db2061 s7
  • db2044 m2
  • db2047 s7 T212966
  • db1073 m5 T215050
  • db1065 m2
  • db1063 m1 T211537
Banyek created this task. Oct 30 2018, 3:02 PM
Restricted Application added a subscriber: Aklapper. Oct 30 2018, 3:02 PM
Banyek triaged this task as Low priority. Oct 30 2018, 3:02 PM
jcrespo moved this task from Triage to Backlog on the DBA board. Oct 30 2018, 3:02 PM
Banyek moved this task from Backlog to In progress on the DBA board. Oct 30 2018, 3:17 PM
Banyek updated the task description. Nov 6 2018, 9:23 AM
Banyek added a subscriber: Papaul. Nov 7 2018, 3:54 PM

Once the disk has failed, we will get an automatic ticket for getting that disk replaced. I don't think we need this tracking task.

Already caught up with Jaime about why this ticket exists. All good here.

db2044 came up with predictive failure today:

root@db2044:~# hpssacli controller all show config

Smart Array P420i in Slot 0 (Embedded)    (sn: 0014380264FFFB0)


   Port Name: 1I

   Port Name: 2I

   Gen8 ServBP 12+2 at Port 1I, Box 1, OK
   array A (SAS, Unused Space: 0  MB)


      logicaldrive 1 (3.3 TB, RAID 1+0, OK)

      physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 600 GB, OK)
      physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 600 GB, OK)
      physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SAS, 600 GB, Predictive Failure)
      physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SAS, 600 GB, OK)
      physicaldrive 1I:1:5 (port 1I:box 1:bay 5, SAS, 600 GB, OK)
      physicaldrive 1I:1:6 (port 1I:box 1:bay 6, SAS, 600 GB, OK)
      physicaldrive 1I:1:7 (port 1I:box 1:bay 7, SAS, 600 GB, OK)
      physicaldrive 1I:1:8 (port 1I:box 1:bay 8, SAS, 600 GB, OK)
      physicaldrive 1I:1:9 (port 1I:box 1:bay 9, SAS, 600 GB, OK)
      physicaldrive 1I:1:10 (port 1I:box 1:bay 10, SAS, 600 GB, OK)
      physicaldrive 1I:1:11 (port 1I:box 1:bay 11, SAS, 600 GB, OK)
      physicaldrive 1I:1:12 (port 1I:box 1:bay 12, SAS, 600 GB, OK)
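
Spotting the one flagged drive in a listing like the one above is easy to get wrong by eye. The following is a minimal Python sketch, assuming the output keeps the `physicaldrive <id> (..., <status>)` layout shown here; the function name `non_ok_drives` and the `SAMPLE` text are illustrative and not part of any existing tooling.

```python
import re

# Matches lines such as:
#   physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SAS, 600 GB, Predictive Failure)
PHYSICAL_DRIVE = re.compile(r"^\s*physicaldrive\s+(\S+)\s+\((.*)\)\s*$")


def non_ok_drives(config_text):
    """Return (drive_id, status) for every physical drive whose status is not OK."""
    flagged = []
    for line in config_text.splitlines():
        match = PHYSICAL_DRIVE.match(line)
        if not match:
            continue  # controller, array and logicaldrive lines are ignored
        drive_id, details = match.groups()
        # The status is the last comma-separated field inside the parentheses.
        status = details.rsplit(",", 1)[-1].strip()
        if status != "OK":
            flagged.append((drive_id, status))
    return flagged


# Hypothetical sample, copied from the layout of the paste above.
SAMPLE = """\
      logicaldrive 1 (3.3 TB, RAID 1+0, OK)
      physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 600 GB, OK)
      physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SAS, 600 GB, Predictive Failure)
      physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SAS, 600 GB, OK)
"""

if __name__ == "__main__":
    for drive_id, status in non_ok_drives(SAMPLE):
        print(f"{drive_id}: {status}")
```

Fed the saved output of the command above, this would list only the drives that need attention, here `1I:1:3` with status `Predictive Failure`.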
Marostegui updated the task description. Nov 21 2018, 7:25 AM
Marostegui updated the task description. Nov 21 2018, 7:37 AM
Marostegui updated the task description.

db2044 got its disk replaced but came up with a predictive failure (T210049#4767169).

Banyek updated the task description. Dec 3 2018, 9:16 AM

db1063

name: Adapter #0

	Virtual Drive: 0 (Target Id: 0)
	RAID Level: Primary-1, Secondary-0, RAID Level Qualifier-0
	State: Optimal
	Number Of Drives per span: 2
	Number of Spans: 6
	Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU

		Span: 3 - Number of PDs: 2

			PD: 1 Information
			Enclosure Device ID: 32
			Slot Number: 7
			Drive's position: DiskGroup: 0, Span: 3, Arm: 1
			Media Error Count: 2
			Other Error Count: 0
			Predictive Failure Count: =====> 1 <=====
			Last Predictive Failure Event Seq Number: 2776

				Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
				Firmware state: Online, Spun Up
				Media Type: Hard Disk Device
				Drive Temperature: 34C (93.20 F)
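
A report in the format above can be scanned for non-zero predictive failure counts per slot. This is a minimal Python sketch, assuming each drive section lists `Slot Number:` before `Predictive Failure Count:` as in the paste; the function name and the sample text are hypothetical, and the hand-added `=====>` arrows are tolerated rather than expected.

```python
import re

SLOT = re.compile(r"^\s*Slot Number:\s*(\d+)")
# \D* skips hand-added emphasis such as "=====> " before the number.
PREDICTIVE = re.compile(r"^\s*Predictive Failure Count:\D*(\d+)")


def slots_with_predictive_failures(report_text):
    """Map slot number -> predictive failure count, for counts above zero."""
    flagged = {}
    current_slot = None
    for line in report_text.splitlines():
        slot_match = SLOT.match(line)
        if slot_match:
            current_slot = int(slot_match.group(1))
            continue
        count_match = PREDICTIVE.match(line)
        if count_match and current_slot is not None:
            count = int(count_match.group(1))
            if count > 0:
                flagged[current_slot] = count
    return flagged


# Hypothetical sample following the layout of the paste above.
SAMPLE = """\
Slot Number: 6
Predictive Failure Count: 0
Slot Number: 7
Media Error Count: 2
Predictive Failure Count: =====> 1 <=====
Last Predictive Failure Event Seq Number: 2776
"""

if __name__ == "__main__":
    for slot, count in slots_with_predictive_failures(SAMPLE).items():
        print(f"slot {slot}: predictive failure count {count}")
```

The `Last Predictive Failure Event Seq Number` line is deliberately not matched, so the sequence number is never mistaken for a count.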
Marostegui updated the task description. Dec 10 2018, 6:35 AM
Marostegui updated the task description. Jan 1 2019, 12:49 PM
Marostegui updated the task description. Jan 4 2019, 7:44 PM
Marostegui updated the task description. Jan 8 2019, 2:30 PM
Marostegui updated the task description. Mon, Jan 21, 4:54 PM
Marostegui updated the task description.
Marostegui updated the task description. Fri, Feb 8, 6:11 AM
Marostegui updated the task description. Tue, Feb 12, 6:40 AM