ms-be2018 sdc unreadable sector
Closed, Resolved · Public

Description

Got this on sdc on ms-be2018

[1253680.896930] blk_update_request: critical medium error, dev sdc, sector 29967904
[1253680.935044] XFS (sdc1): metadata I/O error: block 0x1c94420 ("xfs_trans_read_buf_map") error 61 numblks 16
[1253680.985248] XFS (sdc1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -61.
[1253681.061231] sd 0:1:0:2: [sdc] tag#9 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[1253681.061246] sd 0:1:0:2: [sdc] tag#9 Sense Key : Medium Error [current] 
[1253681.061247] sd 0:1:0:2: [sdc] tag#9 Add. Sense: Unrecovered read error
[1253681.061250] sd 0:1:0:2: [sdc] tag#9 CDB: Read(16) 88 00 00 00 00 00 01 c9 46 20 00 00 00 10 00 00
[1253681.061252] blk_update_request: critical medium error, dev sdc, sector 29967904
[1253681.098813] XFS (sdc1): metadata I/O error: block 0x1c94420 ("xfs_trans_read_buf_map") error 61 numblks 16
[1253681.149788] XFS (sdc1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -61.
[1253681.153600] sd 0:1:0:2: [sdc] tag#10 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[1253681.153604] sd 0:1:0:2: [sdc] tag#10 Sense Key : Medium Error [current] 
[1253681.153607] sd 0:1:0:2: [sdc] tag#10 Add. Sense: Unrecovered read error
[1253681.153768] sd 0:1:0:2: [sdc] tag#10 CDB: Read(16) 88 00 00 00 00 00 01 c9 46 20 00 00 00 10 00 00
[1253681.153771] blk_update_request: critical medium error, dev sdc, sector 29967904
[1253681.193939] XFS (sdc1): metadata I/O error: block 0x1c94420 ("xfs_trans_read_buf_map") error 61 numblks 16
[1253681.243671] XFS (sdc1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -61.
[1256194.315755] XFS (sdc1): Unmounting Filesystem
[1256257.495968]  sdc: sdc1
[1256257.551790]  sdc: sdc1
[1256257.808014] XFS (sdc1): Mounting V4 Filesystem
[1256258.018389] XFS (sdc1): Ending clean mount
[1264045.229079] Process accounting resumed
[1266462.161822] sd 0:1:0:2: [sdc] tag#25 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[1266462.161826] sd 0:1:0:2: [sdc] tag#25 Sense Key : Medium Error [current] 
[1266462.161828] sd 0:1:0:2: [sdc] tag#25 Add. Sense: Unrecovered read error
[1266462.161831] sd 0:1:0:2: [sdc] tag#25 CDB: Read(16) 88 00 00 00 00 00 01 c9 46 20 00 00 00 10 00 00
[1266462.161832] blk_update_request: critical medium error, dev sdc, sector 29967904
[1266462.199375] XFS (sdc1): metadata I/O error: block 0x1c94420 ("xfs_trans_read_buf_map") error 61 numblks 16
[1266462.249089] XFS (sdc1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -61.
[1270707.824715] XFS (sdc1): Unmounting Filesystem
[1271077.304339]  sdc: sdc1
[1271077.356959]  sdc: sdc1
[1271077.588384] XFS (sdc1): Mounting V4 Filesystem
[1271077.764019] XFS (sdc1): Ending clean mount
[1278338.521169] sd 0:1:0:2: [sdc] tag#16 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[1278338.521172] sd 0:1:0:2: [sdc] tag#16 Sense Key : Medium Error [current] 
[1278338.521174] sd 0:1:0:2: [sdc] tag#16 Add. Sense: Unrecovered read error
[1278338.521177] sd 0:1:0:2: [sdc] tag#16 CDB: Read(16) 88 00 00 00 00 00 01 c9 46 20 00 00 00 10 00 00
[1278338.521178] blk_update_request: critical medium error, dev sdc, sector 29967904
[1278338.560014] XFS (sdc1): metadata I/O error: block 0x1c94420 ("xfs_trans_read_buf_map") error 61 numblks 16
[1278338.611190] XFS (sdc1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -61.
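
As a sanity check that every entry above refers to the same spot on the disk, the Read(16) CDB can be decoded by hand. The sketch below assumes the standard Read(16) layout (opcode 0x88, 8-byte big-endian LBA in bytes 2-9, 4-byte transfer length in bytes 10-13). The helper name `decode_read16` is made up for illustration. Reading the 512-sector gap between the SCSI LBA and the XFS block number as the start offset of sdc1 is an assumption, since the partition table is not shown in this task.

```python
def decode_read16(cdb_hex):
    """Return (lba, num_blocks) from a SCSI Read(16) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a 16-byte Read(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")          # bytes 2-9: logical block address
    num_blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10-13: transfer length
    return lba, num_blocks


# Values copied from the kernel log above.
CDB = "88 00 00 00 00 00 01 c9 46 20 00 00 00 10 00 00"
XFS_BLOCK = 0x1C94420  # block number reported by XFS for sdc1

lba, num_blocks = decode_read16(CDB)
print(lba, num_blocks)   # 29967904 16, matching "sector 29967904" and "numblks 16"
print(lba - XFS_BLOCK)   # 512 (assumed to be the start sector of sdc1, not confirmed here)
```

The decoded LBA equals the sector in the `blk_update_request` lines, so the repeated failures before and after each remount all hit the same 16-block read.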

The controller thinks the disk is fine, though. I've turned on the drive's LED so it can be located:

=> ld 3 show

Smart Array P840 in Slot 3

   array C

      Logical Drive: 3
         Size: 3.6 TB
         Fault Tolerance: 0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: OK
         MultiDomain Status: OK
         Caching:  Enabled
         Unique Identifier: 600508B1001C75465B7444FD0DFD4FE9
         Disk Name: /dev/sdc 
         Mount Points: None
         Drive Type: Data
         LD Acceleration Method: Controller Cache

=> ld 3 modify led=on
=>

@Papaul please order / replace this disk when you get a chance!

Event Timeline

Restricted Application added a subscriber: Aklapper. Jun 12 2019, 2:23 PM
fgiunchedi updated the task description.
fgiunchedi added a subscriber: Papaul.
Restricted Application added a project: Operations. Jun 12 2019, 2:24 PM

Also forcibly disabled the physical disk:

   array C

      physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SATA, 4000.7 GB, OK)
=> pd 1I:1:1 modify disablepd

Warning: The physical drive will be disabled until the next reboot, if the
         drive is replaced, or if the drive is hot plugged. The failed drive
         LED will display amber while the drive is in this state. Continue?
         (y/n)y
ArielGlenn triaged this task as Normal priority. Jun 13 2019, 7:21 AM
fgiunchedi assigned this task to Papaul. Jul 5 2019, 12:56 PM

@Papaul please order / replace this disk when you get a chance!

Papaul added a comment. Jul 5 2019, 3:21 PM

@fgiunchedi this server is out of warranty since October 2018. We have no 4TB disks on site.

> @fgiunchedi this server is out of warranty since October 2018. We have no 4TB disks on site.

OK! I'd like to request an order for a 4TB disk (or several? We already have a bunch of ms-be hosts out of warranty). Let me know how I can help.

Papaul added a comment. Jul 5 2019, 3:26 PM

@fgiunchedi Please open a procurement task in that case.

Thanks.

fgiunchedi mentioned this in Unknown Object (Task). Jul 5 2019, 3:31 PM

Mentioned in SAL (#wikimedia-operations) [2019-07-31T14:57:50Z] <godog> ms-be2018 disablepd 1I:1:1 - T225630

fgiunchedi closed this task as Resolved. Thu, Aug 1, 2:39 PM

Disk replaced and is rebuilding, thanks @Papaul