
Investigate strontium disk issues on 2016-08-05
Closed, Resolved · Public

Description

Today, after fully depooling strontium, I rebooted it and tried to reimage it with jessie. The process failed completely. Messages in debian-installer's syslog were:

[  198.806945] sd 0:0:0:0: [sda] READ CAPACITY(16) failed
[  198.806950] sd 0:0:0:0: [sda]  
[  198.806953] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  198.806955] sd 0:0:0:0: [sda]  
[  198.806957] Sense Key : Not Ready [current] 
[  198.806960] sd 0:0:0:0: [sda]  
[  198.806962] Add. Sense: Logical unit not ready, cause not reportable
[  198.813436] sd 0:0:1:0: [sdb] READ CAPACITY(16) failed
[  198.813441] sd 0:0:1:0: [sdb]  
[  198.813443] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  198.813445] sd 0:0:1:0: [sdb]  
[  198.813447] Sense Key : Not Ready [current] 
[  198.813450] sd 0:0:1:0: [sdb]  
[  198.813452] Add. Sense: Logical unit not ready, cause not reportable
[  198.819903] sd 0:0:0:0: [sda] READ CAPACITY failed
[  198.819908] sd 0:0:0:0: [sda]  
[  198.819910] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  198.819912] sd 0:0:0:0: [sda]  
[  198.819914] Sense Key : Not Ready [current] 
[  198.819917] sd 0:0:0:0: [sda]  
[  198.819919] Add. Sense: Logical unit not ready, cause not reportable
[  198.824394] sd 0:0:1:0: [sdb] READ CAPACITY failed
[  198.824399] sd 0:0:1:0: [sdb]  
[  198.824401] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  198.824403] sd 0:0:1:0: [sdb]  
[  198.824405] Sense Key : Not Ready [current] 
[  198.824408] sd 0:0:1:0: [sdb]

So either the controller or both (?) disks are dead.
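To make the reasoning from the log explicit, here is a minimal Python sketch that tallies the `READ CAPACITY` failures per device and checks whether every failing device hangs off the same SCSI host. The function names (`summarize_read_capacity_failures`, `all_on_one_host`) and the `SAMPLE_LOG` excerpt are hypothetical helpers written for this illustration; they are not part of any tooling used on strontium. The only assumption about the log format is the standard `host:channel:target:lun` SCSI address that the kernel prints after `sd`.

```python
import re
from collections import defaultdict

# Matches kernel lines such as:
#   [  198.806945] sd 0:0:0:0: [sda] READ CAPACITY(16) failed
#   [  198.819903] sd 0:0:0:0: [sda] READ CAPACITY failed
FAILURE_RE = re.compile(
    r"sd (?P<addr>\d+:\d+:\d+:\d+): \[(?P<dev>sd[a-z]+)\] "
    r"READ CAPACITY(?:\(16\))? failed"
)

# Hypothetical excerpt: the four failure lines from the syslog above.
SAMPLE_LOG = """\
[  198.806945] sd 0:0:0:0: [sda] READ CAPACITY(16) failed
[  198.813436] sd 0:0:1:0: [sdb] READ CAPACITY(16) failed
[  198.819903] sd 0:0:0:0: [sda] READ CAPACITY failed
[  198.824394] sd 0:0:1:0: [sdb] READ CAPACITY failed
"""


def summarize_read_capacity_failures(log_text):
    """Return {device: {"host": scsi_host, "failures": count}}."""
    summary = defaultdict(lambda: {"host": None, "failures": 0})
    for line in log_text.splitlines():
        match = FAILURE_RE.search(line)
        if match is None:
            continue
        entry = summary[match.group("dev")]
        # The first field of host:channel:target:lun is the SCSI host.
        entry["host"] = match.group("addr").split(":")[0]
        entry["failures"] += 1
    return dict(summary)


def all_on_one_host(summary):
    """True if every failing device sits behind the same SCSI host."""
    return len({entry["host"] for entry in summary.values()}) == 1


if __name__ == "__main__":
    result = summarize_read_capacity_failures(SAMPLE_LOG)
    for dev, info in sorted(result.items()):
        print(dev, "host", info["host"], "failures", info["failures"])
    print("same host for all failing devices:", all_on_one_host(result))
```

Both disks failing behind the same host is consistent with either a controller fault or two dead disks. The log alone cannot tell the two apart, which is why a hands-on diagnosis is requested.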

I started the controller configuration utility by pressing Ctrl+C during BIOS POST. It looks like it is working fine, but I could not do anything noteworthy with it. Probably both disks are dead?

@Cmjohnson, any chance you can help diagnose this? Disk replacements will probably be required, but I'm not 100% sure.

Event Timeline

Restricted Application added subscribers: Southparkfan, Aklapper. · Aug 5 2016, 10:54 AM

Mentioned in SAL [2016-08-05T13:56:38Z] <akosiaris> strontium has issues, see https://phabricator.wikimedia.org/T142187

Cmjohnson added a subscriber: RobH. · Aug 8 2016, 1:11 PM

Both disks have most likely failed. The server is from the original build and should be decommissioned, with a new misc server allocated to replace it. Linking this task to the procurement task for @RobH.

Cmjohnson triaged this task as Medium priority. · Oct 11 2016, 4:17 PM

@akosiaris What do you want to do about this server?

@Cmjohnson Let's just decommission it.

Cmjohnson closed this task as Resolved. · Nov 17 2016, 4:48 PM

Resolving this task. This server is to be decommissioned.