
Check bast2001 for hardware problems
Closed, Resolved · Public


Daniel reinstalled bast2001 and it took a very long time:

(from SAL:)
22:53 mutante: bast2001 - powercycle, reinstall
23:20 mutante: bast2001 - install issues - extending downtime, bbiaw
06:26 mutante: bast2001 - still installing in snail mode - please feel free to check if it's done, and if so re-add to puppet so users get created.thx

Something seems broken with the hardware. Papaul, please run hardware diagnostics before we bring that box back up. Also, the serial console is dead: "console com2" returns immediately.

Event Timeline

Restricted Application added a project: Operations. Mar 9 2016, 8:31 AM
Restricted Application added a subscriber: Aklapper.

SSHing into the box lands me in busybox; dmesg shows a lot of:

[14686.650278] sd 0:0:0:0: [sda] Unhandled sense code
[14686.650287] sd 0:0:0:0: [sda]
[14686.650290] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[14686.650293] sd 0:0:0:0: [sda]
[14686.650295] Sense Key : Medium Error [current]
[14686.650299] Info fld=0x6ae878
[14686.650302] sd 0:0:0:0: [sda]
[14686.650305] Add. Sense: Unrecovered read error
[14686.650308] sd 0:0:0:0: [sda] CDB:
[14686.650309] Read(10): 28 00 00 6a e8 78 00 00 08 00
[14686.650320] end_request: critical medium error, dev sda, sector 7006328
[14688.501578] sd 0:0:0:0: [sda] Unhandled sense code
[14688.501591] sd 0:0:0:0: [sda]
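
For anyone reading the log above: the three numbers in it (the `Info fld`, the `Read(10)` CDB and the `end_request` sector) can be cross-checked against each other. Below is a minimal sketch, not a diagnostic tool. It assumes the standard READ(10) layout (opcode 0x28, big-endian LBA in bytes 2-5, transfer length in bytes 7-8) and that the disk uses 512-byte logical blocks, so the kernel's sector number and the LBA coincide. The helper name `decode_read10` is made up for illustration.

```python
import re


def decode_read10(cdb_hex: str) -> dict:
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    return {
        # Bytes 2-5: logical block address, big-endian.
        "lba": int.from_bytes(cdb[2:6], "big"),
        # Bytes 7-8: number of blocks requested, big-endian.
        "blocks": int.from_bytes(cdb[7:9], "big"),
    }


# Lines copied from the dmesg output in this task.
CDB_LINE = "[14686.650309] Read(10): 28 00 00 6a e8 78 00 00 08 00"
INFO_LINE = "[14686.650299] Info fld=0x6ae878"
ERR_LINE = "[14686.650320] end_request: critical medium error, dev sda, sector 7006328"

info = decode_read10(CDB_LINE.split("Read(10):", 1)[1].strip())
info_fld = int(re.search(r"fld=0x([0-9a-f]+)", INFO_LINE).group(1), 16)
reported = int(re.search(r"sector (\d+)", ERR_LINE).group(1))

print(info)                                  # lba 7006328, 8 blocks
print(info["lba"] == info_fld == reported)   # all three agree
```

All three values point at the same block on sda, which is consistent with a failing medium rather than a cabling or controller problem, though only the hardware diagnostics can confirm that.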

Papaul added a comment. Mar 9 2016, 9:01 PM

I did a full hardware scan; HD0 is bad. The system is out of warranty (HW warranty expiration: 2015-08-29), so I will have to open a task to order a new disk for this system.

Papaul triaged this task as Normal priority. Mar 9 2016, 9:01 PM
RobH edited subtasks, added: Unknown Object (Task); removed: T129405: codfw: 500GB SATA disk for bast2001. Mar 9 2016, 10:47 PM
Gehel added a subscriber: Gehel. Mar 15 2016, 1:24 PM

It's been 3 weeks now. Can we get an update on why this is taking so long? Rumour is that we're waiting for some disk shipment or something, but I don't see any updates or blocked tasks here. @Papaul/@RobH, could you please update this task to reflect progress? Then we can discuss how to speed this up (and others like it, e.g. with a spare pool disk purchase?).

Dzahn added a comment. (Edited) Mar 31 2016, 1:34 PM

@faidon the blocked ticket and updates are T129410

There it says "Scheduled Delivery Updated To: Thursday, 03/31/2016, By End of Day".

Papaul reassigned this task from Papaul to Dzahn. Mar 31 2016, 7:55 PM
Papaul added a subscriber: Papaul.

@Dzahn disk replacement complete.

Papaul closed subtask Unknown Object (Task) as Resolved. Mar 31 2016, 7:57 PM
Dzahn closed this task as Resolved. (Edited) Mar 31 2016, 9:23 PM

Thank you, @Papaul! It works again.

13:39 < mutante> mdadm: /dev/md/0 has been started with 2 drives.
13:39 < mutante> mdadm: /dev/md/1 has been started with 2 drives.