Yesterday, during T223126 (Install new PDUs into b5-eqiad), ms-be1033 powered off at the beginning of the window (down since 2019-05-16 12:54:56 according to Icinga). After the work was completed, @Cmjohnson could not power it back on. Filing this task to track further diagnosis and next steps.
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
eqiad-prod: start depool ms-be1033 | operations/software/swift-ring | master | +1 K -1 K
Event Timeline
Since the host will certainly not be back for at least another week, I'm going to de-weight it in Swift.
Change 511670 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/software/swift-ring@master] eqiad-prod: start depool ms-be1033
Mentioned in SAL (#wikimedia-operations) [2019-05-21T13:25:44Z] <godog> swift eqiad-prod: start depool ms-be1033 - T223518
Change 511670 merged by Filippo Giunchedi:
[operations/software/swift-ring@master] eqiad-prod: start depool ms-be1033
Mentioned in SAL (#wikimedia-operations) [2019-05-23T13:00:26Z] <godog> swift eqiad-prod: ms-be1033 weight to 1500 - T223518
Mentioned in SAL (#wikimedia-operations) [2019-05-27T13:02:58Z] <godog> swift eqiad-prod: ms-be1033 weight to 0 - T223518
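The log entries above show the host being de-weighted in stages (an initial depool step, then weight 1500, then weight 0) rather than in a single change, presumably to limit how much data each rebalance has to move. A minimal sketch of such a staged schedule follows. The starting weight and step size are hypothetical (neither is stated in this task), and `weight_schedule` is an illustrative helper, not part of the swift-ring tooling:

```python
def weight_schedule(start, step):
    """Return the successive ring weights for a staged drain, ending at 0.

    `start` and `step` are illustrative values; the real starting weight
    and step size used for ms-be1033 are not given in this task.
    """
    if start < 0 or step <= 0:
        raise ValueError("start must be >= 0 and step must be > 0")
    weights = []
    current = start
    while current > 0:
        # Lower the weight by one step, never going below zero.
        current = max(0, current - step)
        weights.append(current)
    return weights


# Hypothetical example: a device starting at weight 3000, drained 1500 at a time.
print(weight_schedule(3000, 1500))
```

Each value in the returned list would correspond to one ring change followed by a rebalance, with the next step applied once the previous rebalance has settled.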
Steps I have taken:
- I stripped the server down to its bare minimum operating configuration (1 CPU and 1 DIMM) and it still will not boot. I created a support ticket with HP.
5338974069
Mentioned in SAL (#wikimedia-operations) [2019-06-08T11:58:20Z] <godog> stop swift processes on ms-be1033 - T223518
Mentioned in SAL (#wikimedia-operations) [2019-06-11T10:54:13Z] <godog> wipe fs on ms-be1033 data partitions - T223518
Mentioned in SAL (#wikimedia-operations) [2019-06-11T12:54:13Z] <godog> swift eqiad-prod: put back ms-be1033 - T223518
Mentioned in SAL (#wikimedia-operations) [2019-06-12T11:55:43Z] <godog> swift eqiad-prod: put back ms-be1033 - T223518
Mentioned in SAL (#wikimedia-operations) [2019-06-25T12:48:40Z] <godog> swift eqiad-prod: put back ms-be1033 - T223518
Mentioned in SAL (#wikimedia-operations) [2019-07-01T09:54:28Z] <godog> swift eqiad-prod eqiad-prod: put back ms-be1033 - T223518
The last rebalance is underway now to put ms-be1033 fully back in service. Resolving.