This task tracks the replacement of ps1 and ps2 with new PDUs in rack A3-eqiad.
Downtime window: 2019-07-23 @ 14:05 GMT, expected to last 1.5 hours at most (the first PDU swap took less than an hour).
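When scheduling monitoring downtime for this window, it can help to have the end time and the duration in seconds at hand. The following is only a sketch of that arithmetic: the names `WINDOW_START`, `MAX_DURATION`, and `window_end` are illustrative, and GMT is treated as UTC, which is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Start of the window as stated in this task (GMT treated as UTC: assumption).
WINDOW_START = datetime(2019, 7, 23, 14, 5, tzinfo=timezone.utc)
# Maximum expected duration stated in this task: 1.5 hours.
MAX_DURATION = timedelta(hours=1, minutes=30)


def window_end(start: datetime, duration: timedelta) -> datetime:
    """Return the latest expected end of the downtime window."""
    return start + duration


end = window_end(WINDOW_START, MAX_DURATION)
print(end.isoformat())                    # 2019-07-23T15:35:00+00:00
print(int(MAX_DURATION.total_seconds()))  # 5400
```

The duration in seconds is the form many downtime-scheduling tools expect, although which tool is used here is not stated in this task.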
Each server & switch will need to have potential downtime scheduled, since this will be a live power change of the PDU towers.
These racks have a single tower for the old PDU (with an A and a B side), while the new PDUs have independent A and B towers.
- - schedule downtime for the entire list of switches and servers.
- - carefully unmount the existing PDU, KEEPING SYSTEMS PLUGGED IN AND POWERED ON UNLESS STATED OTHERWISE
- - set PDU aside in rack, still energized, and remove old mounting brackets.
- - install new mounting brackets, mount BOTH new PDU towers.
- - Wire up the inner of the two new towers, energize it, and relocate power to it from the existing/old PDU tower (now de-energized). Starting with the tower closest to the servers makes re-wiring power easier.
- - Confirm that every switch, router, and server on the list has had its power restored from the new PDU tower.
- - Once the new PDU tower is confirmed online, move on to the next steps.
- - Wire up the remaining tower, energize it, and relocate power to it from the existing/old PDU tower (now de-energized).
- - Confirm that every switch, router, and server on the list has had its power restored from the new PDU tower.
- - issue with elastic1031, @Cmjohnson making followup task
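The confirmation steps above could be backed by a quick reachability sweep over the device list. The following is a minimal sketch, not the procedure actually used for this task: `tcp_probe` and `unreachable` are hypothetical helper names, and probing SSH on port 22 is an assumption. It only shows that a device still answers on the network. It does not show which PDU tower is feeding it, which has to be read from the PDU itself or from the host's power supply status.

```python
import socket
from typing import Callable, Iterable, List


def tcp_probe(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (port 22 is an assumption)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def unreachable(hosts: Iterable[str],
                probe: Callable[[str], bool] = tcp_probe) -> List[str]:
    """Return the hosts for which the probe failed, in input order."""
    return [host for host in hosts if not probe(host)]


# Example call (hostnames must resolve from wherever this runs):
#   down = unreachable(["elastic1030", "elastic1031", "db1103"])
#   print("unreachable:", down or "none")
```

The probe is passed in as a parameter so that the sweep can be exercised with a stand-in function, or swapped for a different check, without touching the network code.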
List of routers, switches, and servers
device | role | SRE team coordination | notes |
asw2-a3-eqiad | asw | @ayounsi | |
analytics1060 | analytics | Analytics | |
analytics1059 | analytics | Analytics | |
analytics1057 | analytics | Analytics | |
analytics1056 | analytics | Analytics | |
analytics1055 | analytics | Analytics | |
analytics1054 | analytics | Analytics | |
analytics1052 | analytics | Analytics | |
elastic1031 | elastic | Discovery-Search | |
elastic1030 | elastic | Discovery-Search | |
logstash1010 | | observability | ok with power loss, nice to have: disable es replication |
cloudservices1004 | | cloud-services-team | |
restbase1016 | | @fgiunchedi | ok with power loss |
kubernetes1001 | kubernetes | serviceops | |
rdb1005 | misc redis | serviceops | ok with power loss |
restbase1019 | | @fgiunchedi | ok with power loss |
restbase1011 | | @fgiunchedi | ok with power loss |
restbase1010 | | @fgiunchedi | ok with power loss |
graphite1003 | | | awaiting decom |
relforge1001 | | | |
db1103 | db | DBA | |
dbproxy1003 | dbproxy | DBA | |
elastic1035 | elastic | Discovery-Search | |
elastic1034 | elastic | Discovery-Search | |
elastic1033 | elastic | Discovery-Search | |
elastic1032 | elastic | Discovery-Search | |
cp1008 | cp | Traffic | |
dbstore1003 | dbstore | Analytics | |
prometheus1003 | | observability | ok with power loss |
ganeti1007 | ganeti host | @akosiaris | host will need to be emptied in advance |
dbproxy1001 | dbproxy | DBA | |
dbproxy1002 | dbproxy | DBA | |
db1127 | db | DBA | |
radium | | | |
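To see at a glance who needs to be contacted before the window, the table above can be grouped by its coordination column. The following is a rough sketch under stated assumptions: it expects rows in the four-column `device | role | SRE team coordination | notes |` layout, the names `parse_row` and `devices_by_team` are hypothetical, and `ROWS` holds only a few rows copied from the table as a sample.

```python
from collections import defaultdict
from typing import Dict, List

# A few rows copied from the table above, used as sample input.
ROWS = """\
asw2-a3-eqiad | asw | @ayounsi | |
analytics1060 | analytics | Analytics | |
elastic1031 | elastic | Discovery-Search | |
db1103 | db | DBA | |
ganeti1007 | ganeti host | @akosiaris | host will need to be emptied in advance |
"""

COLUMNS = ("device", "role", "team", "notes")


def parse_row(line: str) -> Dict[str, str]:
    """Split one pipe-delimited row into the four expected columns."""
    cells = [cell.strip() for cell in line.split("|")]
    # Pad short rows and drop the empty cell left by the trailing pipe.
    cells = (cells + [""] * len(COLUMNS))[:len(COLUMNS)]
    return dict(zip(COLUMNS, cells))


def devices_by_team(text: str) -> Dict[str, List[str]]:
    """Map each coordination contact to the devices listed for it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        row = parse_row(line)
        groups[row["team"] or "(unassigned)"].append(row["device"])
    return dict(groups)


for team, devices in devices_by_team(ROWS).items():
    print(f"{team}: {', '.join(devices)}")
```

Rows with an empty coordination cell land under `(unassigned)`, which makes devices that still need an owner easy to spot.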