This task tracks the replacement of ps1 and ps2 with new PDUs in rack [[ https://netbox.wikimedia.org/dcim/racks/3/ | A3-eqiad ]].
Downtime window: 2019-07-23 @ 14:05 GMT, expected to last 1.5 hours at most (the first PDU swap took less than an hour).
Each server & switch will need to have potential downtime scheduled, since this will be a live power change of the PDU towers.
These racks have a single tower for the old PDU (with an A and a B side), while the new PDUs have independent A and B towers.
[x] - schedule downtime for the entire list of switches and servers.
[] - carefully unmount the existing PDU, KEEPING SYSTEMS PLUGGED IN AND POWERED ON UNLESS STATED OTHERWISE
[] - set PDU aside in rack, still energized, and remove old mounting brackets.
[] - install new mounting brackets, mount BOTH new PDU towers.
[] - Wire up the inner of the two towers, energize it, and relocate power to it from the existing/old PDU tower (now de-energized). Using the tower closest to the servers first makes re-wiring power easier.
[] - confirm that every switch, router, and server on the list has had its power restored from the new PDU tower
[] - Once new PDU tower is confirmed online, move on to next steps.
[] - Wire up the remaining tower, energize it, and relocate power to it from the existing/old PDU tower (now de-energized).
[] - confirm that every switch, router, and server on the list has had its power restored from the new PDU tower
== List of routers, switches, and servers ==
| device | role | SRE team coordination | notes
| asw2-a3-eqiad | asw | @ayounsi
| analytics1060 | analytics | #analytics
| analytics1059 | analytics | #analytics
| analytics1057 | analytics | #analytics
| analytics1056 | analytics | #analytics
| analytics1055 | analytics | #analytics
| analytics1054 | analytics | #analytics
| analytics1052 | analytics | #analytics
| elastic1031 | elastic | #discovery-search
| elastic1030 | elastic | #discovery-search
| logstash1010 | | #observability | ok with power loss, nice to have: disable es replication
| cloudservices1004 | | #cloud-services-team
| restbase1016 | | @fgiunchedi | ok with power loss
| kubernetes1001 | kubernetes | #serviceops
| rdb1005
| restbase1019 | | @fgiunchedi | ok with power loss
| restbase1011 | | @fgiunchedi | ok with power loss
| restbase1010 | | @fgiunchedi | ok with power loss
| graphite1003 | | #observability | ok with power loss, if prolonged down we'll failover to codfw
| relforge1001
| db1103 | db | #dba
| dbproxy1003 | dbproxy | #dba
| elastic1035 | elastic | #discovery-search
| elastic1034 | elastic | #discovery-search
| elastic1033 | elastic | #discovery-search
| elastic1032 | elastic | #discovery-search
| cp1008 | cp | #traffic
| dbstore1003 | dbstore | #analytics
| prometheus1003 | | #observability | ok with power loss
| ganeti1007 | ganeti host | @akosiaris | host will need to be emptied in advance
| dbproxy1001 | dbproxy | #dba
| dbproxy1002 | dbproxy | #dba
| db1127 | db | #dba
| radium
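
To help with the "schedule downtime" and coordination steps, the table above could be grouped by coordinating team with a short script. This is only a sketch: the function names `parse_rows` and `group_by_team` are hypothetical, the sample is a handful of rows copied from the table, and it assumes that tags differing only in case (e.g. `#Analytics` and `#analytics`) refer to the same team. Devices with no team listed are collected under `(unassigned)` so they can be followed up on.

```python
from collections import defaultdict

# A few rows copied from the table above (header row omitted).
SAMPLE = """
| asw2-a3-eqiad | asw | @ayounsi
| analytics1060 | analytics | #analytics
| dbstore1003 | dbstore | #Analytics
| graphite1003 || #observability | ok with power loss, if prolonged down we'll failover to codfw
| rdb1005
| radium
"""


def parse_rows(text):
    """Parse '| device | role | team | notes' rows; missing cells become ''."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip blank lines and anything that is not a table row
        cells = [c.strip() for c in line.strip("|").split("|")]
        cells += [""] * (4 - len(cells))  # pad short rows such as '| rdb1005'
        device, role, team, notes = cells[:4]
        if device == "device":
            continue  # skip the header row if it is included
        rows.append({"device": device, "role": role, "team": team, "notes": notes})
    return rows


def group_by_team(rows):
    """Map each team or contact (lowercased) to the devices it coordinates."""
    groups = defaultdict(list)
    for row in rows:
        # Assumption: tags that differ only in case name the same team.
        key = row["team"].lower() or "(unassigned)"
        groups[key].append(row["device"])
    return dict(groups)


if __name__ == "__main__":
    for team, devices in group_by_team(parse_rows(SAMPLE)).items():
        print(f"{team}: {', '.join(devices)}")
```

Running it over the full table would show which teams still need to be contacted and which devices (such as rdb1005, relforge1001, and radium) have no coordination contact listed yet.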