This task tracks the replacement of ps1 and ps2 with new PDUs in rack [[ https://netbox.wikimedia.org/dcim/racks/16/ | B8-eqiad ]].
Each server & switch will need potential downtime scheduled, since this is a live power change of the PDU towers.
This rack has a single tower for the old PDU (with an A and a B side), while the new PDUs have independent A and B towers.
[] - Schedule downtime for the entire list of switches and servers.
[] - Wire up one of the two new towers, energize it, and relocate power to it from the existing/old PDU tower (now de-energized).
[] - Confirm the entire list of switches, routers, and servers has had power restored from the new PDU tower.
[] - Once the new PDU tower is confirmed online, move on to the next steps.
[] - Wire up the remaining tower, energize it, and relocate power to it from the existing/old PDU tower (now de-energized).
[] - Confirm the entire list of switches, routers, and servers has had power restored from the new PDU tower.
[] - Connect via serial / confirm the serial connection works.
[] - Set up the PDU following the directions on https://wikitech.wikimedia.org/wiki/Platform-specific_documentation/ServerTech#Initial_Setup
[] - Update the PDU model in Puppet per T233129.
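The two "confirm power restored" steps above can be spot-checked with a short script. This is only a hedged sketch: the host list is abridged, and using SSH (port 22) reachability as a proxy for "powered on and booted" is an assumption, not part of the procedure.

```python
import socket

# Abridged host list for illustration; the full list is in the table below.
HOSTS = ["ganeti1018", "gerrit1001", "cloudvirt1030", "db1132"]

def ssh_reachable(host: str, timeout: float = 3.0) -> bool:
    """Treat an open SSH port (22) as a proxy for the host being back up."""
    try:
        with socket.create_connection((host, 22), timeout=timeout):
            return True
    except OSError:
        return False

def partition(hosts, is_up=ssh_reachable):
    """Split hosts into (up, down); the predicate is injectable for testing."""
    up = [h for h in hosts if is_up(h)]
    down = [h for h in hosts if h not in up]
    return up, down
```

Run it after each tower swap; anything in `down` needs hands-on attention before moving on to the next step.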
== List of routers, switches, and servers ==
| device | role | SRE team coordination | recommended action during maintenance |
| --- | --- | --- | --- |
| asw-b8-eqiad | asw | @ayounsi | ensure this doesn't go offline, as it will take the entire rack's network offline |
| ganeti1018 | ganeti host | #serviceops | needs to be emptied of VMs beforehand |
| gerrit1001 | spare | | fine to do at any time |
| cloudvirt1030 | hypervisor | #cloud-services-team | lots of VMs, please handle with care |
| db1132 | m2 master | #dba | this host is the m2 master, which holds some internal services; ensure it doesn't go offline, though if it does there is an automatic failover via proxies |
| pc1008 | parsercache host | #dba | #dba to depool it |
| restbase1024 | restbase | #serviceops, #services | fine to do at any time |
| an-master1002 | | #analytics | fine to do at any time |
| dbproxy1015 | db proxy | #dba | not in use |
| graphite1004 | | @fgiunchedi | no action needed; if power is lost and can't be restored quickly we'll switch to codfw |
| rdb1009 | redis master | #serviceops | this will need coordination? |
| notebook1003 | | | |
| db1119 | db host | #dba | #dba to depool it |
| db1113 | db host | #dba | #dba to depool it |
| cloudservices1003 | DNS | #cloud-services-team | fine to do at any time |
| mwmaint1002 | | | this is the primary MW maintenance system in eqiad; perhaps we should halt deployments during this time? |
| labpuppetmaster1001 | spare | #cloud-services-team | good to go; host is being decommissioned |
| ores1004 | ORES | #serviceops | fine to do at any time |
| wtp1036 | parsoid | #serviceops | fine to do at any time |
| wtp1035 | parsoid | #serviceops | fine to do at any time |
| wtp1034 | parsoid | #serviceops | fine to do at any time |
| dumpsdata1001 | dumps data server | @arielglenn | please coordinate |
| analytics1063 | | #analytics | fine to do at any time |
| analytics1062 | | #analytics | fine to do at any time |
| analytics1061 | | #analytics | fine to do at any time |
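For scheduling the coordination pings, the table above can be folded into a per-team checklist. A minimal sketch, with the device/team pairs copied (abridged) from the table:

```python
from collections import defaultdict

# (device, coordinating team/person) pairs from the table above; abridged.
DEVICES = [
    ("asw-b8-eqiad", "@ayounsi"),
    ("ganeti1018", "#serviceops"),
    ("cloudvirt1030", "#cloud-services-team"),
    ("db1132", "#dba"),
    ("pc1008", "#dba"),
    ("db1119", "#dba"),
]

def by_team(devices):
    """Map each coordinating team/person to the devices they need to sign off on."""
    teams = defaultdict(list)
    for device, team in devices:
        teams[team].append(device)
    return dict(teams)
```

Printing the result gives one line per team to paste into the coordination pings.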