cloudsw1-c8-eqiad and cloudsw1-d5-eqiad are running JunOS 18.4R2-S4.10.
Opening this task to track upgrading them to JunOS 20+ to bring them into line with the other cloudsw devices (which are on 20.2 and 20.4).
The plan is to upgrade the switches one at a time. The 'cloudsw2' device in each of these racks is daisy-chained from the cloudsw1 device in the same rack, so while a cloudsw1 is being upgraded all hosts in that rack will be offline for the duration of the work. Connectivity to hosts in other racks should remain up throughout.
The upgrade of each device should take in the region of 20-30 minutes in total, during which all hosts in the rack will suffer a complete network outage. We should therefore do it under a maintenance window, and depool, prep or otherwise do whatever is required to minimize the impact. We should also make sure the active cloudnet and cloudgw roles are manually switched to hosts in other racks in advance.
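To make the blast radius of the daisy-chaining concrete, here is a minimal sketch that walks an uplink map and lists everything that goes offline when a given switch reboots. The map itself is hypothetical: the cloudsw2 names are assumed to follow the same naming pattern as the cloudsw1 devices, and `host-a`, `host-b` and `host-c` are placeholders, not real hosts.

```python
from collections import deque

# Hypothetical uplink map: device -> the device it connects through.
# cloudsw2 names are assumed from the cloudsw1 naming; hosts are placeholders.
UPLINK = {
    "cloudsw2-c8-eqiad": "cloudsw1-c8-eqiad",
    "cloudsw2-d5-eqiad": "cloudsw1-d5-eqiad",
    "host-a": "cloudsw1-c8-eqiad",
    "host-b": "cloudsw2-c8-eqiad",
    "host-c": "cloudsw1-d5-eqiad",
}


def affected_by(switch, uplink=UPLINK):
    """Return every device that loses connectivity when `switch` is down."""
    # Invert the map so we can walk downstream from the switch.
    children = {}
    for device, parent in uplink.items():
        children.setdefault(parent, []).append(device)

    # Breadth-first walk over everything hanging off the switch.
    seen, queue = set(), deque([switch])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


print(affected_by("cloudsw1-c8-eqiad"))
```

With this placeholder map, rebooting cloudsw1-c8-eqiad takes down cloudsw2-c8-eqiad and both hosts behind the two C8 switches, while everything attached in D5 stays up.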
The hosts that will be affected are as follows:
Rack C8 (also including hosts in row B which connect via this switch):
Rack D5 (done):
T371878: [network,D5] reboot cloudsw-d5
cloudvirts
We need to move the VMs running on the cloudvirts to other hypervisors, but we can't move all of them. We should therefore move only the sensitive ones; the rest should come back once the network is restored.
List of VMs to move to a different rack:
TBD