Due to delayed delivery, and subsequently a licensing issue, lsw1-e1-eqiad and lsw1-f1-eqiad are currently acting as the aggregation / Spine switch devices for Eqiad rows E and F. In simple terms that means they connect upstream to the CR routers in the other cage, and downstream to the remaining top-of-rack switches in racks E2, E3, F2 and F3.
This setup got us around the immediate capacity issue, but the Spine devices are now prepped, and we need to migrate to them to standardize our topology and ensure the aggregation layer can scale to support the remaining racks in the new cage.
#### Migration Plan
The current device cabling is as follows:
{F35754899}
##### Step 1 - Bring Spines into fabric
Step 1 is to use the currently free QSFP28 ports on lsw1-e1-eqiad and lsw1-f1-eqiad to connect each of them to the new Spine layer, and to enable OSPF and BGP EVPN on those links to bring the Spines into the fabric.
{F35754939}
NOTE: This approach requires the purchase of 8 x [[https://www.fs.com/products/71644.html?attribute=159&id=566312 | 100G-Base CWDM4]] optic modules and 4 x LC-LC single-mode fiber optic patch cables. When the migration is complete we will free up the same number of optics and cables, which can be used to connect some of the next 8 racks when we bring them live.
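On the configuration side, bringing the Spines into the fabric should only need a point-to-point underlay interface plus OSPF and an EVPN BGP session per new link. A minimal Junos-style sketch for lsw1-e1-eqiad follows; the port number, /31 addressing, BGP group name and Spine loopback address are placeholders, the real values will come from our normal config workflow.
```
# Sketch only - port number, addresses and group/neighbor values are placeholders
# New 100G link from lsw1-e1-eqiad to ssw1-e1-eqiad
set interfaces et-0/0/53 description "Core: ssw1-e1-eqiad"
set interfaces et-0/0/53 unit 0 family inet address 10.64.129.0/31

# Underlay: p2p OSPF so the Spine loopback becomes reachable
set protocols ospf area 0.0.0.0 interface et-0/0/53.0 interface-type p2p

# Overlay: EVPN session towards the Spine loopback
set protocols bgp group EVPN-INFRA family evpn signaling
set protocols bgp group EVPN-INFRA neighbor 10.64.130.1 description ssw1-e1-eqiad
```
`show ospf neighbor` and `show bgp summary` on both ends should confirm the adjacency and EVPN session are up before we rely on the new links.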
##### Step 2 - Migrate CR uplinks to Spines
Step 2 is to move the uplinks to the CR routers from where they land now, on the Leaf devices in racks E1/F1, to the Spine devices in those racks. This should be possible without interruption, provided we move the links one at a time and test everything at each step.
{F35755000}
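For each CR uplink the safest pattern is probably: drain it on the Leaf, re-patch and configure it on the Spine, then verify before touching the next one. Roughly as below; the interface name is a placeholder.
```
# On lsw1-e1-eqiad: administratively drain the CR-facing link before re-patching
set interfaces et-0/0/55 disable
commit

# After moving the fibre and configuring the matching interface on ssw1-e1-eqiad,
# verify the link and sessions before moving the next uplink, e.g.:
#   show interfaces et-0/0/xx extensive             (errors/drops)
#   show interfaces diagnostics optics et-0/0/xx    (light levels)
#   show ospf neighbor
#   show bgp summary
```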
##### Step 3 - Move remaining rack uplinks to Spines
This is really a multi-step operation: the uplinks from the Leaf devices in racks E2, E3, F2 and F3 need to be moved from where they currently land (the Leaf switches in racks E1/F1) to the Spine switches in E1/F1.
Given the network topology/design these can be handled one uplink at a time: disable an uplink, move it to the Spine, then move the rack's second uplink once traffic is confirmed flowing via the Spine. OSPF costs can be adjusted temporarily to make this safe and to ensure we validate each link before it carries real traffic.
{F35755084}
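As a rough example of the temporary OSPF cost adjustment mentioned above (metric value and interface are placeholders):
```
# De-prefer the freshly moved uplink while it is being validated
set protocols ospf area 0.0.0.0 interface et-0/0/53.0 metric 10000
commit

# ...check interface errors, light levels and ping across the link...

# Once validated, remove the override so traffic shifts onto the Spine path
delete protocols ospf area 0.0.0.0 interface et-0/0/53.0 metric
commit
```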
As a final task in this step we should move the uplinks from lsw1-e1-eqiad and lsw1-f1-eqiad towards ssw1-e1-eqiad from port 51 to port 54, to keep numbering consistent. We can also remove the now-redundant direct link between these two switches (port et-0/0/52 on either side).
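A sketch of that tidy-up on the Leaf side (the exact stanzas will depend on what is configured on those ports at the time):
```
# Move the ssw1-e1-eqiad uplink config from port 51 to port 54
copy interfaces et-0/0/51 to et-0/0/54
delete interfaces et-0/0/51
set protocols ospf area 0.0.0.0 interface et-0/0/54.0 interface-type p2p
delete protocols ospf area 0.0.0.0 interface et-0/0/51.0

# Remove the now-unused direct lsw1-e1 <-> lsw1-f1 link
delete interfaces et-0/0/52
delete protocols ospf area 0.0.0.0 interface et-0/0/52.0
commit
```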
##### Step 4 - LVS Migration
Lastly we need to move the links from the 4 LVS load-balancers in rows A-D (lvs1017, lvs1018, lvs1019, lvs1020) from where they land on the Leaf devices in E1/F1 to the Spine layer.
NOTE: In theory this could happen at step 3 instead, which would give slightly more optimal routing during the transition, but it is probably easier to treat it as the final step of the overall migration.
There are two main things to consider for this step:
**10G Termination**
One element we need to consider is how to terminate the 10G-Base-LR connections from these servers on the QSFP28 ports in the Spine devices.
The best way to proceed, it seems to me, is to use [[https://www.fs.com/de-en/products/36174.html?attribute=140&id=278755 | 4X10GE-LR QSFP+ modules]] on the Spines, which run as 4 individual 10G links, and use [[https://www.fs.com/de-en/products/68018.html | breakout cables]] to connect to the existing WMF-managed patch panels in the same racks.
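If we go that route, the Spine side of this is just channelising the chosen QSFP28 ports into 4 x 10G. A hedged example follows; the FPC/PIC/port numbers are placeholders and vary by platform, and some platforms auto-channelise when a breakout optic or cable is detected.
```
# Channelise one QSFP28 port on ssw1-e1-eqiad into four 10G interfaces
set chassis fpc 0 pic 0 port 30 channel-speed 10g
commit

# The port then appears as xe-0/0/30:0 through xe-0/0/30:3,
# which can be patched to the LVS hosts via the breakout cable / patch panel.
```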
**Optic Redundancy**
Provided we do that, the next question is whether we should land two LVS connections on a single module on each switch. For instance, the links from lvs1017 and lvs1018, which currently land on separate 10G ports of lsw1-e1, could both terminate in the same QSFP+ optic when they are moved to ssw1-e1. That obviously saves money and Spine ports, but it is potentially a bad idea if that single QSFP+ module fails; we might be better off using two QSFP+ modules and two ports for redundancy. This point needs to be discussed with the Traffic team, I expect.