Move lvs2014 link to row A and connect to new row A/B vlans
Closed, Resolved (Public)

Description

In order to support connecting servers to the new, per-rack vlans in rows A and B, we need to move lvs2014's connection to row A so it terminates on the Spine switch in rack A8. Once moved it will be able to reach all row A and B vlans over this link, and we can decom its link to row B.

This change will require co-ordination between DC-Ops, Netops and Traffic teams.

Detailed steps are as follows:

1. Pre-configure the new switch port for the connection.

Netops will take care of this. Port xe-0/0/32 (SFP port on back of switch) needs to be enabled in Netbox and have all the row A and B vlans configured as tagged on it (no native/untagged vlan).

2. Downtime lvs2014

3. Connect lvs2014 interface eno12409np1 to ssw1-a8-codfw xe-0/0/32

This is the second port of the first 10G NIC in lvs2014, currently connected via single-mode fiber to asw-a4-codfw xe-4/0/47.

It probably makes most sense to re-use the existing fiber and optic, but I will leave it to DC-Ops to make that call. The new termination is in rack A8 so a little further along.

4. Remove cable and optics from lvs2014 interface enp152s0f0np0 to asw-b4-codfw xe-4/0/47

This is the first port on the second 10G NIC on the host. We can remove this link as we will trunk the row B vlans over eno12409np1 instead.

5. Merge patch to move / add vlan sub-interfaces to lvs2014 eno12409np1

The below patch contains the necessary changes and additions to connect the LB to the required vlans over eno12409np1.

https://gerrit.wikimedia.org/r/c/operations/puppet/+/980409

6. Push new network configuration to lvs2014

I believe that once the above patch is merged we can run puppet on the host and then reboot it for the change to take effect. Once the system is back up, we should check connectivity on all the vlans by pinging the first IP in each subnet.
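The connectivity check could be scripted roughly as follows. This is only a sketch: the subnets listed are placeholder documentation ranges (not the real row A/B vlan subnets), the helper names `first_ip` and `ping` are made up for illustration, and it assumes the host's `ping` accepts `-c` and `-W` and handles IPv6 addresses directly (as recent iputils versions do).

```python
import ipaddress
import subprocess

# Placeholder subnets (documentation ranges). Replace with the real
# subnets of the vlans trunked to the host.
SUBNETS = ["192.0.2.0/24", "2001:db8::/64"]


def first_ip(cidr: str) -> str:
    """Return the first usable host address in the given subnet."""
    net = ipaddress.ip_network(cidr)
    return str(next(net.hosts()))


def ping(addr: str, count: int = 3, timeout: int = 2) -> bool:
    """Ping addr with the system ping; True if it exits with status 0."""
    cmd = ["ping", "-c", str(count), "-W", str(timeout), addr]
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


if __name__ == "__main__":
    for subnet in SUBNETS:
        target = first_ip(subnet)
        status = "ok" if ping(target) else "FAILED"
        print(f"{subnet}: {target} {status}")
```

Run on the host after the reboot, any line reporting FAILED would point at a vlan that is not reachable over the new link.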

7. Default old asw-[a|b]-codfw ports that had been connected to lvs2014

Remove these ports in Netbox and run Homer to apply the changes to the legacy switches (netops).

8. Remove downtime for host

Event Timeline

cmooney triaged this task as Medium priority.

Change 980409 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/puppet@production] Add new codfw per-rack vlans to lvs2014 and move row B vlans

https://gerrit.wikimedia.org/r/980409

Mentioned in SAL (#wikimedia-operations) [2024-01-10T14:54:29Z] <topranks> adding vlans to ssw1-a8-codfw to trunk to lvs2014 T352758

Mentioned in SAL (#wikimedia-operations) [2024-01-10T15:01:52Z] <sukhe> disable puppet and stop pybal on lvs2014: T352758

Mentioned in SAL (#wikimedia-operations) [2024-01-10T15:04:07Z] <sukhe@cumin2002> START - Cookbook sre.hosts.downtime for 3:00:00 on lvs2014.codfw.wmnet with reason: T352758

Mentioned in SAL (#wikimedia-operations) [2024-01-10T15:04:23Z] <sukhe@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on lvs2014.codfw.wmnet with reason: T352758

Change 989534 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/dns@master] depool codfw: do not merge! emergency depool patch

https://gerrit.wikimedia.org/r/989534

Change 980409 merged by Cathal Mooney:

[operations/puppet@production] Add new codfw per-rack vlans to lvs2014 and move row B vlans

https://gerrit.wikimedia.org/r/980409

Change 989566 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/puppet@production] Add new codfw row a/b per-rack vlans to hieradata for lvs

https://gerrit.wikimedia.org/r/989566

Change 989566 merged by Cathal Mooney:

[operations/puppet@production] Add new codfw row a/b per-rack vlans to hieradata for lvs

https://gerrit.wikimedia.org/r/989566

Change 989585 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] hiera: temporarily set bgp-med to 101 for lvs2013

https://gerrit.wikimedia.org/r/989585

Change 989585 merged by Ssingh:

[operations/puppet@production] hiera: temporarily set bgp-med to 101 for lvs2013

https://gerrit.wikimedia.org/r/989585

All work on this is complete. lvs2014 was made active for several hours with no issues.

Change 989534 abandoned by Ssingh:

[operations/dns@master] depool codfw: do not merge! emergency depool patch

Reason:

this wasn't required, thankfully

https://gerrit.wikimedia.org/r/989534