
Decom lsw1-a1-codfw
Closed, Resolved (Public)

Description

It has been decided that codfw rack A1 should be a "network" rack, containing core network devices like cr1-codfw and ssw1-a1-codfw, but not containing any servers. As such we do not need a 'leaf' switch in this rack, so the plan is to decom lsw1-a1-codfw and re-use this device for the upcoming codfw row c/d upgrade as lsw1-d1-codfw.

At a high level I believe we need to do the following:

  • Downtime lsw1-a1-codfw
  • Remove lsw1-a1-codfw from LibreNMS
  • Remove lsw1-a1-codfw from route-reflector config in codfw
  • Remove configuration from ssw1-a1-codfw and ssw1-a8-codfw ports connecting to lsw1-a1-codfw in Netbox
  • Push updated config to ssw1-a1-codfw and ssw1-a8-codfw to remove BGP peerings, OSPF and interfaces
  • Remove configuration of lsw1-a1-codfw interfaces (apart from em0 and loopbacks) in Netbox
  • Push updated config to lsw1-a1-codfw to remove its ssw uplink config, BGP peerings etc.
  • Remove puppet references to lsw1-a1-codfw to remove from monitoring
  • Remove all references to private1-a1-codfw vlan from puppet (including lvs sub-interfaces)
  • Remove all references to private1-a1-codfw vlan and networks from netbox
  • Rename lsw1-a1-codfw to lsw1-d1-codfw in netbox
  • Update dns names on mgmt and loopback IPs to new hostname
  • Manually change hostname on device
  • Set lsw1-d1-codfw to 'planned' status in netbox
  • Remove lsw1-a1-codfw from devices.yaml in homer public repo
  • Remove all physical cabling from lsw1-a1-codfw, and optics from both sides of any links
  • Move to rack d1 and reconnect to mgmt network
  • Update rack location in Netbox

At this point the device should be ready to have new configuration elements (interfaces, IPs etc.) added for its new life in the other rack.
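The Netbox part of the list above (rename the device, then set it to 'planned') could be scripted along these lines. This is only a sketch under stated assumptions: `rename_and_park` and `FakeDevice` are hypothetical names, the starting status "active" is assumed, and the record is assumed to behave like a pynetbox-style object whose attributes can be assigned and then persisted with `save()`.

```python
def rename_and_park(device, new_name, status="planned"):
    """Rename a Netbox-style device record and mark it as planned.

    `device` is assumed to expose `name`, `status` and `save()`,
    in the way a pynetbox record does. Nothing here talks to a real
    Netbox instance.
    """
    old_name = device.name
    device.name = new_name
    device.status = status
    device.save()  # persist both changes in one write
    return old_name, new_name


class FakeDevice:
    """Stand-in record so the sketch runs without Netbox (hypothetical)."""

    def __init__(self, name, status):
        self.name = name
        self.status = status
        self.saved = False

    def save(self):
        self.saved = True
        return True


if __name__ == "__main__":
    dev = FakeDevice("lsw1-a1-codfw", "active")  # "active" is an assumption
    print(rename_and_park(dev, "lsw1-d1-codfw"))
    print(dev.status, dev.saved)
```

Against a real Netbox, the record would presumably come from something like `nb.dcim.devices.get(name="lsw1-a1-codfw")` in pynetbox. The DNS names on the mgmt and loopback IPs and the rack location are separate steps and are not covered here.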

Related Objects

Status    Assigned
Open      None
Resolved  Papaul

Event Timeline

cmooney triaged this task as Medium priority. Fri, May 3, 10:25 AM
cmooney created this task.

Icinga downtime and Alertmanager silence (ID=b27eb80b-98ee-43fb-8026-b02b3e00b5d4) set by cmooney@cumin1002 for 14 days, 0:00:00 on 3 host(s) and their services with reason: device being decommed and renamed, downtiming as a precaution first

lsw1-a1-codfw,lsw1-a1-codfw IPv6,lsw1-a1-codfw.mgmt

Change #1026821 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/puppet@production] Remove entries for lsw1-a1-codfw and private1-a1-codfw

https://gerrit.wikimedia.org/r/1026821

Change #1026823 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/homer/public@master] Remove lsw1-a1-codfw from production

https://gerrit.wikimedia.org/r/1026823

Device has been removed from LibreNMS now. I also downtimed it for 2 weeks just in case I mess up the order of anything.

Change #1026823 abandoned by Cathal Mooney:

[operations/homer/public@master] Remove lsw1-a1-codfw from production

Reason:

need to remove from IBGP before deleting device itself

https://gerrit.wikimedia.org/r/1026823
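The ordering constraint behind this abandon (remove the device from the IBGP/route-reflector config before deleting the device entry itself) can be written down as a small dependency check. A minimal sketch, assuming illustrative step labels that are not names used by homer or any other tool:

```python
from graphlib import TopologicalSorter

# Hypothetical step labels. Each key maps to the steps that must be
# completed before it. Only the one constraint stated in the abandon
# reason is encoded here.
DEPENDS_ON = {
    "remove_from_devices_yaml": {"remove_from_rr_config"},
}


def safe_order(depends_on):
    """Return one ordering of the steps that satisfies every dependency."""
    return list(TopologicalSorter(depends_on).static_order())


def respects(order, depends_on):
    """Check whether a proposed ordering runs every prerequisite first."""
    position = {step: i for i, step in enumerate(order)}
    return all(
        position[dep] < position[step]
        for step, deps in depends_on.items()
        for dep in deps
    )


if __name__ == "__main__":
    print(safe_order(DEPENDS_ON))
    # The order attempted by the abandoned change fails the check.
    print(respects(["remove_from_devices_yaml", "remove_from_rr_config"], DEPENDS_ON))
```

Further constraints from the checklist could be added as extra entries in `DEPENDS_ON`. `graphlib` is in the standard library from Python 3.9.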

Change #1026847 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/homer/public@master] Remove lsw1-a1-codfw from EVPN RR cluster config

https://gerrit.wikimedia.org/r/1026847

Mentioned in SAL (#wikimedia-operations) [2024-05-03T11:44:14Z] <topranks> Removing connections from ssw1-a1-codfw and ssw1-a8-codfw to lsw1-a1-codfw T364097

Change #1026847 merged by jenkins-bot:

[operations/homer/public@master] Remove lsw1-a1-codfw from EVPN RR cluster config

https://gerrit.wikimedia.org/r/1026847

Mentioned in SAL (#wikimedia-operations) [2024-05-03T12:06:33Z] <topranks> removing entries for lsw1-a1-codfw switch and private1-a1-codfw vlan from puppet T364097

Change #1026821 merged by Cathal Mooney:

[operations/puppet@production] Remove entries for lsw1-a1-codfw and private1-a1-codfw

https://gerrit.wikimedia.org/r/1026821

Change #1026911 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/dns@master] Remove include statement for old private1-a1-codfw range

https://gerrit.wikimedia.org/r/1026911

Change #1026911 merged by Cathal Mooney:

[operations/dns@master] Remove include statement for old private1-a1-codfw range

https://gerrit.wikimedia.org/r/1026911

cmooney updated the task description.

@Papaul I think this one is ready to be moved to rack D1 now.

Change #1026928 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/homer/public@master] Remove lsw1-a1-codfw from homer vars

https://gerrit.wikimedia.org/r/1026928

Change #1026928 merged by jenkins-bot:

[operations/homer/public@master] Remove lsw1-a1-codfw from homer vars

https://gerrit.wikimedia.org/r/1026928

Papaul updated the task description.