Eqiad: Fr-tech expansion
Closed, Resolved, Public

Description

High-level task to manage the work to add a new fr-tech rack in eqiad, configure the network, and make it ready for the new servers.

This task will also include the outline of the scope of work and all hosts and network gear being moved.

This Google Sheet is our primary move/planning document.

Prep Checklist

Migration Date / Time

2026-02-05 @ 10:30AM Eastern / 15:30 UTC
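
As a quick sanity check on the two times quoted above, the following sketch converts the Eastern wall-clock time to UTC. It assumes Eastern Standard Time (UTC-5) is in effect on that date, which holds for early February; the fixed offset and the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Eastern Standard Time (UTC-5) applies on 2026-02-05,
# since daylight saving time is not in effect in early February.
EST = timezone(timedelta(hours=-5), name="EST")

migration_start_eastern = datetime(2026, 2, 5, 10, 30, tzinfo=EST)
migration_start_utc = migration_start_eastern.astimezone(timezone.utc)

print(migration_start_utc.strftime("%Y-%m-%d %H:%M UTC"))  # 2026-02-05 15:30 UTC
```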

Migration Day Cadence

  • Power up the fasw2[ab]-e16 switches to confirm that serial and mgmt work.
  • Netops and Fundraising check in before work starts.
  • Move pfw1b-eqiad first, as it will land in E16 and we can test the router-to-E16 cross-connection.
  • Move all other items with move priority 1.
  • Netops check in to set up the newly migrated firewalls and switches.
  • Migrate move priority 2 hosts.
  • Migrate move priority 3 hosts.

Related Objects

| Status | Assigned |
| --- | --- |
| Resolved | VRiley-WMF |
| Open | VRiley-WMF |
| Invalid | Jclark-ctr |

Event Timeline

cmooney triaged this task as Medium priority.
cmooney mentioned this in Unknown Object (Task). (Sep 9 2025, 4:23 PM)
cmooney added a subtask: Unknown Object (Task).
Jclark-ctr closed subtask Unknown Object (Task) as Resolved. (Sep 17 2025, 10:45 AM)
Jclark-ctr closed subtask Unknown Object (Task) as Resolved. (Sep 30 2025, 12:19 PM)
VRiley-WMF closed subtask Unknown Object (Task) as Resolved. (Nov 26 2025, 11:29 PM)

So, to put more shape on this, the two new racks for Fundraising are as follows:

Racks
Renamed Devices

We will rename some devices as follows:

| Device Old Name | Device New Name |
| --- | --- |
| fasw1-c1a-eqiad | fasw1-e15a-eqiad |
| fasw1-c1b-eqiad | fasw1-e15b-eqiad |
| fmsw1-c1-eqiad | fmsw1-e15-eqiad |
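
The renames above all follow one pattern: the rack component of the name changes from c1 to e15 while the role, the optional a/b member suffix, and the site stay the same. A minimal sketch of that pattern, where the function name and the regular expression are illustrative assumptions and not part of any existing tooling:

```python
import re

# Old -> new names copied from the rename table above.
RENAMES = {
    "fasw1-c1a-eqiad": "fasw1-e15a-eqiad",
    "fasw1-c1b-eqiad": "fasw1-e15b-eqiad",
    "fmsw1-c1-eqiad": "fmsw1-e15-eqiad",
}


def rename_for_rack(name: str, old_rack: str = "c1", new_rack: str = "e15") -> str:
    """Swap the rack part of a '<role>-<rack>[a|b]-<site>' device name.

    Names that do not match the pattern are returned unchanged.
    """
    pattern = rf"^([a-z]+\d+)-{re.escape(old_rack)}([ab]?)-([a-z]+)$"
    return re.sub(pattern, rf"\1-{new_rack}\2-\3", name)


# Check the helper against every row of the table.
for old_name, new_name in RENAMES.items():
    assert rename_for_rack(old_name) == new_name
```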
Connections

The connections from the firewall devices largely remain the same, except the cable lengths need to be different given the new locations.

Serial Console Connections:

| Device 1 | Front Port | Logical Int | Device 2 | Front Port | Logical Int | Cable Type | Desc |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pfw1a-eqiad | CON | N/A | scs console | ???? | N/A | RJ45 patch | Serial console access |
| pfw1b-eqiad | CON | N/A | scs console | ???? | N/A | RJ45 patch | Serial console access |
| fasw2-e15a-eqiad | CON | N/A | scs console | ???? | N/A | RJ45 patch | Serial console access |
| fasw2-e15b-eqiad | CON | N/A | scs console | ???? | N/A | RJ45 patch | Serial console access |
| fasw2-e16a-eqiad | CON | N/A | scs console | ???? | N/A | RJ45 patch | Serial console access |
| fasw2-e16b-eqiad | CON | N/A | scs console | ???? | N/A | RJ45 patch | Serial console access |

Firewall <-> Firewall links

| Device 1 | Front Port | Logical Int | Device 2 | Front Port | Logical Int | Cable Type | Desc |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pfw1a-eqiad | HA 0 | N/A | pfw1b-eqiad | HA 0 | N/A | 1G SFP link | Cluster control link #1 |
| pfw1a-eqiad | HA 1 | N/A | pfw1b-eqiad | HA 1 | N/A | 1G SFP link | Cluster control link #2 |
| pfw1a-eqiad | 20 | xe-0/2/2 | pfw1b-eqiad | 20 | xe-7/2/2 | 10G SFP+ link | Cluster fabric link #1 |
| pfw1a-eqiad | 21 | xe-0/2/3 | pfw1b-eqiad | 21 | xe-7/2/3 | 10G SFP+ link | Cluster fabric link #2 |

WMF Mgmt Network Links E15:

| Device 1 | Front Port | Logical Int | Device 2 | Front Port | Logical Int | Cable Type | Desc |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pfw1a-eqiad | MGMT | fxp0 | msw-e15-eqiad | (any free port) | N/A | RJ45 patch | WMF Management network |
| fasw2-e15a-eqiad | C0 | em0 | msw-e15-eqiad | (any free port) | N/A | RJ45 patch | WMF Management network |
| fasw2-e15b-eqiad | C0 | em0 | msw-e15-eqiad | (any free port) | N/A | RJ45 patch | WMF Management network |

WMF Mgmt Network Links E16:

| Device 1 | Front Port | Logical Int | Device 2 | Front Port | Logical Int | Cable Type | Desc |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pfw1b-eqiad | MGMT | fxp0 | msw-e16-eqiad | (any free port) | N/A | RJ45 patch | WMF Management network |
| fasw2-e16a-eqiad | C0 | em0 | msw-e16-eqiad | (any free port) | N/A | RJ45 patch | WMF Management network |
| fasw2-e16b-eqiad | C0 | em0 | msw-e16-eqiad | (any free port) | N/A | RJ45 patch | WMF Management network |

Fr-tech management switch uplinks:

| Device 1 | Front Port | Logical Int | Device 2 | Front Port | Logical Int | Cable Type | Desc |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pfw1a-eqiad | 0 | ge-0/0/0 | fmsw-e15-eqiad | (any free port) | N/A | RJ45 patch | Downstream connectivity to fmsw-e15 #1 (reth2) |
| pfw1b-eqiad | 0 | ge-7/0/0 | fmsw-e15-eqiad | (any free port) | N/A | RJ45 patch | Downstream connectivity to fmsw-e15 #2 (reth2) |
| pfw1a-eqiad | 0 | ge-0/0/1 | fmsw-e16-eqiad | (any free port) | N/A | RJ45 patch | Downstream connectivity to fmsw-e16 #1 (reth3) |
| pfw1b-eqiad | 0 | ge-7/0/1 | fmsw-e16-eqiad | (any free port) | N/A | RJ45 patch | Downstream connectivity to fmsw-e16 #2 (reth3) |

Fundraising switch links rack E15:

| Device 1 | Front Port | Logical Int | Device 2 | Front Port | Logical Int | Cable Type | Desc |
| --- | --- | --- | --- | --- | --- | --- | --- |
| fasw2-e15a-eqiad | 54 | et-0/0/54 | fasw2-e15b-eqiad | 54 | et-0/0/54 | 0.5m 100G QSFP28 DAC | Trunk between E15 switches LAG port 1 |
| fasw2-e15a-eqiad | 55 | et-0/0/55 | fasw2-e15b-eqiad | 55 | et-0/0/55 | 0.5m 100G QSFP28 DAC | Trunk between E15 switches LAG port 2 |
| fasw2-e15a-eqiad | 47 | et-0/0/47 | pfw1a-eqiad | 17 | et-0/1/1 | 25G SFP28 link | fasw2-e15a firewall uplink (reth0) |
| fasw2-e15b-eqiad | 47 | et-0/0/47 | pfw1b-eqiad | 17 | et-7/1/1 | 25G SFP28 link | fasw2-e15b firewall uplink (reth0) |

Fundraising switch links rack E16:

| Device 1 | Front Port | Logical Int | Device 2 | Front Port | Logical Int | Cable Type | Desc |
| --- | --- | --- | --- | --- | --- | --- | --- |
| fasw2-e16a-eqiad | 54 | et-0/0/54 | fasw2-e16b-eqiad | 54 | et-0/0/54 | 0.5m 100G QSFP28 DAC | Trunk between E16 switches LAG port 1 |
| fasw2-e16a-eqiad | 55 | et-0/0/55 | fasw2-e16b-eqiad | 55 | et-0/0/55 | 0.5m 100G QSFP28 DAC | Trunk between E16 switches LAG port 2 |
| fasw2-e16a-eqiad | 47 | et-0/0/47 | pfw1a-eqiad | 16 | et-0/1/0 | 25G SFP28 link | fasw2-e16a firewall uplink (reth1) |
| fasw2-e16b-eqiad | 47 | et-0/0/47 | pfw1b-eqiad | 16 | et-7/1/0 | 25G SFP28 link | fasw2-e16b firewall uplink (reth1) |

Core Router uplinks:

| Device 1 | Front Port | Logical Int | Device 2 | Front Port | Logical Int | Cable Type | Desc |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pfw1a-eqiad | 18 | xe-0/2/0 | cr1-eqiad | 3/1/7 | xe-3/1/7 | 10GBase-LR SFP+ single-mode patch | cr1-eqiad uplink |
| pfw1b-eqiad | 18 | xe-7/2/0 | cr2-eqiad | 3/1/7 | xe-3/1/7 | 10GBase-LR SFP+ single-mode patch | cr2-eqiad uplink |
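
Across the link tables above, every pfw1b-eqiad logical interface is the matching pfw1a-eqiad name with the first slot number changed from 0 to 7 (for example xe-0/2/2 and xe-7/2/2). The sketch below encodes that pattern as read from the tables, which can help when cross-checking labels on migration day. The offset of 7 and the function name are assumptions for illustration, not a statement about how the platform numbers interfaces in general.

```python
import re

# Assumption: the offset of 7 is inferred from the tables in this task only.
NODE1_SLOT_OFFSET = 7


def node1_interface(node0_interface: str, offset: int = NODE1_SLOT_OFFSET) -> str:
    """Return the pfw1b-side name for a pfw1a-side interface such as 'xe-0/2/2'."""
    match = re.fullmatch(r"([a-z]+)-(\d+)/(\d+)/(\d+)", node0_interface)
    if match is None:
        raise ValueError(f"unrecognised interface name: {node0_interface!r}")
    media, slot, pic, port = match.groups()
    return f"{media}-{int(slot) + offset}/{pic}/{port}"


# (pfw1a interface, pfw1b interface) pairs copied from the tables above.
PAIRS = [
    ("xe-0/2/2", "xe-7/2/2"),  # cluster fabric link #1
    ("ge-0/0/1", "ge-7/0/1"),  # fmsw-e16 downstream (reth3)
    ("et-0/1/0", "et-7/1/0"),  # E16 firewall uplink (reth1)
    ("xe-0/2/0", "xe-7/2/0"),  # core router uplink
]
for a_side, b_side in PAIRS:
    assert node1_interface(a_side) == b_side
```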

For the scs console server, I believe it would be the one located in F8, is that correct?

Yeah, it's usually whatever is nearest, so use whichever is easiest to run the links to.

Correction:

WMF Mgmt Network Links E15:

| Device 1 | Front Port | Logical Int | Device 2 | Front Port | Logical Int | Cable Type | Desc |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pfw1a-eqiad | MGMT | fxp0 | msw-e15-eqiad | (any free port) | N/A | RJ45 patch | WMF Management network |
| fasw2-e15a-eqiad | C0 | em0 | msw-e15-eqiad | (any free port) | N/A | RJ45 patch | WMF Management network |
| fasw2-e15b-eqiad | C0 | em0 | msw-e15-eqiad | (any free port) | N/A | RJ45 patch | WMF Management network |

WMF Mgmt Network Links E16:

| Device 1 | Front Port | Logical Int | Device 2 | Front Port | Logical Int | Cable Type | Desc |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pfw1b-eqiad | MGMT | fxp0 | msw-e16-eqiad | (any free port) | N/A | RJ45 patch | WMF Management network |
| fasw2-e16a-eqiad | C0 | em0 | msw-e16-eqiad | (any free port) | N/A | RJ45 patch | WMF Management network |
| fasw2-e16b-eqiad | C0 | em0 | msw-e16-eqiad | (any free port) | N/A | RJ45 patch | WMF Management network |

Where it reads msw-e15-eqiad we need to use msw-e7-eqiad, and where it reads msw-e16-eqiad we need to use msw-e8-eqiad. We also need to add the (1) PDU mgmt link from each of E15 and E16 back to msw-e7-eqiad and msw-e8-eqiad respectively.
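
To make the correction concrete, here is a small sketch that applies the two substitutions to the planned management links. The link tuples are copied from the tables in this task, with the site suffix spelled consistently as eqiad; the data layout and function name are assumptions for illustration, and the PDU mgmt links are left out because their ports are not listed here.

```python
# Substitutions taken from the correction above.
MGMT_SWITCH_CORRECTIONS = {
    "msw-e15-eqiad": "msw-e7-eqiad",
    "msw-e16-eqiad": "msw-e8-eqiad",
}

# (device, front port, logical interface, originally planned mgmt switch)
PLANNED_MGMT_LINKS = [
    ("pfw1a-eqiad", "MGMT", "fxp0", "msw-e15-eqiad"),
    ("fasw2-e15a-eqiad", "C0", "em0", "msw-e15-eqiad"),
    ("fasw2-e15b-eqiad", "C0", "em0", "msw-e15-eqiad"),
    ("pfw1b-eqiad", "MGMT", "fxp0", "msw-e16-eqiad"),
    ("fasw2-e16a-eqiad", "C0", "em0", "msw-e16-eqiad"),
    ("fasw2-e16b-eqiad", "C0", "em0", "msw-e16-eqiad"),
]


def apply_corrections(links):
    """Return the links with each mgmt switch replaced by its corrected name."""
    return [
        (device, port, interface, MGMT_SWITCH_CORRECTIONS.get(switch, switch))
        for device, port, interface, switch in links
    ]


corrected = apply_corrections(PLANNED_MGMT_LINKS)
for device, port, interface, switch in corrected:
    print(f"{device} {port} ({interface}) -> {switch}")
```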

RobH mentioned this in Unknown Object (Task). (Jan 16 2026, 8:56 PM)
RobH added a subtask: Unknown Object (Task).
Jclark-ctr closed subtask Unknown Object (Task) as Resolved. (Jan 30 2026, 3:56 PM)

I have moved the equipment that is currently in E16 to the recommended locations according to the Google Doc.

I've taken the comment T403035#11518969 and updated it with our meeting plan of moving mgmt links to E7/E8.

The Networking Links tab of the Google Sheet has the updated info.

In terms of the moves here, what makes sense from the netops point of view is to tackle in this order:

  1. Move both pfw1 units
    1. Connect links to serial, mgmt and core routers
    2. Connect links to new fasw switches in rack E16
  2. Move both fasw from x to E15
    1. Connect links to serial, mgmt and the pfws
  3. Move fmsw-c1-eqiad from x to E15
    1. Connect uplink to fasw2-e15a-eqiad port 0 (1G SFP-T required)
  4. Begin moving hosts, starting with:
    1. frbast1002
    2. frmon1002
    3. frauth1002
    4. frpm1002
    5. frlog1002
    6. frav1003

The remaining hosts can be tackled in any order but the above are used for remote access and monitoring so it helps if they are the first to come online.
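
The host ordering above can be sketched as a stable sort: the six remote-access and monitoring hosts come first in the listed order, and everything else keeps whatever order it was given in. The function name and the example host names marked as hypothetical are illustrative assumptions.

```python
# Hosts to bring online first, in the order listed above.
PRIORITY_HOSTS = [
    "frbast1002",
    "frmon1002",
    "frauth1002",
    "frpm1002",
    "frlog1002",
    "frav1003",
]


def move_order(hosts):
    """Return hosts with priority hosts first, the rest in their input order."""
    rank = {name: index for index, name in enumerate(PRIORITY_HOSTS)}
    # sorted() is stable, so hosts sharing the fallback rank keep their order.
    return sorted(hosts, key=lambda host: rank.get(host, len(PRIORITY_HOSTS)))


# "example-host-1" and "example-host-2" are hypothetical placeholder names.
example = ["example-host-1", "frlog1002", "frbast1002", "example-host-2", "frmon1002"]
print(move_order(example))
```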

Change #1236779 had a related patch set uploaded (by Dwisehaupt; author: Dwisehaupt):

[operations/dns@master] frack: Update dns handles for frack colo work

https://gerrit.wikimedia.org/r/1236779

Change #1236779 merged by Dwisehaupt:

[operations/dns@master] frack: Update dns handles for frack colo work

https://gerrit.wikimedia.org/r/1236779

RobH updated the task description.

@VRiley-WMF I think there is a slight mix-up with frqueue1005 and frqueue1006. These are new unprovisioned servers racked in E15; however, they will be part of the fundraising vlan and so will need to go into rack E16 with the other frqueue hosts instead.
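
A mismatch like this one can be caught with a simple rack-versus-vlan check. The sketch below is hypothetical: the only rule taken from this task is that fundraising vlan hosts belong in rack E16, while the data layout, the function name, and the decision to skip vlans without a known rule are assumptions.

```python
# Only this rule comes from the task; further vlan/rack pairs would be assumptions.
EXPECTED_RACK_FOR_VLAN = {"fundraising": "E16"}


def rack_problems(hosts):
    """Return one message per host whose rack does not match its vlan's rack.

    Hosts in a vlan with no known rule are skipped rather than flagged.
    """
    problems = []
    for name, vlan, rack in hosts:
        expected = EXPECTED_RACK_FOR_VLAN.get(vlan)
        if expected is not None and rack != expected:
            problems.append(f"{name}: vlan {vlan!r} expects rack {expected}, found {rack}")
    return problems


# The two hosts and their current rack as described in the comment above.
hosts = [
    ("frqueue1005", "fundraising", "E15"),
    ("frqueue1006", "fundraising", "E15"),
]
for line in rack_problems(hosts):
    print(line)
```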

Mentioned in SAL (#wikimedia-fundraising) [2026-02-05T14:47:59Z] <dwisehaupt> downtimes scheduled for frack eqiad hosts and cross colo replication in prep for rack expansion - T403035

Icinga downtime and Alertmanager silence (ID=5da72ec9-7626-47d2-bc98-a871f93d717e) set by cmooney@cumin1003 for 1 day, 0:00:00 on 3 host(s) and their services with reason: fundraising migration eqiad

fasw2-c1a-eqiad,fasw2-c1b-eqiad,pfw1-eqiad

Mentioned in SAL (#wikimedia-operations) [2026-02-05T15:25:57Z] <topranks> deactivate BGP session from cr2-eqiad to pfw1b-eqiad fundraising migration T403035

Mentioned in SAL (#wikimedia-operations) [2026-02-05T16:37:06Z] <topranks> deactivate BGP session from cr1-eqiad to pfw1a-eqiad fundraising migration T403035

Change #1237270 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/puppet@production] Network: data.yaml - rename frack-fundraising vlan

https://gerrit.wikimedia.org/r/1237270

Change #1237270 merged by Cathal Mooney:

[operations/puppet@production] Network: data.yaml - rename frack-fundraising vlan

https://gerrit.wikimedia.org/r/1237270

Change #1237300 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/puppet@production] Change name and parent for fasw2-c1x-eqiad switches, moved to rack e15

https://gerrit.wikimedia.org/r/1237300

Change #1237300 merged by Cathal Mooney:

[operations/puppet@production] Change name and parent for fasw2-c1x-eqiad switches, moved to rack e15

https://gerrit.wikimedia.org/r/1237300

Change #1237301 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/puppet@production] Fundraising move: add new fasw devices in rack e16 to common.yaml

https://gerrit.wikimedia.org/r/1237301

Change #1237301 merged by Cathal Mooney:

[operations/puppet@production] Fundraising move: add new fasw devices in rack e16 to common.yaml

https://gerrit.wikimedia.org/r/1237301

Icinga downtime and Alertmanager silence (ID=785b501b-5e53-43b0-b903-5d93372eb8e1) set by cmooney@cumin1003 for 1 day, 0:00:00 on 2 host(s) and their services with reason: fundraising migration eqiad

fasw2-e15a-eqiad,fasw2-e15b-eqiad

Mentioned in SAL (#wikimedia-operations) [2026-02-05T19:34:26Z] <cmooney@cumin1003> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "rename fmsw1-c1-eqiad to fmsw1-e15-eqiad - cmooney@cumin1003 - T403035"

Mentioned in SAL (#wikimedia-operations) [2026-02-05T19:34:31Z] <cmooney@cumin1003> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "rename fmsw1-c1-eqiad to fmsw1-e15-eqiad - cmooney@cumin1003 - T403035"

Mentioned in SAL (#wikimedia-operations) [2026-02-05T19:36:51Z] <cmooney@cumin1003> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "rename fmsw1-c1-eqiad to fmsw1-e15-eqiad - cmooney@cumin1003 - T403035"

Mentioned in SAL (#wikimedia-operations) [2026-02-05T19:36:56Z] <cmooney@cumin1003> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "rename fmsw1-c1-eqiad to fmsw1-e15-eqiad - cmooney@cumin1003 - T403035"

@VRiley-WMF great work on these moves yesterday, I don't think it could have gone much smoother tbh :)

In terms of the cable labels, there are five network cables that need updating in Netbox. I left some notes on the Google Sheet about those five; they are the ones without a tick in 'netbox updated'.

Thanks, I will take a look at this. However, checking everything else, the entire move has been completed. I will work on some of the other entries.

Change #1237952 had a related patch set uploaded (by Jgreen; author: Jgreen):

[operations/dns@master] Return fundraising traffic to eqiad now that rack migration is complete.

https://gerrit.wikimedia.org/r/1237952

Change #1237952 merged by Jgreen:

[operations/dns@master] Return fundraising traffic to eqiad now that rack migration is complete.

https://gerrit.wikimedia.org/r/1237952

Change #1238379 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/software/netbox-extras@master] FR-Tech Provision Script: add some checks to validate rack for vlan

https://gerrit.wikimedia.org/r/1238379

This has been completed.