codfw:expansion: Network devices/patch panel wiring
Closed, ResolvedPublic

Assigned To
Authored By
Papaul
Dec 16 2024, 12:37 AM
Referenced Files
F71710215: Fr-Tech Eqiad Feb 2026-Frack MGMT.png
Fri, Feb 6, 6:14 PM
Restricted File
Mar 26 2025, 4:20 AM
F58922230: codfw_expansion-patch_panel_detail.jpg
Mar 26 2025, 3:46 AM
F58217796: codfw_expansion-spines-leaves_connections.jpg
Jan 17 2025, 4:03 AM
F58022954: codfw_expansion-patch_cable_connections.jpg
Dec 16 2024, 12:42 AM
F58022948: codfw_expansion-patch_cable_connections.jpg
Dec 16 2024, 12:37 AM

Description

This task will track the wiring of network devices in DH5 (spines, frack access switches, main management switch) to the network devices in DH7 (core routers, 4 spine switches, 2 frack firewalls and the management switch).

The diagram below outlines the number and type of transceivers needed for the connections.

@cmooney when you have time can you please look at this and let me know if I am missing something or if we have to remove, add or modify something. Note: The frack switches are connected directly to the firewalls and have no connections to the other frack switches because, according to Fundraising, all the Fundraising analytics servers will move to F5, only analytics servers will be in that rack, and there will be no inter-vlan traffic between those servers and the other Fundraising servers.

@RobH the diagram outlines the number and type of transceivers needed. I will provide the fiber information after our meeting with CY1 this coming Wednesday since I don't know yet where the patch panel will land.

Thanks

codfw_expansion-patch_cable_connections.jpg (746×790 px, 97 KB)

Related Objects

StatusSubtypeAssignedTask
ResolvedPapaul

Event Timeline

Papaul updated the task description.

Thanks for the diagram @Papaul. Overall looks fine thanks.

FR-Tech

Probably worth catching up with the frack guys to go over the setup but based on your description the setup there looks like the best way to do it for sure. Frack changing from a single rack in each DC was not something I was aware of, that will have impact on T379553 although I'm sure nothing impossible. Might be worth having a call between all of us to discuss both issues.

We should probably use 25G SFP28 modules instead of the SFP+ links to the payment firewalls from the new switches though, and connect to et-0/1/0 and et-0/1/1 (like these). Or we could move the existing fr-tech switch uplinks to 25G (matching eqiad) and use 10G for these new fr-tech analytics switches. Let's discuss with the fundraising SREs, but either way here let's order the 25G.

We need a 2x100G trunk link between the new frack switches, so make sure we order optics/DACs for that. But it doesn't need to be on this diagram of connections back to the old cage.
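As a rough aid for the ordering discussion above, here is a minimal Python sketch that tallies optics or DACs from a list of links. Only the 2x100G trunk and the 25G SFP28 choice come from this thread; the device names, the number of 25G links, and the `optics_needed` helper are assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical link list: (a_end, b_end, media, number_of_links).
# Device names and the 25G link counts are illustrative assumptions.
LINKS = [
    ("fasw-new-a", "fasw-new-b", "100G", 2),   # 2x100G trunk between the new frack switches
    ("fasw-new-a", "fw-1", "25G-SFP28", 1),    # assumed uplink to a firewall
    ("fasw-new-b", "fw-2", "25G-SFP28", 1),    # assumed uplink to a firewall
]

def optics_needed(links, use_dac=frozenset()):
    """Count parts to order: two optics per optical link, one cable per DAC link."""
    totals = Counter()
    for _a_end, _b_end, media, count in links:
        if media in use_dac:
            totals[f"{media} DAC"] += count
        else:
            totals[f"{media} optic"] += 2 * count
    return totals

if __name__ == "__main__":
    print(dict(optics_needed(LINKS)))
    print(dict(optics_needed(LINKS, use_dac={"100G"})))
```

Passing `use_dac={"100G"}` models the case where the trunk between the two switches is short enough for DACs, so it counts one cable per link instead of two optics.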

Lastly is there a view on what switch vendor we'll use for the fr-tech cab(s)? I don't expect any issues if we went with Nokia but we weren't targeting any layer-2 based config for our Nokia automation so the sooner we confirm the better on that.

40G links?

Why do we have 40GBase-LR4 modules listed? Are we re-using some old switch for msw2-codfw? Be better if we were installing new equipment and used 100G links instead if that was possible.

New core router line card

@wiki_willy I don't see a task on procurement for network devices in the new cage yet so mentioning this here. To support the additional connectivity from the Spine switches in the new cage to our CR routers we will need to order a Juniper MPC10E-10C line-card and two new SCBE3-MX switch control boards. The new cards will go into cr1-codfw, and we will take the existing MPC7E from that router and put it in cr2-codfw to give us the extra ports needed there. It's planned out in the below sheet:

https://docs.google.com/spreadsheets/d/1-5fzirhBtlTSQetv6iWDyCQfHNAF0V9ca5nftDGzrv0

This was listed on the draft budget we prepped for the expansion here also:

https://docs.google.com/spreadsheets/d/1DxWMIDa7U-BxlqI2K8iKBWwtBrJOjR2J0AuTOsV4JPw

@cmooney thank you for the review.

For the fundraising rack I have a separate racking and installation task coming up where I will put all the details. Like you said, I was waiting for us to meet with the team before making that task.

"Why do we have 40GBase-LR4 modules listed?" Because the msw2 will connect to the msw1, which is just an EX4300 and doesn't have any 100G interfaces. So if we go the path of having a new msw2 in the new cage, that means we will also have to replace the msw1 in the old cage.

@cmooney also just keep in mind that this task is to have an overview of how things will be connected. The type of optic and other elements will be based on the decision we make about going with Juniper or Nokia.

Thanks @Papaul for the feedback.

"Why do we have 40GBase-LR4 modules listed?" Because the msw2 will connect to the msw1, which is just an EX4300 and doesn't have any 100G interfaces. So if we go the path of having a new msw2 in the new cage, that means we will also have to replace the msw1 in the old cage.

Ah yeah that makes perfect sense thanks.
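The reasoning above (the link speed is capped by what the older end supports) can be sketched in Python as follows. The speed sets are illustrative assumptions rather than datasheet values, and the msw2 entry is a hypothetical new switch. The only fact taken from this thread is that msw1 is an EX4300 with no 100G interfaces.

```python
from typing import Optional

# Assumed port speeds in Gbit/s per device, for illustration only.
SUPPORTED_SPEEDS = {
    "msw1": {1, 10, 40},     # existing EX4300: no 100G interfaces
    "msw2": {10, 40, 100},   # hypothetical new switch in the new cage
}

def fastest_common_speed(a: str, b: str) -> Optional[int]:
    """Return the highest speed both devices support, or None if they share none."""
    common = SUPPORTED_SPEEDS[a] & SUPPORTED_SPEEDS[b]
    return max(common) if common else None

if __name__ == "__main__":
    print(fastest_common_speed("msw1", "msw2"))
```

Under these assumptions the common maximum is 40G, which is why moving the link to 100G would also require replacing msw1.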

@cmooney also just keep in mind that this task is to have an overview of how things will be connected. The type of optic and other elements will be based on the decision we make about going with Juniper or Nokia.

Outside of the question of how the optics are coded (i.e. if we need vendor-specific coding for them) I think the cabling and optics should be the exact same whether we go with Juniper or Nokia. I suggested on the emails and task about the Nokia gear that we go ahead and order all the cables and optics for the new cage in advance, we should be able to use the same for either vendor.

Papaul triaged this task as Medium priority.Dec 19 2024, 2:24 PM
Papaul added a project: ops-codfw.

I am adding also here the spines/leaves connection diagram for reference.

codfw_expansion-spines-leaves_connections.jpg (611×901 px, 52 KB)

Papaul mentioned this in Unknown Object (Task).Jan 17 2025, 4:38 AM

Just coming back, I'm also curious about the upcoming FR-tech changes, is that discussed somewhere ?

Other than that, +1 on everything that has been said here, it lgtm !

@ayounsi welcome back. The only information we have right now is that frack will have 1 more rack in the new cage and it will hold all the fundraising analytics nodes. I sent out the invite to the fundraising team to join tomorrow's meeting so they can provide us with more information and we can ask all the questions we have.

On the other hand I have a question for @ayounsi and @cmooney
We haven't decided on the network type we are going to use for the new cage (EVPN/VXLAN or flat BGP network). Can we please discuss this topic tomorrow as well and come to a final decision, so that if we have to make any changes to the diagram, wiring and procurement we can do it soon? This will also help us get a good layout of things. Thank you

On the other hand I have a question for @ayounsi and @cmooney
We haven't decided on the network type we are going to use for the new cage (EVPN/VXLAN or flat BGP network). Can we please discuss this topic tomorrow as well and come to a final decision, so that if we have to make any changes to the diagram, wiring and procurement we can do it soon?

Either way we should not need to change the wiring diagram based on that.

patch panel detail connection diagram

{F58922428}

Ports 23/24 and 25/26 on this diagram were supposed to be connected to the new fasw's in rack F5. Since we decided to move one of the firewalls from x in DH7 to F5 in DH5, we changed the port connections. Ports 23/24 and 25/26 are now used for the HA ports for the firewalls. See link below
https://phabricator.wikimedia.org/T401297

Change #1144625 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/dns@master] Add INCLUDE files for new IPv6 addresses in use in codfw

https://gerrit.wikimedia.org/r/1144625

Change #1144625 merged by Cathal Mooney:

[operations/dns@master] Add INCLUDE files for new IPv6 addresses in use in codfw

https://gerrit.wikimedia.org/r/1144625

@Papaul @ayounsi a little heads up here I ended up reverting to my original plan for the fr-tech mgmt connectivity during the migration we just completed in eqiad.

Fr-Tech Eqiad Feb 2026-Frack MGMT.png (862×1 px, 90 KB)

I know as we had discussed previously we agreed to hang the MGMT vlan off the fasw switches. However I wasn't 100% sure how you intended this to work. The fasw switches in each rack terminate on separate rethX interfaces on the firewalls. So we would need to have a separate mgmt vlan and subnet for each rack, with the gateway for one on reth0 and the gateway for the other on reth1.

I had not properly thought that out in advance of our move, and given we only got started on Thursday of our move-week I did not want to throw the renumbering of half the hosts' iDRAC IPs into the mix at a late stage. Plus we'd not communicated to the fr-tech SREs that any IPs would need to change, so it seemed like a big last-minute thing to drop.

I'm happy whatever way we do this. We can copy the above when we get to codfw if that is workable (I know you aren't too fond of the cross-rack link Papaul). Otherwise we can redesign the eqiad side to have separate vlans and subnets hanging off the fasw's instead, and renumber one rack. Let me know what you think.
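For reference, the per-rack alternative (a separate mgmt vlan and subnet per rack, with one gateway on reth0 and the other on reth1) could be sketched like this in Python. The prefix is a documentation range (RFC 5737), and the rack names and `plan_mgmt_subnets` helper are hypothetical. None of these are real fr-tech values.

```python
import ipaddress

# Assumed example prefix from the RFC 5737 documentation range, not a real mgmt subnet.
MGMT_PREFIX = ipaddress.ip_network("192.0.2.0/24")

# Hypothetical rack names mapped to the firewall interface their fasw terminates on.
RACK_TO_RETH = {"rack-1": "reth0", "rack-2": "reth1"}

def plan_mgmt_subnets(prefix, rack_to_reth):
    """Split the prefix in half and give each rack its own subnet and gateway."""
    halves = prefix.subnets(prefixlen_diff=1)  # yields two equal-sized subnets
    plan = {}
    for (rack, reth), subnet in zip(sorted(rack_to_reth.items()), halves):
        plan[rack] = {
            "subnet": subnet,
            "gateway_interface": reth,
            # Assumption: the gateway takes the first usable address in the subnet.
            "gateway_ip": subnet.network_address + 1,
        }
    return plan

if __name__ == "__main__":
    for rack, entry in plan_mgmt_subnets(MGMT_PREFIX, RACK_TO_RETH).items():
        print(rack, entry["subnet"], entry["gateway_interface"], entry["gateway_ip"])
```

In this sketch only the hosts in the second rack land in a new subnet, which matches the "renumber one rack" cost mentioned above.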

@cmooney thanks for bringing this up and finding the issue. I didn't really think about the mgmt vlan until now, reading your comment. What you say about having the MGMT vlan hang off the firewall makes sense in this scenario. If we go this path, that means we will need a long fiber run for 1G from rack x, where the other fmsw is, to rack A1, where the patch panel is in DH7, and then another fiber in DH5 from E3, where the patch panel is, to F5, where the second fmsw is, to connect both fmsw's. This is a quick fix for this issue, with no big changes on our side or the frack side.
I did a tutorial similar to this where I set up irb interfaces on the core switches, keeping the gateway on the firewall and using it as the virtual-gateway-address on the irb interface. See the link below, though maybe this is too complex (DON'T LOOK AT THE EVPN LOL). We can also discuss more during the next meeting. Thank you
https://papaulgigitech.com/index.php?title=Juniper_Collapsed_Spine_with_EVPN

Papaul mentioned this in Unknown Object (Task).Tue, Feb 10, 11:53 PM

CODFW expansion is complete so we can close this task.