
Agree how to handle port-block speeds for QFX5120-48Y
Open, MediumPublic

Description

The Juniper QFX5120-48Y's 48 x SFP28 ports can run at 1, 10 or 25G, depending on what module is placed in them (SFP/SFP+/SFP28).

The port speeds have to be set in blocks of 4, however, i.e. to set port 0/0/1 to 25G we also need to set 0/0/0, 0/0/2 and 0/0/3 to that speed.

This obviously creates limitations for us as we use these switches and connect hosts at different speeds. Automation changes are definitely needed to configure the speeds in the chassis, and DC-Ops will need to understand the limitations to ensure server connections are compatible with the block of ports they are assigning.

In terms of Netbox:

  1. We should try to make the Netbox provisioning script reject an invalid port/speed combination
    • i.e. our interface automation should check the adjacent ports, and not allow ge-0/0/1 to be created if xe-0/0/0 exists.
  2. We should consider if there is anything else we should do in Netbox, to indicate port block speeds
    • We could create all 4 interfaces that make up a block when the first is created, leaving 3 'inactive', but with speed set.
      • We'd also want to change the code that detects a port with the same name but a different speed, so that it fails rather than renames.
      • Might help DC-Ops identify spare ports already configured for the desired speed
      • Either way we need to try to use blocks optimally, and not end up with numerous unused ports locked to any speed.
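
The adjacent-port check from point 1 could look roughly like the sketch below. It is only an illustration, not the provisioning script's actual code: the `block_conflicts` helper and the prefix-to-speed table are hypothetical, and it assumes blocks are aligned so that ports 0-3, 4-7, etc. each form one block.

```python
import re

# Assumed mapping from Junos interface prefix to speed in Gbps.
PREFIX_SPEED = {"ge": 1, "xe": 10, "et": 25}
BLOCK_SIZE = 4
IFACE_RE = re.compile(r"^(ge|xe|et)-(\d+)/(\d+)/(\d+)$")


def parse(name):
    """Return (speed_gbps, fpc, pic, port) for a name such as 'xe-0/0/1'."""
    match = IFACE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised interface name: {name}")
    prefix, fpc, pic, port = match.groups()
    return PREFIX_SPEED[prefix], int(fpc), int(pic), int(port)


def block_conflicts(new_name, existing_names):
    """List existing interfaces in the same block of 4 whose speed differs."""
    speed, fpc, pic, port = parse(new_name)
    block = port // BLOCK_SIZE
    conflicts = []
    for name in existing_names:
        o_speed, o_fpc, o_pic, o_port = parse(name)
        same_block = (o_fpc, o_pic, o_port // BLOCK_SIZE) == (fpc, pic, block)
        if same_block and o_speed != speed:
            conflicts.append(name)
    return conflicts


# ge-0/0/1 must be rejected because xe-0/0/0 already fixes the block at 10G.
print(block_conflicts("ge-0/0/1", ["xe-0/0/0"]))
```

An empty list means the new interface is compatible with everything already defined in its block.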

Just some ideas anyway; there are lots of options for how we deal with this. I'm creating this ticket to track progress and facilitate discussion on how to handle the problem.

Event Timeline

cmooney triaged this task as Medium priority.Mar 10 2022, 4:24 PM
cmooney created this task.
Restricted Application added a subscriber: Aklapper.

Change 769729 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/software/homer/deploy@master] Modify wmf-netbox plugin to provide QFX5120-48Y port block speeds

https://gerrit.wikimedia.org/r/769729

Agreed!

i.e. our interface automation should check the adjacent ports, and not allow ge-0/0/1 to be created if xe-0/0/0 exists.

The other way around as well, to prevent 1G ports from being spread across blocks:
For example: if ge-0/0/0 exists, and ge-0/0/3 is unused, automation should refuse to create ge-0/0/4 and instruct DCops to use ge-0/0/3 instead.
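
One possible shape for that "use the spare port instead" check is sketched below. The `suggest_port` function and its `used` argument (a mapping of in-use port number to speed in Gbps) are hypothetical, and the sketch assumes blocks cover ports 0-3, 4-7 and so on. It does not check whether the requested port itself is free or conflicts with its own block.

```python
BLOCK_SIZE = 4


def suggest_port(requested_port, speed, used):
    """Return the port that should be used for a new connection.

    If the requested port sits in a block already running at `speed`, it is
    returned unchanged. Otherwise, the lowest free port in an existing block
    of that speed is returned, if there is one, so a fresh block is only
    opened when no spare port exists.
    """
    requested_block = requested_port // BLOCK_SIZE
    matching_blocks = sorted(
        {port // BLOCK_SIZE for port, s in used.items() if s == speed}
    )
    if requested_block in matching_blocks:
        return requested_port
    for block in matching_blocks:
        first = block * BLOCK_SIZE
        for port in range(first, first + BLOCK_SIZE):
            if port not in used:
                return port
    return requested_port


# Ports 0-2 are in use at 1G, so a request for port 4 is redirected to port 3.
print(suggest_port(4, 1, {0: 1, 1: 1, 2: 1}))
```

When the returned port differs from the requested one, the automation would refuse the request and tell DC-Ops which port to use.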

We should probably write a Wikitech doc as well.

So I've been able to check the options here on the QFX5120 platform.

It is not possible to mix 10G and 25G SFP modules in the same block of 4 ports.

You can either have this (25G and 10G ports were connected back-to-back for the purpose of the test):

root@lsw1-e4-eqiad> show configuration chassis fpc 0 pic 0 port 0 | display set
set chassis fpc 0 pic 0 port 0 speed 25g

{master:0}
root@lsw1-e4-eqiad> show interfaces descriptions | match port 
et-0/0/0        up    up   25G Port 1
xe-0/0/1                   10G Port 1
xe-0/0/43       up    down 10G Port 2
et-0/0/47       up    up   25G Port 2

Or this:

root@lsw1-e4-eqiad> show configuration chassis fpc 0 pic 0 port 0 | display set

{master:0}
root@lsw1-e4-eqiad> show interfaces descriptions | match port               
et-0/0/0                   25G Port 1
xe-0/0/1        up    up   10G Port 1
xe-0/0/43       up    up   10G Port 2
et-0/0/47       up    down 25G Port 2

I guess the learnings are as follows:

  • Both 1G and 25G ports can be configured with the same configuration commands (no need to use the "port-range X X channel-speed" type commands)
  • 10G speed is achieved by removing any configuration on the port for 1G or 25G
  • That syntax is also the one used on all our MX devices (both MX204 & MX480), so using it everywhere gives us the greatest potential to combine templates.

As such I would propose to go ahead with CR 769729 as originally planned, to set each block of 4 ports to 1/10/25G as required on the QFX devices, using the "port X speed Y" syntax.

Actually I should clarify: it *may* be possible to use the channel-speed syntax to configure the switch in blocks of 2, as it allows you to do this:

root@cloudsw1-e4-eqiad> show configuration chassis fpc 0 | display set      
set chassis fpc 0 pic 0 port-range 0 1 channel-speed 25g
set chassis fpc 0 pic 0 port-range 2 3 channel-speed 10g

It does not allow individual-level port configuration in this way, however:

root@cloudsw1-e4-eqiad# show | compare 
[edit chassis fpc 0 pic 0]
+ port-range 0 0 {
+     channel-speed 25g;
+ }
+ port-range 1 1 {
+     channel-speed 10g;
+ }

{master:0}[edit chassis fpc 0 pic 0]
root@cloudsw1-e4-eqiad# commit check 
[edit chassis fpc 0 pic 0 port-range 0 0 channel-speed]
  'channel-speed 25g'
    Port range high (0) should be greater than port range low (0) for FPC:0, PIC:0
error: configuration check-out failed

I have my doubts as to whether the "block of 2" syntax will actually work, but to confirm I'll ask John to move the cables around to test tomorrow.

So to confirm, the configuration detailed above does not work:

cmooney@cloudsw1-e4-eqiad> show configuration chassis | display set 
set chassis fpc 0 pic 0 port-range 0 1 channel-speed 25g
set chassis fpc 0 pic 0 port-range 2 3 channel-speed 10g
set chassis fpc 0 pic 0 port-range 40 43 channel-speed 10g
set chassis fpc 0 pic 0 port-range 44 47 channel-speed 25g

{master:0}
cmooney@cloudsw1-e4-eqiad> show interfaces descriptions | match Port 
et-0/0/0        up    up   25G Port 1
xe-0/0/2                   10G Port 1
xe-0/0/43       up    down 10G Port 2
et-0/0/47       up    up   25G Port 2

{master:0}
cmooney@cloudsw1-e4-eqiad> configure 
Entering configuration mode

{master:0}[edit]
cmooney@cloudsw1-e4-eqiad# delete chassis fpc 0 pic 0 port-range 0 1 channel-speed 25g 

{master:0}[edit]
cmooney@cloudsw1-e4-eqiad# commit 
configuration check succeeds
commit complete

{master:0}[edit]
cmooney@cloudsw1-e4-eqiad# exit 
Exiting configuration mode

{master:0}
cmooney@cloudsw1-e4-eqiad> show interfaces descriptions | match Port    
et-0/0/0                   25G Port 1
xe-0/0/2        up    up   10G Port 1
xe-0/0/43       up    up   10G Port 2
et-0/0/47       up    down 25G Port 2

{master:0}
cmooney@cloudsw1-e4-eqiad>

So I think we are stuck working in "blocks of 4", and thus the "port-range X channel-speed Y" syntax isn't required for any combination we'll have.

As such I think the logical thing to do is to use the "port X speed Y" config everywhere, and my suggestion would be something similar to what is in the CR above.

@ayounsi I think based on the above we should proceed with https://gerrit.wikimedia.org/r/c/operations/software/homer/deploy/+/769729

Port blocks on the switches are always configured in groups of 4; it is not possible to configure a mix of 10/25G within a single block of 4 using the "port-range" syntax. So I see no reason not to standardize on "port X speed Y", with config for the switches like this:

{% if netbox.device_plugin.port_block_speeds -%}
fpc 0 {
    replace: pic 0 {
        {% for block, speed in (netbox.device_plugin.port_block_speeds.items() | sort()) if speed != 10 %}
        port {{ block }} {
            speed {{ speed }}g;
        }
        {% endfor -%}
    }
}
{% endif -%}

That also matches the syntax used on MX480 linecards, although the logic behind it is different (no blocks of 4 etc.). But it is at least consistent in terms of the commands used.
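
To make the template's effect concrete, here is a small sketch of the equivalent "set" commands for a given set of block speeds. The `port_speed_commands` function is hypothetical, and it assumes that `port_block_speeds` maps the port number used in the "port X" statement for each block to its speed in Gbps; the plugin's real data structure may differ. As in the template, 10G blocks produce no statement because 10G is what the ports run at when nothing is configured.

```python
def port_speed_commands(port_block_speeds, fpc=0, pic=0):
    """Build 'set chassis ... port X speed Yg' lines, skipping 10G blocks."""
    commands = []
    for port, speed in sorted(port_block_speeds.items()):
        if speed == 10:
            # 10G needs no chassis config, so nothing is emitted.
            continue
        commands.append(
            f"set chassis fpc {fpc} pic {pic} port {port} speed {speed}g"
        )
    return commands


for line in port_speed_commands({44: 25, 0: 25, 4: 10, 8: 1}):
    print(line)
```

Because the template uses "replace:" on the pic stanza, a block that moves back to 10G simply disappears from the rendered config instead of needing an explicit delete.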

Anxious to get this done so DC-Ops can deploy to E/F without having to involve us for manual changes.

Change 811314 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/software/netbox-extras@master] Add test in Netbox network report for port-block speeds on QFX5120

https://gerrit.wikimedia.org/r/811314

Change 811314 merged by jenkins-bot:

[operations/software/netbox-extras@master] Add test in Netbox network report for port-block speeds on QFX5120

https://gerrit.wikimedia.org/r/811314

Change 812376 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/software/netbox-extras@master] Add function to int_automation to validate QFX5120 port blocks

https://gerrit.wikimedia.org/r/812376

Change 840105 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/homer/public@master] Add section for PIC config of QFX5120-48Y port block speeds

https://gerrit.wikimedia.org/r/840105

Change 769729 merged by Cathal Mooney:

[operations/software/homer/deploy@master] Modify wmf-netbox plugin to provide QFX5120-48Y port block speeds

https://gerrit.wikimedia.org/r/769729

Change 840105 merged by jenkins-bot:

[operations/homer/public@master] Add section for PIC config of QFX5120-48Y port block speeds

https://gerrit.wikimedia.org/r/840105

Change 859576 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/puppet@production] Modify Homer config to ignore port speed warnings

https://gerrit.wikimedia.org/r/859576

Change 859576 merged by Cathal Mooney:

[operations/puppet@production] Modify Homer config to ignore port speed warnings

https://gerrit.wikimedia.org/r/859576

Change 812376 abandoned by Cathal Mooney:

[operations/software/netbox-extras@master] Add function to int_automation to validate QFX5120 port blocks

Reason:

Will create as a custom validator instead

https://gerrit.wikimedia.org/r/812376

Change 930264 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/software/netbox-extras@master] Validate port block speed combo in server provision script for QFX5120

https://gerrit.wikimedia.org/r/930264

Change 944240 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/software/netbox-extras@master] Do not compare speed of disabled interfaces when validating blocks

https://gerrit.wikimedia.org/r/944240

Change 944240 merged by jenkins-bot:

[operations/software/netbox-extras@master] Do not compare speed of disabled interfaces when validating blocks

https://gerrit.wikimedia.org/r/944240

Change 985113 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/software/netbox-extras@master] Validators: enforce Trident3 port block consistency

https://gerrit.wikimedia.org/r/985113