Configure QoS marking and policy across network
Closed, Resolved · Public

Description

Creating this task to track progress on bringing the QoS configuration live across the network.

The broad plan should be as follows:

  • Modify the CR border-in firewall filter to mark all external traffic with the DSCP default bits
    • This protects against external traffic being incorrectly mapped into a high-priority class later
  • Merge the puppet patch to add a ferm rule setting the DSCP bits to default on all outbound server traffic
    • Same protection, but for traffic entering the network from our own servers
  • Merge the Homer patch to enable CoS classifiers and schedulers on specific devices
    • Control which devices get the config with a new global var, kept temporarily until it's enabled on all devices
  • Apply the config to certain devices (suggest ulsfo CRs and switches) and monitor
  • Remove the var controlling which devices get the policy applied, and push to all network devices
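
As a rough illustration of what the two marking steps do to a packet, the sketch below resets the 6-bit DSCP field in the IPv4 ToS / IPv6 Traffic Class byte to the default value (0) and, by assumption, leaves the 2 ECN bits untouched. It also models the temporary rollout variable as a plain set of device names. The names `remark_dscp`, `qos_enabled` and `QOS_ENABLED_DEVICES` are hypothetical and do not come from the Homer or puppet patches; the device names are only examples.

```python
from typing import Optional, Set

DSCP_DEFAULT = 0   # default / best-effort code point
ECN_MASK = 0b11    # low 2 bits of the ToS / Traffic Class byte


def remark_dscp(tos: int, dscp: int = DSCP_DEFAULT) -> int:
    """Return the ToS byte with its DSCP field replaced and ECN preserved."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("ToS byte must be in 0..255")
    if not 0 <= dscp <= 0x3F:
        raise ValueError("DSCP must be in 0..63")
    # DSCP occupies the upper 6 bits, ECN the lower 2.
    return (dscp << 2) | (tos & ECN_MASK)


def dscp_of(tos: int) -> int:
    """Extract the DSCP value from a ToS / Traffic Class byte."""
    return tos >> 2


# Hypothetical stand-in for the temporary global var: only listed devices
# get the QoS config. Passing None models the final state, var removed.
QOS_ENABLED_DEVICES: Set[str] = {"asw2-ulsfo", "lsw1-c6-codfw"}


def qos_enabled(device: str,
                enabled: Optional[Set[str]] = QOS_ENABLED_DEVICES) -> bool:
    """Decide whether a device should receive the QoS config."""
    return enabled is None or device in enabled


if __name__ == "__main__":
    # An external packet arriving marked EF (DSCP 46) with ECN bits 01.
    tos_in = (46 << 2) | 0b01
    tos_out = remark_dscp(tos_in)
    print(dscp_of(tos_in), "->", dscp_of(tos_out))
    print(qos_enabled("cr1-codfw"), qos_enabled("cr1-codfw", None))
```

Keeping the gate as a single lookup means the last step of the plan is just deleting the variable, at which point every device qualifies.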

With that done, all network devices will have the QoS configuration in place and all traffic will be in our "normal priority" forwarding class. The one exception is packets the routers create themselves, which will end up in the "mgmt_control" forwarding class (FC); this is the same as currently, except that the class has been renamed from "network-control". Ultimately this should represent no change from current behaviour: two additional forwarding classes (high and low) are defined, but no traffic is mapped to them.
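
To make the expected end state concrete, here is a minimal sketch of a DSCP-to-forwarding-class classifier. The four class names follow the description above, but the DSCP values assigned to them (48, 46, 8) are assumptions chosen for illustration, not the values in the actual classifier config. `classify` and `class_counts` are hypothetical helper names.

```python
from collections import Counter
from typing import Iterable

FORWARDING_CLASSES = ("normal", "mgmt_control", "high", "low")
DEFAULT_CLASS = "normal"

# Assumed code points, for illustration only.
CLASSIFIER = {
    48: "mgmt_control",  # assumption: router-originated control traffic
    46: "high",          # assumption
    8: "low",            # assumption
}


def classify(dscp: int) -> str:
    """Map a DSCP value to a forwarding class; unknown values are normal."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    return CLASSIFIER.get(dscp, DEFAULT_CLASS)


def class_counts(dscps: Iterable[int]) -> Counter:
    """Count packets per forwarding class, including empty classes."""
    counts = Counter({fc: 0 for fc in FORWARDING_CLASSES})
    counts.update(classify(d) for d in dscps)
    return counts


if __name__ == "__main__":
    # After re-marking, transit traffic carries DSCP 0; only the routers'
    # own packets (assumed to be marked 48 here) land elsewhere.
    observed = [0, 0, 0, 48]
    for fc, n in class_counts(observed).items():
        print(f"{fc}: {n}")
```

With every server and external packet reset to DSCP 0, the high and low classes stay empty until traffic is deliberately marked into them.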

Once that is done, we can begin to use the new puppet functions to map whatever traffic we wish into any of the 3 non-default classes (mgmt/high/low). Decisions on what traffic should be mapped to which class are a separate issue that we can discuss on a case-by-case basis once this task is complete.

Event Timeline

cmooney triaged this task as Medium priority. Jun 19 2023, 12:17 PM
cmooney created this task.

Change 931262 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/homer/public@master] Update border-in firewall filter to set DSCP bits to DE

https://gerrit.wikimedia.org/r/931262

Change 931263 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/puppet@production] Add ferm rule to mark all server traffic as DSCP 0

https://gerrit.wikimedia.org/r/931263

Change 931691 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/homer/public@master] Juniper class-of-service config and updated border-in filter for QoS

https://gerrit.wikimedia.org/r/931691

Change 931263 abandoned by Cathal Mooney:

[operations/puppet@production] Add ferm rule to mark all server traffic as DSCP 0

Reason:

reworking

https://gerrit.wikimedia.org/r/931263

Change 1007437 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/puppet@production] WIP: Add DSCP marking options to current firewall classes

https://gerrit.wikimedia.org/r/1007437

Change #1049554 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/software/homer/deploy@master] Add function to wmf-netbox plugin to provide QoS config data

https://gerrit.wikimedia.org/r/1049554

Change #1049566 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/homer/public@master] Re-mark external traffic to DSCP BE on CRs, and rename fwd classes

https://gerrit.wikimedia.org/r/1049566

Change #931691 abandoned by Cathal Mooney:

[operations/homer/public@master] Juniper class-of-service config and updated border-in filter for QoS

Reason:

https://gerrit.wikimedia.org/r/931691

Change #1049566 merged by jenkins-bot:

[operations/homer/public@master] Re-mark external traffic to DSCP BE on CRs, and rename fwd classes

https://gerrit.wikimedia.org/r/1049566

Change #1049917 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/homer/public@master] Add class-of-service scheduler and classifiers plus var to control

https://gerrit.wikimedia.org/r/1049917

Change #1050071 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/homer/public@master] Do add QoS configuration for fasw switches

https://gerrit.wikimedia.org/r/1050071

Change #1050071 merged by jenkins-bot:

[operations/homer/public@master] Do add QoS configuration for fasw switches

https://gerrit.wikimedia.org/r/1050071

Mentioned in SAL (#wikimedia-sre) [2024-07-03T09:06:51Z] <topranks> merge host firewall changes to set default DSCP marking (T339850)

Change #1007437 merged by Cathal Mooney:

[operations/puppet@production] Add DSCP marking options to current firewall classes

https://gerrit.wikimedia.org/r/1007437

Change #1049917 merged by jenkins-bot:

[operations/homer/public@master] Add class-of-service scheduler and classifiers plus var to control

https://gerrit.wikimedia.org/r/1049917

Change #1052167 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/homer/public@master] Add class-of-service scheduler and classifiers plus var to control

https://gerrit.wikimedia.org/r/1052167

Change #1049554 merged by Cathal Mooney:

[operations/software/homer/deploy@master] Add function to wmf-netbox plugin to provide QoS config data

https://gerrit.wikimedia.org/r/1049554

Change #1052167 merged by jenkins-bot:

[operations/homer/public@master] Add class-of-service scheduler and classifiers plus var to control

https://gerrit.wikimedia.org/r/1052167

Mentioned in SAL (#wikimedia-operations) [2024-08-28T13:04:33Z] <topranks> rolling out config additions of qos schedulers and policers to all network devices T339850

Mentioned in SAL (#wikimedia-operations) [2024-08-28T16:51:03Z] <topranks> add qos config to management firewalls T339850

Change #1068050 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/homer/public@master] Apply qos interface config in ulsfo and on lsw1-c6-codfw

https://gerrit.wikimedia.org/r/1068050

Change #1068050 merged by jenkins-bot:

[operations/homer/public@master] Apply qos interface config in ulsfo and on lsw1-c6-codfw

https://gerrit.wikimedia.org/r/1068050

Mentioned in SAL (#wikimedia-operations) [2024-08-29T09:24:55Z] <topranks> apply qos classifers and scedulers to interfaces on asw2-ulsfo T339850

Mentioned in SAL (#wikimedia-operations) [2024-08-29T09:58:41Z] <topranks> apply qos classifers and scedulers to interfaces on ulsfo CRs T339850

Mentioned in SAL (#wikimedia-operations) [2024-08-29T11:41:44Z] <topranks> modify qos configuration for asw2-ulsfo xe-2/0/18 (ganeti4006) to add traffic-control-profile T339850

Mentioned in SAL (#wikimedia-operations) [2024-08-29T13:50:58Z] <topranks> add qos interface schedulers on lsw1-d4-codfw T339850

Mentioned in SAL (#wikimedia-operations) [2024-09-02T12:58:52Z] <topranks> apply qos classifers and schedulers to server interfaces on asw-d-codfw T339850

Mentioned in SAL (#wikimedia-operations) [2024-09-05T14:11:41Z] <topranks> add interface qos schedulers on cr1-codfw T339850

Change #1070983 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/homer/public@master] Enable qos scheduling on cr1-codfw interfaces

https://gerrit.wikimedia.org/r/1070983

Change #1070983 merged by jenkins-bot:

[operations/homer/public@master] Enable qos scheduling on cr1-codfw interfaces

https://gerrit.wikimedia.org/r/1070983

Change #1070989 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/homer/public@master] Missing 'k' on shaper rate

https://gerrit.wikimedia.org/r/1070989

Change #1070989 merged by jenkins-bot:

[operations/homer/public@master] Missing 'k' on shaper rate

https://gerrit.wikimedia.org/r/1070989

Mentioned in SAL (#wikimedia-operations) [2024-09-05T18:06:28Z] <topranks> add interface qos scheduler config to remaining CRs T339850

Mentioned in SAL (#wikimedia-operations) [2024-09-05T19:21:25Z] <topranks> add interface qos scheduler config to codfw switches T339850

Mentioned in SAL (#wikimedia-operations) [2024-09-05T20:41:26Z] <topranks> add interface qos scheduler config to eqiad switches T339850

Mentioned in SAL (#wikimedia-operations) [2024-09-05T21:15:02Z] <topranks> add interface qos scheduler config to remaining switches T339850

Change #1071045 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/homer/public@master] Remove variable controlling what devices have interface qos added

https://gerrit.wikimedia.org/r/1071045

Change #1071045 merged by jenkins-bot:

[operations/homer/public@master] Remove variable controlling what devices have interface qos added

https://gerrit.wikimedia.org/r/1071045

cmooney updated the task description.