ayounsi (Arzhel Younsi)
Staff Network SRE

Projects (10)

User Details

User Since
Apr 3 2017, 6:23 PM (338 w, 6 d)
Availability
Available
IRC Nick
xionox
LDAP User
Ayounsi
MediaWiki User
AYounsi (WMF) [ Global Accounts ]

Recent Activity

Wed, Sep 27

ayounsi moved T347323: Add 4x10G breakout cable to cr2-esams from Backlog to Watching on the netops board.
Wed, Sep 27, 2:12 PM · SRE, netops, Infrastructure-Foundations
ayounsi added a comment to T347054: Simplify maintenance of DNS/NTP hosts to reduce toil around reboots, reimages, and other work.

Thanks, I opened T347494: Remove static routes for anycast prefixes to get rid of them. You can use 10.3.0.2/32 for the NTP VIP.

Wed, Sep 27, 2:12 PM · Patch-For-Review, SRE, Traffic
ayounsi added a comment to T347494: Remove static routes for anycast prefixes.

Yeah, actually you can use 10.3.0.2/32 for NTP, I won't go through renumbering the syslog VIP.

Wed, Sep 27, 2:12 PM · SRE, Infrastructure-Foundations, netops
ayounsi claimed T347494: Remove static routes for anycast prefixes.
Wed, Sep 27, 2:05 PM · SRE, Infrastructure-Foundations, netops
ayounsi created T347494: Remove static routes for anycast prefixes.
Wed, Sep 27, 2:04 PM · SRE, Infrastructure-Foundations, netops
ayounsi added a comment to T347054: Simplify maintenance of DNS/NTP hosts to reduce toil around reboots, reimages, and other work.

Thanks, as this VIP won't be critical we can skip the static routes and only allocate 10.3.0.8/32.

Wed, Sep 27, 1:51 PM · Patch-For-Review, SRE, Traffic
ayounsi added a comment to T332395: Upgrade asw1-eqsin.

FYI, the mgmt_junos bug (also present on the fasw) might not be fixed by an upgrade, but maybe by the solution described in https://www.reddit.com/r/Juniper/comments/mvq8hf/comment/j7gd6hq/
set interface em0.0 family inet address 10.XXX.XXX.XXX/XX master-only

Wed, Sep 27, 1:38 PM · SRE, netops, Infrastructure-Foundations
ayounsi added a comment to T326322: Add per-output queue monitoring for Juniper network devices.

To keep it somewhere for later, on Dell SONiC it should be on the /openconfig-qos:qos/interfaces path.
Grouping it by source/interface_interface-id/pfc-priority_dot1p and displaying it by "events" gNMI returns this:

{
  "name": "interfaces-states",
  "timestamp": 1695808554638790732,
  "tags": {
    "interface_interface-id": "Ethernet9",
    "pfc-priority_dot1p": "7",
    "source": "lsw1-e8-eqiad.mgmt.eqiad.wmnet:8080",
    "subscription-name": "interfaces-states"
  },
  "values": {
    "/openconfig-qos:qos/interfaces/interface/pfc/pfc-priorities/pfc-priority/config/dot1p": 7,
    "/openconfig-qos:qos/interfaces/interface/pfc/pfc-priorities/pfc-priority/config/enable": false,
    "/openconfig-qos:qos/interfaces/interface/pfc/pfc-priorities/pfc-priority/dot1p": 7,
    "/openconfig-qos:qos/interfaces/interface/pfc/pfc-priorities/pfc-priority/state/dot1p": 7,
    "/openconfig-qos:qos/interfaces/interface/pfc/pfc-priorities/pfc-priority/state/enable": false,
    "/openconfig-qos:qos/interfaces/interface/pfc/pfc-priorities/pfc-priority/state/statistics/pause-frames-rx": 0,
    "/openconfig-qos:qos/interfaces/interface/pfc/pfc-priorities/pfc-priority/state/statistics/pause-frames-tx": 0
  }
}
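
As a rough illustration of how such an event could be consumed, here is a minimal Python sketch that pulls the grouping tags and the pause-frame counters out of one event. The helper name `pfc_counters` and the label names are assumptions made for this example, and the embedded event is a trimmed copy of the one above; this is not how the gnmic pipeline itself processes the data.

```python
import json

# Trimmed copy of the event shown above; only the statistics values are kept.
EVENT = """
{
  "name": "interfaces-states",
  "timestamp": 1695808554638790732,
  "tags": {
    "interface_interface-id": "Ethernet9",
    "pfc-priority_dot1p": "7",
    "source": "lsw1-e8-eqiad.mgmt.eqiad.wmnet:8080"
  },
  "values": {
    "/openconfig-qos:qos/interfaces/interface/pfc/pfc-priorities/pfc-priority/state/statistics/pause-frames-rx": 0,
    "/openconfig-qos:qos/interfaces/interface/pfc/pfc-priorities/pfc-priority/state/statistics/pause-frames-tx": 0
  }
}
"""

STATS_MARKER = "/state/statistics/"


def pfc_counters(event):
    """Return (labels, counters) for one event dict (hypothetical helper)."""
    tags = event["tags"]
    labels = {
        "device": tags["source"].split(":")[0],  # drop the gNMI port
        "interface": tags["interface_interface-id"],
        "priority": int(tags["pfc-priority_dot1p"]),
    }
    # Keep only the statistics leaves, keyed by the last path element.
    counters = {
        path.rsplit("/", 1)[1]: value
        for path, value in event["values"].items()
        if STATS_MARKER in path
    }
    return labels, counters


labels, counters = pfc_counters(json.loads(EVENT))
print(labels)
print(counters)
```

Keying the counters by the last path element keeps the result short, at the cost of assuming that leaf names are unique under the statistics container.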
Wed, Sep 27, 1:29 PM · Patch-For-Review, SRE, Infrastructure-Foundations, netops
ayounsi added a comment to T347411: Drive host network config from Netbox, and move away from ifupdown.

It's great to see momentum on this recurring pain point!

Wed, Sep 27, 1:13 PM · Infrastructure-Foundations, SRE
ayounsi updated the task description for T347461: Build and package gnmic.
Wed, Sep 27, 8:35 AM · Packaging, Infrastructure-Foundations
ayounsi added a subtask for T326322: Add per-output queue monitoring for Juniper network devices: T347461: Build and package gnmic.
Wed, Sep 27, 8:35 AM · Patch-For-Review, SRE, Infrastructure-Foundations, netops
ayounsi added a parent task for T347461: Build and package gnmic: T326322: Add per-output queue monitoring for Juniper network devices.
Wed, Sep 27, 8:34 AM · Packaging, Infrastructure-Foundations
ayounsi created T347461: Build and package gnmic.
Wed, Sep 27, 8:34 AM · Packaging, Infrastructure-Foundations
ayounsi moved T332395: Upgrade asw1-eqsin from Backlog to Next quarter on the netops board.
Wed, Sep 27, 8:09 AM · SRE, netops, Infrastructure-Foundations
ayounsi moved T346779: cr1-esams:fpc0 errors from Backlog to This quarter on the netops board.
Wed, Sep 27, 8:08 AM · SRE, Infrastructure-Foundations, netops

Tue, Sep 26

ayounsi moved T306649: Agree strategy for Kubernetes BGP peering to top-of-rack switches from Watching to Next quarter on the netops board.
Tue, Sep 26, 2:54 PM · Patch-For-Review, serviceops, Prod-Kubernetes, SRE, Infrastructure-Foundations, netops
ayounsi closed T213843: Juniper network device audit - all sites as Resolved.

I think we can close that one. @RobH did the audit afaik.

Tue, Sep 26, 2:47 PM · Infrastructure-Foundations, DC-Ops, netops, SRE
ayounsi added a comment to T347323: Add 4x10G breakout cable to cr2-esams.

That's a great idea! Opened {T347403}

Tue, Sep 26, 2:17 PM · SRE, netops, Infrastructure-Foundations
ayounsi renamed T347323: Add 4x10G breakout cable to cr2-esams from Move cr1-esams<->cr2-esams link to QSFP port to Add 4x10G breakout cable to cr2-esams.
Tue, Sep 26, 1:38 PM · SRE, netops, Infrastructure-Foundations
ayounsi added a comment to T347323: Add 4x10G breakout cable to cr2-esams.

Thanks, I remembered there was a reason but forgot what it was!

Tue, Sep 26, 1:37 PM · SRE, netops, Infrastructure-Foundations
ayounsi added a comment to T347375: Netbox device location information not available on the first Puppet run of a device.

My understanding is that you're one step ahead of Prod here as you're deriving host networking based on Netbox data (e.g. rack from vlan, etc.) so you might catch new issues.
We should look at provisioning from beginning to end so we can pool our efforts here.

Tue, Sep 26, 12:15 PM · Infrastructure-Foundations, netbox, Cloud-VPS, cloud-services-team
ayounsi added a comment to T347312: Expose ethtool metrics to Prometheus.

We shouldn't alert on NIC saturation (or related counters) in the current state of things (unless we can redirect the alerts to the relevant teams). But we need to alert on errors caused by faulty NICs or faulty cables (anything L1) like we do for network devices.

Tue, Sep 26, 7:30 AM · Observability-Alerting, observability

Mon, Sep 25

ayounsi created T347323: Add 4x10G breakout cable to cr2-esams.
Mon, Sep 25, 4:30 PM · SRE, netops, Infrastructure-Foundations
ayounsi triaged T347312: Expose ethtool metrics to Prometheus as Low priority.
Mon, Sep 25, 3:47 PM · Observability-Alerting, observability
ayounsi edited P52598 (An Untitled Masterwork).
Mon, Sep 25, 10:22 AM
ayounsi created P52598 (An Untitled Masterwork).
Mon, Sep 25, 10:22 AM
ayounsi added a comment to T334916: Juniper RA receive bug CVE-2023-28981.

This might need to be rolled back the day we start doing BGP unnumbered between spine and leaf as it seems to rely on it: https://www.theasciiconstruct.com/post/junos-bgp-and-bgp-unnumbered/#ipv6-configuration-for-bgp-unnumbered

Mon, Sep 25, 9:01 AM · Infrastructure-Foundations, netops, SRE
ayounsi closed T334916: Juniper RA receive bug CVE-2023-28981 as Resolved.

Deployed

Mon, Sep 25, 8:11 AM · Infrastructure-Foundations, netops, SRE

Fri, Sep 22

ayounsi updated the task description for T345602: Repurpose three decom servers as temporary ganeti-test1001/1002 and ganeti-test2004.
Fri, Sep 22, 12:37 PM · Infrastructure-Foundations
ayounsi added a comment to T345867: Decommission furud.
Fri, Sep 22, 12:35 PM · decommission-hardware, SRE, ops-codfw
ayounsi renamed T345602: Repurpose three decom servers as temporary ganeti-test1001/1002 and ganeti-test2004 from Repurpose two decom servers as temporary ganeti-test1001/1002 to Repurpose three decom servers as temporary ganeti-test1001/1002 and ganeti-test2004.
Fri, Sep 22, 12:35 PM · Infrastructure-Foundations
ayounsi added a comment to T347148: Determine how to monitor services in cloud-private / cloudlb.

Personally I've no objection to the first option, just allowing it. But as you mention the policy and overall shape of things in terms of the "cross realm guidelines" needs to be considered. @ayounsi have you any thoughts here?

Prometheus monitors endpoints outside of WMF's network through the proxies, see T303803: Prometheus use of Squid proxies. Would that work for that usecase?

Fri, Sep 22, 12:26 PM · observability, cloud-services-team, Cloud-VPS

Thu, Sep 21

ayounsi added a comment to T306238: Netbox Juniper report.

In the set up the team asked for a couple more items. Can you also share the “aud” (audience) & cid (clientId) values from the ID token?

Thu, Sep 21, 2:21 PM · SRE, netops, Infrastructure-Foundations, netbox
ayounsi added a comment to T347054: Simplify maintenance of DNS/NTP hosts to reduce toil around reboots, reimages, and other work.

NTP automation:

Even if Debian Installer supports a comma-separated list of NTP servers (to be tested?), some special appliances (like PDUs) only support 1 or 2 NTP servers.
So while it's best to have many servers configured (see for example https://labs.ripe.net/author/christer-weinigel/best-practices-for-connecting-to-ntp-servers/ ) and that's what Puppet does well, we need to have a "catch-all" option. For day-to-day maintenance there is no need to remove an NTP server from the Puppet-managed "timesyncd.conf" file.

Thu, Sep 21, 2:19 PM · Patch-For-Review, SRE, Traffic
ayounsi added a comment to T344164: 1 VMs requested for stewards.

FYI, the underlying IRC library seems to support proxies https://github.com/aatxe/irc#configuring-irc-clients

Thu, Sep 21, 12:42 PM · collaboration-services, Infrastructure-Foundations, Stewards-and-global-tools, SRE, vm-requests
ayounsi added a comment to T344164: 1 VMs requested for stewards.

Indeed and hosts on public IPs have a much larger attack surface so they should be a last resort option. The ircbot might need to be audited too if it connects to servers outside of WMF.

Thu, Sep 21, 12:26 PM · collaboration-services, Infrastructure-Foundations, Stewards-and-global-tools, SRE, vm-requests
ayounsi added a comment to T318783: cr2-esams:FPC0 Parity error.

@cmooney I think this can be closed?

Thu, Sep 21, 12:18 PM · SRE, Infrastructure-Foundations, netops
ayounsi closed T338028: Users management on SONiC, a subtask of T320638: Add Dell switches support to Homer/Cookbooks, as Resolved.
Thu, Sep 21, 12:17 PM · Patch-For-Review, SRE, Infrastructure-Foundations, netops
ayounsi closed T338028: Users management on SONiC as Resolved.

This is done for now, more improvements to come from Dell, tracked in T342673.

Thu, Sep 21, 12:16 PM · SRE, Infrastructure-Foundations, netops
ayounsi closed T346759: Investigate and deploy 'max-repeaters = 20' to all librenms devices as Declined.

Thanks, I spent a bit more time on that.

Thu, Sep 21, 11:57 AM · SRE, Infrastructure-Foundations, Observability-Metrics, netops
ayounsi claimed T346759: Investigate and deploy 'max-repeaters = 20' to all librenms devices.
Thu, Sep 21, 11:57 AM · SRE, Infrastructure-Foundations, Observability-Metrics, netops
lmata awarded T344136: Upgrade LibreNMS to 23.7.0 or higher a Love token.
Thu, Sep 21, 2:23 AM · Observability-Metrics, SRE Observability (FY2023/2024-Q1)

Tue, Sep 19

ayounsi triaged T346779: cr1-esams:fpc0 errors as High priority.
Tue, Sep 19, 2:43 PM · SRE, Infrastructure-Foundations, netops
ayounsi reopened T341546: ganeti2014: broken RAM / mainboard as "Open".

This triggered netbox report alert ganeti2014 (WMF6747) mismatched serials: XXXXX (netbox) != YYYYY (puppetdb)
https://netbox.wikimedia.org/extras/reports/puppetdb.PhysicalHosts/
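
For readers unfamiliar with this kind of check, the comparison behind such an alert can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code of the actual Netbox report: the function name, the input dictionaries and the host `example1001` are hypothetical, and the serials are the same placeholders used in the alert text above.

```python
def mismatched_serials(netbox, puppetdb):
    """Compare serials from two inventories (hypothetical helper).

    netbox:   {hostname: (asset_tag, serial)}
    puppetdb: {hostname: serial}
    Returns one message per host whose serials differ.
    """
    messages = []
    for host, (asset_tag, nb_serial) in sorted(netbox.items()):
        pdb_serial = puppetdb.get(host)
        # Hosts missing from PuppetDB are skipped here; a real report
        # would likely flag them separately.
        if pdb_serial is not None and pdb_serial != nb_serial:
            messages.append(
                f"{host} ({asset_tag}) mismatched serials: "
                f"{nb_serial} (netbox) != {pdb_serial} (puppetdb)"
            )
    return messages


# Placeholder data: after a mainboard swap, PuppetDB reports the new
# serial while Netbox still holds the old one.
netbox = {"ganeti2014": ("WMF6747", "XXXXX"), "example1001": ("WMF0000", "AAAAA")}
puppetdb = {"ganeti2014": "YYYYY", "example1001": "AAAAA"}

report = mismatched_serials(netbox, puppetdb)
for line in report:
    print(line)
```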

Tue, Sep 19, 12:02 PM · SRE, ops-codfw
ayounsi closed T331519: Should we have two versions of the Juniper QFX5120-48Y in Netbox? as Resolved.

All good now.

Tue, Sep 19, 11:53 AM · Infrastructure-Foundations, netbox
ayounsi committed rOSNE697ac6775190: LibreNMS report: remove MODEL_EXCLUDES filter (authored by ayounsi).
LibreNMS report: remove MODEL_EXCLUDES filter
Tue, Sep 19, 11:49 AM
ayounsi committed rOSNE41482a407642: LibreNMS report: add equivalent model strings (authored by ayounsi).
LibreNMS report: add equivalent model strings
Tue, Sep 19, 11:48 AM
ayounsi committed rOSNE74b63d6a1a64: LibreNMS report: use black formating (authored by ayounsi).
LibreNMS report: use black formating
Tue, Sep 19, 11:48 AM
ayounsi reopened T331519: Should we have two versions of the Juniper QFX5120-48Y in Netbox? as "Open".

Re-opening as the LibreNMS report needs to be updated to handle those discrepancies.

Tue, Sep 19, 6:54 AM · Infrastructure-Foundations, netbox
ayounsi closed T331519: Should we have two versions of the Juniper QFX5120-48Y in Netbox? as Resolved.

The support contract is different on the old vs. new licensing, so we need to be able to verify that the proper support is applied to our switches.

Tue, Sep 19, 6:15 AM · Infrastructure-Foundations, netbox

Mon, Sep 18

ayounsi added a comment to T346606: cr*-eqsin long poll times from librenms.

We had a quick chat on IRC.

Mon, Sep 18, 1:48 PM · SRE, Infrastructure-Foundations, netops, Observability-Metrics
ayounsi added a comment to T346600: Archive/delete https://gerrit.wikimedia.org/r/admin/repos/operations/software/netbox-reports.

The checklist is heavily oriented towards extensions and skins.

Mon, Sep 18, 10:21 AM · Projects-Cleanup, Infrastructure-Foundations
fgiunchedi awarded T346319: Some device mempool graphs can't be rendered in librenms a Like token.
Mon, Sep 18, 10:13 AM · Observability-Metrics, SRE Observability (FY2023/2024-Q1)
ayounsi closed T346319: Some device mempool graphs can't be rendered in librenms, a subtask of T344136: Upgrade LibreNMS to 23.7.0 or higher, as Resolved.
Mon, Sep 18, 10:12 AM · Observability-Metrics, SRE Observability (FY2023/2024-Q1)
ayounsi closed T346319: Some device mempool graphs can't be rendered in librenms as Resolved.

Yup, that was it.

Mon, Sep 18, 10:12 AM · Observability-Metrics, SRE Observability (FY2023/2024-Q1)
ayounsi added a comment to T346319: Some device mempool graphs can't be rendered in librenms.

Quick look makes me think that's related to devices that have been deleted.

Mon, Sep 18, 10:03 AM · Observability-Metrics, SRE Observability (FY2023/2024-Q1)
ayounsi added a comment to T346606: cr*-eqsin long poll times from librenms.

Probably a combination of latency (distance between netmon1003 and eqsin) with an increasing number of BGP peers.
Based on https://librenms.wikimedia.org/graphs/type=device_poller_modules_perf/device=159/from=1694940900/ most time is spent on BGP peers. That is true for all routers (vs. ports for switches), which makes sense.

Mon, Sep 18, 9:20 AM · SRE, Infrastructure-Foundations, netops, Observability-Metrics
ayounsi triaged T346600: Archive/delete https://gerrit.wikimedia.org/r/admin/repos/operations/software/netbox-reports as Low priority.
Mon, Sep 18, 7:08 AM · Projects-Cleanup, Infrastructure-Foundations

Fri, Sep 15

ayounsi closed Restricted Task, a subtask of T303242: ripe-atlas-esams down, as Resolved.
Fri, Sep 15, 9:20 AM · SRE, DC-Ops, ops-esams
ayounsi added a comment to T306238: Netbox Juniper report.

@jbond from Juniper:

Fri, Sep 15, 8:47 AM · SRE, netops, Infrastructure-Foundations, netbox
ayounsi closed T346317: Alert "access port speed less 100mbit" and librenms upgrade as Resolved.

Thanks, that's related to T336511: Access port speed <= 100Mbps False positives and I just removed the alert.

Fri, Sep 15, 8:44 AM · SRE, Infrastructure-Foundations, netops, Observability-Metrics, SRE Observability (FY2023/2024-Q1)
ayounsi closed T346317: Alert "access port speed less 100mbit" and librenms upgrade, a subtask of T344136: Upgrade LibreNMS to 23.7.0 or higher, as Resolved.
Fri, Sep 15, 8:44 AM · Observability-Metrics, SRE Observability (FY2023/2024-Q1)
ayounsi closed T336511: Access port speed <= 100Mbps False positives as Resolved.

I removed the alert as it was being problematic in T346317: Alert "access port speed less 100mbit" and librenms upgrade as well.

Fri, Sep 15, 8:43 AM · SRE, netops, DC-Ops, Infrastructure-Foundations
ayounsi added a comment to T339852: Configure ECMP hashing function on QFX5120 platform.

Good point! That was done before the VXLAN deployment to have more predictability on the anycast traffic to the end servers.

Fri, Sep 15, 8:34 AM · netops, Infrastructure-Foundations, SRE
ayounsi created T346421: Renumber esams-eqiad GTT link.
Fri, Sep 15, 7:37 AM · SRE, Infrastructure-Foundations, netops

Thu, Sep 14

ayounsi removed a project from T345738: etcd in codfw burned all latency SLO error budget: netops.
Thu, Sep 14, 6:18 AM · Patch-For-Review, SRE, Infrastructure-Foundations, serviceops
ayounsi added a comment to T252890: scrape ripe atlas data for a few anchors at other large networks.

@CDanis Is that still needed now that we have NEL?

Thu, Sep 14, 6:18 AM · Infrastructure-Foundations, netops, SRE

Wed, Sep 13

ayounsi created P52489 (An Untitled Masterwork).
Wed, Sep 13, 6:47 AM

Tue, Sep 12

ayounsi added a comment to T313634: Survey the third-party library market for UA policy compliance.

FYI:

Tue, Sep 12, 9:05 AM · SRE
ayounsi committed rOSHOf0e43f5a0809: Junos: Add more info on commit errors (authored by ayounsi).
Junos: Add more info on commit errors
Tue, Sep 12, 8:10 AM
ayounsi added a comment to T344136: Upgrade LibreNMS to 23.7.0 or higher.

@andrea.denisse. is there a task for this blocking issue? As more and more people are going to upgrade to bookworm, thanks for finding those bugs.

Tue, Sep 12, 6:15 AM · Observability-Metrics, SRE Observability (FY2023/2024-Q1)

Mon, Sep 11

ayounsi added a comment to T342502: Inbound interface errors.

Unfortunately the errors are back. Even though there aren't many, it's still better to fix the issue.

Mon, Sep 11, 6:43 AM · ops-eqiad

Fri, Sep 8

ayounsi added a parent task for T345809: Do we need ping offload servers at all POPs?: T345743: reprovision ping VM in esams.
Fri, Sep 8, 7:32 AM · Infrastructure-Foundations, Traffic, netops, SRE
ayounsi added a subtask for T345743: reprovision ping VM in esams: T345809: Do we need ping offload servers at all POPs?.
Fri, Sep 8, 7:32 AM · SRE, Traffic

Thu, Sep 7

ayounsi added a comment to T340444: Markdown bug in Netbox-next.

Probably not, probably, probably not.

Thu, Sep 7, 3:22 PM · Infrastructure-Foundations, netbox
ayounsi added a comment to T345738: etcd in codfw burned all latency SLO error budget.

Thanks, we had a quick chat on IRC about that and indeed that's the current conclusion. The extra details you provided (and fix suggestions) are welcome too!

Thu, Sep 7, 7:46 AM · Patch-For-Review, SRE, Infrastructure-Foundations, serviceops
ayounsi added a comment to T336485: Setup zero touch provisioning (ZTP) for network devices.

Please open a new task for that.

Thu, Sep 7, 7:02 AM · Patch-For-Review, SRE, Infrastructure-Foundations, netops, SRE-tools

Wed, Sep 6

ayounsi added a comment to T345710: Set idle-timeout for Juniper logins.

I thought that was not possible but it got introduced recently (in 16.1).

Wed, Sep 6, 1:27 PM · netops, Infrastructure-Foundations, SRE
ayounsi added a parent task for T345602: Repurpose three decom servers as temporary ganeti-test1001/1002 and ganeti-test2004: T300152: Investigate Ganeti in routed mode.
Wed, Sep 6, 7:20 AM · Infrastructure-Foundations
ayounsi added a subtask for T300152: Investigate Ganeti in routed mode: T345602: Repurpose three decom servers as temporary ganeti-test1001/1002 and ganeti-test2004.
Wed, Sep 6, 7:20 AM · SRE, netops, Ganeti, Infrastructure-Foundations
ayounsi added a comment to T344136: Upgrade LibreNMS to 23.7.0 or higher.

Thanks for the update, it all makes sense to me!

Wed, Sep 6, 7:01 AM · Observability-Metrics, SRE Observability (FY2023/2024-Q1)

Tue, Sep 5

ayounsi committed rLPRIa3ecf8444cd0: Add mock TLS key for ganeti-test01.svc.eqiad.wmnet (authored by ayounsi).
Add mock TLS key for ganeti-test01.svc.eqiad.wmnet
Tue, Sep 5, 2:34 PM
ayounsi created T345602: Repurpose three decom servers as temporary ganeti-test1001/1002 and ganeti-test2004.
Tue, Sep 5, 8:47 AM · Infrastructure-Foundations
ayounsi added a comment to T345601: Maintain ROAs for currently unannounced BGP assignments.

Sounds good to me!

Tue, Sep 5, 8:43 AM · netops, Infrastructure-Foundations, SRE

Mon, Sep 4

ayounsi closed T334594: TLS certificates for network devices, a subtask of T320638: Add Dell switches support to Homer/Cookbooks, as Resolved.
Mon, Sep 4, 3:04 PM · Patch-For-Review, SRE, Infrastructure-Foundations, netops
ayounsi closed T334594: TLS certificates for network devices as Resolved.

This is now working in prod.

Mon, Sep 4, 3:04 PM · SRE, Infrastructure-Foundations, netops
ayounsi moved T344968: Test depool of drmrs from Backlog to Watching on the netops board.
Mon, Sep 4, 3:00 PM · SRE, Infrastructure-Foundations, netops, Traffic
ayounsi committed rOSNE69b886b973e5: Add MTU 9000 as valid option for NTT VPLS (authored by ayounsi).
Add MTU 9000 as valid option for NTT VPLS
Mon, Sep 4, 2:45 PM

Sep 1 2023

ayounsi added a parent task for T308339: eqiad: move non WMCS servers out of rack C8: T345263: September 2023 Datacenter Switchover.
Sep 1 2023, 9:43 AM · SRE, DBA, ops-eqiad
ayounsi added a subtask for T345263: September 2023 Datacenter Switchover: T308339: eqiad: move non WMCS servers out of rack C8.
Sep 1 2023, 9:43 AM · Performance-Team, Data-Persistence, serviceops, Datacenter-Switchover, SRE

Aug 31 2023

ayounsi added a comment to T306238: Netbox Juniper report.

Thanks, I submitted the on-boarding form, let's see what happens now.

Aug 31 2023, 4:45 PM · SRE, netops, Infrastructure-Foundations, netbox
ayounsi updated subscribers of T326322: Add per-output queue monitoring for Juniper network devices.

We have data https://grafana.wikimedia.org/d/iUATvNzSz/network-queues !
And a doc: https://wikitech.wikimedia.org/wiki/Network_telemetry

Aug 31 2023, 1:56 PM · Patch-For-Review, SRE, Infrastructure-Foundations, netops
ayounsi closed T334530: Adjust routing policy to increase SSH session speed from East Asia to toolforge as Resolved.

Rolled everywhere, another example, cr1-codfw:

before
  Prefix                  Nexthop              MED     Lclpref    AS path
* 185.15.56.0/24          Self                                    ?
* 185.15.57.0/24          Self                                    ?
* 185.71.138.0/24         Self                                    I
* 198.35.27.0/24          Self                                    I
* 198.73.209.0/24         Self                                    11820 ?
* 208.80.152.0/23         Self                                    I
after
  Prefix                  Nexthop              MED     Lclpref    AS path
* 185.15.57.0/24          Self                                    ?
* 185.71.138.0/24         Self                                    I
* 198.35.27.0/24          Self                                    I
* 208.80.152.0/23         Self                                    I

The SF office as well as eqiad WMCS range are gone, but codfw WMCS is still there.
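
The before/after comparison can also be done mechanically. The following is a minimal Python sketch that parses the two outputs above and lists the prefixes that are no longer advertised; the function name `advertised_prefixes` is an assumption for this example, and the parser only handles the simple line layout shown here, not Junos output in general.

```python
import ipaddress

BEFORE = """\
* 185.15.56.0/24          Self                                    ?
* 185.15.57.0/24          Self                                    ?
* 185.71.138.0/24         Self                                    I
* 198.35.27.0/24          Self                                    I
* 198.73.209.0/24         Self                                    11820 ?
* 208.80.152.0/23         Self                                    I
"""

AFTER = """\
* 185.15.57.0/24          Self                                    ?
* 185.71.138.0/24         Self                                    I
* 198.35.27.0/24          Self                                    I
* 208.80.152.0/23         Self                                    I
"""


def advertised_prefixes(output):
    """Collect the prefixes from lines that start with '*' (hypothetical helper)."""
    prefixes = set()
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "*":
            # ip_network() raises ValueError if the field is not a valid prefix.
            prefixes.add(ipaddress.ip_network(fields[1]))
    return prefixes


withdrawn = sorted(advertised_prefixes(BEFORE) - advertised_prefixes(AFTER))
added = sorted(advertised_prefixes(AFTER) - advertised_prefixes(BEFORE))

print("withdrawn:", [str(p) for p in withdrawn])
print("added:", [str(p) for p in added])
```

The two withdrawn entries correspond to the two ranges the comment describes as gone, and nothing new is advertised.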

Aug 31 2023, 1:08 PM · Infrastructure-Foundations, netops
ayounsi added a comment to T306238: Netbox Juniper report.

@jbond from Juniper, does it make sense?

“If the customer would like to use OIDC they enter in their token for us to use and authenticate. The vast majority of users sign up requesting OAuth2.0 where we’ll build them credentials instead and share with the customer.

Aug 31 2023, 8:24 AM · SRE, netops, netbox, Infrastructure-Foundations
ayounsi added a comment to T345273: Juniper ZTP fails on certain devices due to DHCP binding on management router.

FYI there is now a pending diff for:

[edit forwarding-options dhcp-relay]
+    /* T337345 */
+    forward-snooped-clients non-configured-interfaces;

On the L3 switches. That's because the latest patch moves the statement outside of the if NOT l3_switch (else) block.
From my understanding that's the expected behavior, but as they've been working without it so far I'll leave it to you.

Aug 31 2023, 7:48 AM · netops, Infrastructure-Foundations, SRE

Aug 30 2023

ayounsi added a comment to T345273: Juniper ZTP fails on certain devices due to DHCP binding on management router.

Could we use forward-only everywhere once we move to DHCP option 97 with {T304677} ?

Aug 30 2023, 4:57 PM · netops, Infrastructure-Foundations, SRE
ayounsi added a comment to T326322: Add per-output queue monitoring for Juniper network devices.

I rolled the certificate to all the cloudsw, cr, and asw devices.
I enabled gnmic on all the cloudsw and asw devices.
I configured gnmic to pull the data from all the asw devices.

Aug 30 2023, 1:38 PM · Patch-For-Review, SRE, Infrastructure-Foundations, netops
ayounsi added a comment to T336485: Setup zero touch provisioning (ZTP) for network devices.

Before running homer, the cookbook needs to call the sre.network.tls cookbook with the device's name as parameter to add the TLS cert required by the config pushed by Homer.

Aug 30 2023, 12:12 PM · Patch-For-Review, SRE, Infrastructure-Foundations, netops, SRE-tools
ayounsi closed T327862: Use mgmt_junos on all network devices as Resolved.

Never mind, it still doesn't work on the fasw.

Aug 30 2023, 12:03 PM · SRE, netops, Infrastructure-Foundations
ayounsi reopened T327862: Use mgmt_junos on all network devices as "Open".

Re-opening as the fasw got upgraded since, so we can enable mgmt_junos

Aug 30 2023, 10:45 AM · SRE, netops, Infrastructure-Foundations