
ayounsi (Arzhel Younsi)
Network Engineer

Projects (8)

User Details

User Since
Apr 3 2017, 6:23 PM (149 w, 6 d)
Availability
Available
IRC Nick
xionox
LDAP User
Ayounsi
MediaWiki User
AYounsi (WMF) [ Global Accounts ]

Recent Activity

Fri, Feb 14

ayounsi created P10416 email-rpki-unreach.py.
Fri, Feb 14, 6:18 PM · netops
ayounsi added a comment to T245121: RRDP status alert .

It's because Grafana reports Routinator pulling data from https://rrdp.ripe.net/notification.xml as a -1 on its graph, where I think -1 means timeout.

Fri, Feb 14, 5:07 PM · Operations, netops
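To illustrate the reading of the graph given in the comment above, here is a minimal Python sketch of a probe that reports -1 when a fetch times out. That -1 means "timeout" is the comment's own guess and is not confirmed here; the `probe` function, its `opener` parameter and the 10-second default are hypothetical, and this is not how Routinator or Grafana produce the value.

```python
import socket
import urllib.error
import urllib.request

TIMEOUT_SENTINEL = -1  # assumption: mirrors the -1 seen on the graph


def probe(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return the HTTP status code for url, or -1 if the request times out.

    `opener` is injectable so the function can be exercised without network access.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a success status.
        return exc.code
    except urllib.error.URLError as exc:
        # Connection-phase timeouts are wrapped in URLError.
        if isinstance(exc.reason, (TimeoutError, socket.timeout)):
            return TIMEOUT_SENTINEL
        raise
    except (TimeoutError, socket.timeout):
        # Read-phase timeouts are raised directly.
        return TIMEOUT_SENTINEL
```

Running `probe("https://rrdp.ripe.net/notification.xml")` from two sites would show whether only one of them gets the sentinel; a host that reaches the Internet through a proxy would also need that proxy set in its environment.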

Thu, Feb 13

ayounsi triaged T245192: Investigate Juniper storm control as Medium priority.
Thu, Feb 13, 7:06 PM · Operations, Wikimedia-Incident, netops
ayounsi triaged T245188: Audit msw1-eqiad cables as Low priority.
Thu, Feb 13, 6:59 PM · Operations, ops-eqiad
ayounsi triaged T245176: Add Prometheus Squid exporter as Low priority.
Thu, Feb 13, 5:13 PM · observability
ayounsi added a comment to T245121: RRDP status alert .

I think this means that the query to that URL times out.
As it completes properly from codfw I'm wondering if it's not an issue with the webproxies (overloaded or similar).

Thu, Feb 13, 5:06 PM · Operations, netops
ayounsi added a comment to T245164: Alert for device ps1-a8-codfw.mgmt.codfw.wmnet - Device rebooted.

Still from LibreNMS:

2020-02-13 15:46:52 notice ps1-a8-codfw SENTRY3_5179AF] EVENT: System boot complete notice
2020-02-13 15:46:52 notice ps1-a8-codfw NO MATCH [Sentry3_5179af] EVENT: TCP/IP stack has started notice

Thu, Feb 13, 4:05 PM · Operations, ops-codfw
ayounsi triaged T245164: Alert for device ps1-a8-codfw.mgmt.codfw.wmnet - Device rebooted as Medium priority.
Thu, Feb 13, 4:03 PM · Operations, ops-codfw
ayounsi added a parent task for T245158: ganeti doesn't change the boot order to network: T244585: Upgrade rpki VMs to buster.
Thu, Feb 13, 3:07 PM · Operations
ayounsi added a subtask for T244585: Upgrade rpki VMs to buster: T245158: ganeti doesn't change the boot order to network.
Thu, Feb 13, 3:07 PM · Operations
ayounsi triaged T245158: ganeti doesn't change the boot order to network as High priority.
Thu, Feb 13, 3:06 PM · Operations
ayounsi closed T244584: Upgrade ping VMs to buster as Resolved.

Done.

Thu, Feb 13, 2:24 PM · Operations
nshahquinn-wmf awarded T241961: VisualEditor was removed from Wikitech because Parsoid/PHP isn't yet compatible with how Wikitech is set up a The World Burns token.
Thu, Feb 13, 3:01 AM · Parsoid, wikitech.wikimedia.org, Operations, VisualEditor
ayounsi added a comment to T243080: Upgrade routers.

Feb. 18th - 13:00UTC - 2h - cr2/3-esams

Thu, Feb 13, 1:11 AM · Operations, netops

Wed, Feb 12

ayounsi added a comment to T243080: Upgrade routers.

cr1-eqsin is back to normal, next step is to plan esams.

Wed, Feb 12, 11:53 PM · Operations, netops
ayounsi updated the task description for T243080: Upgrade routers.
Wed, Feb 12, 2:35 PM · Operations, netops
ayounsi closed T244944: cr1-eqsin routing engine crashlooping after JunOS upgrade as Resolved.

Looks solved.

Wed, Feb 12, 1:40 PM · Operations, netops
ayounsi added a comment to T244944: cr1-eqsin routing engine crashlooping after JunOS upgrade.

This is a known bug (present in the JTAC-recommended release), and we need to upgrade to the next S release (S7).

Wed, Feb 12, 12:57 PM · Operations, netops
ayounsi added a comment to T244944: cr1-eqsin routing engine crashlooping after JunOS upgrade.

Attached RSI and /var/log/ as well as replied to their initial questions.

Wed, Feb 12, 1:50 AM · Operations, netops
ayounsi added a comment to T244944: cr1-eqsin routing engine crashlooping after JunOS upgrade.

Opened JTAC Service Request ID: 2020-0211-0750

Wed, Feb 12, 1:08 AM · Operations, netops
ayounsi added a comment to T243080: Upgrade routers.

cr1-eqsin RPD keeps crashing and core-dumping since the upgrade.
Opened JTAC Service Request ID: 2020-0211-0750

Wed, Feb 12, 1:05 AM · Operations, netops

Tue, Feb 11

ayounsi added a comment to T244761: Script to point SRE local machine traffic to another LB.

Anyone have a good idea on a name for such a thing?

closest_pop_on_fire.sh

Tue, Feb 11, 4:41 PM · Operations
ayounsi committed rOSNEd0aa8fd5f187: Blacklist eqsin PDU link device (authored by ayounsi).
Blacklist eqsin PDU link device
Tue, Feb 11, 3:48 PM

Mon, Feb 10

ayounsi triaged T244761: Script to point SRE local machine traffic to another LB as Low priority.
Mon, Feb 10, 4:28 PM · Operations

Sun, Feb 9

ayounsi added a comment to T232602: GRE MTU mitigations - Tracking.

T244610 too.

Sun, Feb 9, 10:19 PM · Operations, Traffic

Fri, Feb 7

ayounsi changed the edit policy for Routing knowledge.
Fri, Feb 7, 6:08 PM · netops
ayounsi triaged T244585: Upgrade rpki VMs to buster as Low priority.
Fri, Feb 7, 5:15 PM · Operations
ayounsi triaged T244584: Upgrade ping VMs to buster as Low priority.
Fri, Feb 7, 5:13 PM · Operations
ayounsi added a comment to T243080: Upgrade routers.

Current dates are:
Feb. 11th - 21:00UTC - 1h - cr1-eqsin - eqsin will be depooled (this is when eqsin sees the least traffic)
Feb. 12th - 13:00UTC - 2h - cr2/3-esams

Fri, Feb 7, 3:53 PM · Operations, netops
ayounsi added a comment to T242602: Sort out plan for install* servers in edge sites.

One use case I have of the install1002 server is:

  1. Download a Junos software image from Juniper to install1002
  2. Move it to /srv/junos/
  3. Fetch it over https with for example: file copy "https://install1002.wikimedia.org/junos/jinstall-ppc-17.3R3-S6.3-signed.tgz" /var/tmp/jinstall-ppc-17.3R3-S6.3-signed.tgz

Note that browsing that URL with Firefox shows an https error:

Firefox does not trust this site because it uses a certificate that is not valid for install1002.wikimedia.org. The certificate is only valid for apt.wikimedia.org.

But Junos is (so far) smart enough to ignore it. /s

Fri, Feb 7, 3:35 PM · Patch-For-Review, Operations
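The Firefox error quoted above comes down to a name comparison: the requested hostname is not among the names the certificate is valid for. A simplified Python sketch of that check follows. It is an illustration only (exact match plus a single left-most wildcard label); real TLS clients apply stricter rules, and `hostname_matches` is a hypothetical helper, not part of any library.

```python
def hostname_matches(cert_names, hostname):
    """Return True if hostname matches any name listed in the certificate.

    Simplified: exact match, or a single left-most "*." wildcard label.
    """
    host = hostname.lower().rstrip(".")
    for name in cert_names:
        name = name.lower().rstrip(".")
        if name == host:
            return True
        if name.startswith("*."):
            suffix = name[1:]  # "*.example.org" -> ".example.org"
            if host.endswith(suffix):
                label = host[: -len(suffix)]
                # The wildcard covers exactly one non-empty label.
                if label and "." not in label:
                    return True
    return False


# Name taken from the Firefox error quoted above.
cert_names = ["apt.wikimedia.org"]
print(hostname_matches(cert_names, "install1002.wikimedia.org"))  # False
print(hostname_matches(cert_names, "apt.wikimedia.org"))          # True
```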
ayounsi added a comment to T240659: BFD session alerts due to inconsistent status on cr3-knams.

cr3-knams got upgraded to 18 yesterday. Waiting to see if the issue happens again.

Fri, Feb 7, 2:09 PM · Operations, netops
ayounsi added a comment to T244497: cr3-knams:xe-0/1/3 down.

Opened T244574.

Fri, Feb 7, 2:05 PM · netops, Operations
ayounsi reassigned T244497: cr3-knams:xe-0/1/3 down from ayounsi to faidon.

As it matches the reboot of cr3-knams, I'd say the optic on that side needs to be replaced (maybe the power fluctuation damaged it?).

Fri, Feb 7, 1:31 PM · netops, Operations

Thu, Feb 6

ayounsi added a comment to T244497: cr3-knams:xe-0/1/3 down.

Other side doesn't receive the light though:

ayounsi@asw2-esams> show interfaces diagnostics optics xe-6/0/4   
Physical interface: xe-6/0/4
    Laser output power                        :  1.3540 mW / 1.32 dBm
    Laser receiver power                      :  0.0001 mW / -40.00 dBm
Thu, Feb 6, 4:03 PM · netops, Operations
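The optics output above shows each power level in both mW and dBm. The two are related by dBm = 10 * log10(P / 1 mW), which the short Python sketch below applies to the two readings; the function names are illustrative.

```python
import math


def mw_to_dbm(milliwatts):
    """Convert optical power from milliwatts to dBm."""
    return 10 * math.log10(milliwatts)


def dbm_to_mw(dbm):
    """Convert optical power from dBm to milliwatts."""
    return 10 ** (dbm / 10)


print(f"{mw_to_dbm(1.3540):.2f} dBm")  # 1.32 dBm  (laser output power)
print(f"{mw_to_dbm(0.0001):.2f} dBm")  # -40.00 dBm (laser receiver power)
```

A receive level of -40 dBm is one ten-thousandth of a milliwatt, which is consistent with the comment that this side is not receiving light.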
ayounsi triaged T244497: cr3-knams:xe-0/1/3 down as High priority.
Thu, Feb 6, 4:00 PM · netops, Operations

Wed, Feb 5

ayounsi triaged T244363: Homer: commit timeout on MX104 and SRXs as Medium priority.
Wed, Feb 5, 2:39 PM · Operations, SRE-tools
ayounsi triaged T244362: Homer: commit> no causes stacktrace as Low priority.
Wed, Feb 5, 2:35 PM · Operations, SRE-tools

Tue, Feb 4

ayounsi updated the task description for T205897: Netbox: fill network topology.
Tue, Feb 4, 10:36 PM · netbox, Operations
ayounsi closed T228388: Configuration management for network operations as Resolved.
Tue, Feb 4, 10:34 PM · Patch-For-Review, Wikimedia-Incident, Operations, Goal, netops, SRE-tools
ayounsi added a comment to T228388: Configuration management for network operations.

Everything here is done.
The doc is at https://wikitech.wikimedia.org/wiki/Homer and has been tested by SREs other than Riccardo or me.

Tue, Feb 4, 10:34 PM · Patch-For-Review, Wikimedia-Incident, Operations, Goal, netops, SRE-tools
ayounsi updated the task description for T228388: Configuration management for network operations.
Tue, Feb 4, 10:32 PM · Patch-For-Review, Wikimedia-Incident, Operations, Goal, netops, SRE-tools
ayounsi closed T240670: WMCS: cleanup network allocations as Resolved.

I think everything is done here?

Tue, Feb 4, 10:27 PM · netops, Operations, cloud-services-team (Kanban)
ayounsi added a comment to T244238: Upgrade and restart m1 master (db1135).

Anytime works for LibreNMS.

Tue, Feb 4, 2:24 PM · Wikimedia-Etherpad, DBA, Operations
ayounsi added a comment to T244196: codfw: Delete cloud interface-range.

Yep, it's fine to delete it if there are no more member interfaces.

Tue, Feb 4, 12:37 PM · Operations, ops-codfw, netops

Mon, Feb 3

ayounsi added a comment to T240659: BFD session alerts due to inconsistent status on cr3-knams.

Now that the issue is on the cr1-eqiad to cr3-knams link, I'm going to push the following:

cr3-knams
[edit system syslog]
     file messages { ... }
+    file bfd-fw {
+        firewall any;
+    }
[edit interfaces xe-0/1/5 unit 13 family inet6]
+       filter {
+           input test-in;
+           output test-out;
+       }
[edit firewall family inet6]
+     filter test-in {
+         term 10 {
+             from {
+                 next-header udp;
+                 port 3784;
+             }
+             then {
+                 count bfd-in;
+                 syslog;
+                 accept;
+             }
+         }
+         term 20 {
+             then accept;
+         }
+     }
+     filter test-out {
+         term 10 {
+             from {
+                 next-header udp;
+                 port 3784;
+             }
+             then {
+                 count bfd-out;
+                 syslog;
+                 accept;
+             }
+         }
+         term 20 {
+             then accept;
+         }
+     }
      filter border-in6 { ... }
cr1-eqiad
[edit system syslog]
     file messages { ... }
+    file bfd-fw {
+        firewall any;
+    }
[edit interfaces xe-4/2/2 unit 13 family inet6]
+       filter {
+           input test-in;
+           output test-out;
+       }
[edit firewall family inet6]
+     filter test-in {
+         term 10 {
+             from {
+                 next-header udp;
+                 port 3784;
+             }
+             then {
+                 count bfd-in;
+                 syslog;
+                 accept;
+             }
+         }
+         term 20 {                     
+             then accept;
+         }
+     }
+     filter test-out {
+         term 10 {
+             from {
+                 next-header udp;
+                 port 3784;
+             }
+             then {
+                 count bfd-out;
+                 syslog;
+                 accept;
+             }
+         }
+         term 20 {
+             then accept;
+         }
+     }
      filter border-in6 { ... }
Mon, Feb 3, 3:33 PM · Operations, netops
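The two diffs above are identical apart from the interface they attach to. As a sketch of that symmetry, the Python below renders what should be the equivalent `set`-style commands for one device from an interface name and unit. It is an illustration, not how the change was actually pushed; `bfd_debug_commands` is a hypothetical helper, and the filter, counter and syslog file names are copied from the diffs.

```python
def bfd_debug_commands(interface, unit, port=3784):
    """Render set commands that count and log BFD (UDP/3784) packets over IPv6."""
    commands = ["set system syslog file bfd-fw firewall any"]
    for direction, keyword in (("in", "input"), ("out", "output")):
        name = f"test-{direction}"
        base = f"set firewall family inet6 filter {name}"
        commands += [
            # Term 10: match BFD packets, count them, log them, let them through.
            f"{base} term 10 from next-header udp",
            f"{base} term 10 from port {port}",
            f"{base} term 10 then count bfd-{direction}",
            f"{base} term 10 then syslog",
            f"{base} term 10 then accept",
            # Term 20: accept everything else so the filter is not disruptive.
            f"{base} term 20 then accept",
            # Attach the filter to the interface in the matching direction.
            f"set interfaces {interface} unit {unit} family inet6 filter {keyword} {name}",
        ]
    return commands


for device, interface in (("cr3-knams", "xe-0/1/5"), ("cr1-eqiad", "xe-4/2/2")):
    print(f"# {device}")
    print("\n".join(bfd_debug_commands(interface, 13)))
```

Comparing the `bfd-out` counter on one router with the `bfd-in` counter on the other then shows whether BFD packets are lost on the link or never sent.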
ayounsi removed a project from T244127: cp3057 crash (was: network down): netops.
Mon, Feb 3, 12:14 PM · ops-esams, Operations, Traffic

Thu, Jan 30

ayounsi added a comment to T211706: Superset Updates .

Playing around with Superset, I came across these already-reported bugs:
https://github.com/apache/incubator-superset/issues/7649
https://github.com/apache/incubator-superset/issues/7327
Can be observed in https://superset.wikimedia.org/superset/explore/?form_data=%7B%22slice_id%22%3A%20399%7D
and https://superset.wikimedia.org/superset/explore/?form_data=%7B%22slice_id%22%3A%20401%7D

Thu, Jan 30, 10:07 PM · Better Use Of Data, Analytics-Kanban, Product-Analytics

Wed, Jan 29

ayounsi added a comment to T223934: Add annotations from ops vendor maintenance calendar to Grafana.

See also T230835.

Wed, Jan 29, 7:08 PM · Operations

Tue, Jan 28

ayounsi closed T243821: mr1-eqiad.oob IPv6 is down as Resolved.

Router bug, confirmed fixed by NTT.

Tue, Jan 28, 11:35 PM · Operations
ayounsi added a comment to T243821: mr1-eqiad.oob IPv6 is down.
not working
ayounsi@icinga1001:~$ mtr -z --report-wide 2607:f6f0:205::153
Start: Tue Jan 28 19:51:46 2020
HOST: icinga1001                                          Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS14907  ae3-1003.cr1-eqiad.wikimedia.org             0.0%    10    0.3   0.5   0.3   1.3   0.0
  2. AS2914   xe-0-0-28-0.a03.asbnva02.us.bb.gin.ntt.net   0.0%    10    0.5   4.3   0.3  20.0   7.9
  3. AS2914   ae-70.r06.asbnva02.us.bb.gin.ntt.net         0.0%    10    1.3   1.0   0.9   1.3   0.0
  4. AS???    ???                                         100.0    10    0.0   0.0   0.0   0.0   0.0
ayounsi@icinga1001:~$
working
ayounsi@bast4002:~$ mtr -z --report-wide 2607:f6f0:205::153
Start: Tue Jan 28 19:51:47 2020
HOST: bast4002                                             Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS14907  et-1-0-1-1201.cr3-ulsfo.wikimedia.org         0.0%    10    0.3   3.5   0.2  30.4   9.4
  2. AS14907  et-0-0-0-2.cr4-ulsfo.wikimedia.org            0.0%    10    0.5   0.7   0.2   4.0   1.1
  3. AS2914   xe-0-1-0-3-6.r05.plalca01.us.bb.gin.ntt.net   0.0%    10    1.3   1.4   1.3   1.6   0.0
  4. AS2914   ae-15.r01.snjsca04.us.bb.gin.ntt.net          0.0%    10    2.1   2.0   2.0   2.1   0.0
  5. AS2914   ae-1.r22.snjsca04.us.bb.gin.ntt.net           0.0%    10    1.9   1.9   1.8   2.2   0.0
  6. AS2914   ae-7.r23.asbnva02.us.bb.gin.ntt.net           0.0%    10   64.0  64.3  64.0  65.0   0.0
  7. AS2914   ae-2.r05.asbnva02.us.bb.gin.ntt.net           0.0%    10   68.5  68.5  68.4  68.6   0.0
  8. AS2914   ae-0.a02.asbnva02.us.bb.gin.ntt.net           0.0%    10   69.8  71.1  67.8  78.3   3.4
  9. AS2914   2001:418:0:5000::aab                          0.0%    10   61.5  62.4  61.4  70.4   2.8
 10. AS12085  2607:f6f0:1000:1af::2                         0.0%    10   61.9  64.8  61.8  90.9   9.2
 11. AS12085  ge-0-0-5.mr1-eqiad.wikimedia.org              0.0%    10   63.1  64.1  63.0  72.6   2.9
Tue, Jan 28, 8:01 PM · Operations
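When comparing reports like the two above, it can help to extract the hop where loss reaches 100%. The Python sketch below parses the `mtr -z --report-wide` layout shown in the comment; the column positions are an assumption based on that output only, and the function names are illustrative.

```python
SAMPLE = """\
Start: Tue Jan 28 19:51:46 2020
HOST: icinga1001                                          Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS14907  ae3-1003.cr1-eqiad.wikimedia.org             0.0%    10    0.3   0.5   0.3   1.3   0.0
  2. AS2914   xe-0-0-28-0.a03.asbnva02.us.bb.gin.ntt.net   0.0%    10    0.5   4.3   0.3  20.0   7.9
  3. AS2914   ae-70.r06.asbnva02.us.bb.gin.ntt.net         0.0%    10    1.3   1.0   0.9   1.3   0.0
  4. AS???    ???                                         100.0    10    0.0   0.0   0.0   0.0   0.0
"""


def parse_mtr_report(text):
    """Return one dict per hop line; header and prompt lines are skipped."""
    hops = []
    for line in text.splitlines():
        fields = line.split()
        # Hop lines start with "N." followed by AS, host and loss columns.
        if len(fields) < 5 or not fields[0].endswith("."):
            continue
        number = fields[0].rstrip(".")
        if not number.isdigit():
            continue
        hops.append({
            "hop": int(number),
            "asn": fields[1],
            "host": fields[2],
            "loss": float(fields[3].rstrip("%")),
        })
    return hops


def first_black_hole(hops):
    """Return the first hop with 100% loss, or None if every hop answered."""
    for hop in hops:
        if hop["loss"] >= 100.0:
            return hop
    return None


hops = parse_mtr_report(SAMPLE)
bad = first_black_hole(hops)
print(bad["hop"], bad["host"])  # 4 ???
```

Total loss at an intermediate hop can simply mean that router does not answer probes; it is meaningful here because no later hop responds either.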
ayounsi closed T243821: mr1-eqiad.oob IPv6 is down as Resolved.

It's back, I'd guess a transient error on Equinix's network. Not worth investigating it more as it's now up and only OOB.

Tue, Jan 28, 7:17 PM · Operations

Wed, Jan 22

ayounsi closed Restricted Task, a subtask of T242265: rack/setup/install frlog2001.frack.codfw.wmnet, as Resolved.
Wed, Jan 22, 4:14 PM · Operations, ops-codfw, fundraising-tech-ops

Tue, Jan 21

ayounsi renamed T242097: mr1-esams i2c syslog flood from mr1-esams RMA (2020 edition) to mr1-esams i2c syslog flood.
Tue, Jan 21, 8:47 PM · Operations, netops
ayounsi added a comment to T242097: mr1-esams i2c syslog flood.

Errors are still there...

Tue, Jan 21, 8:19 PM · Operations, netops
ayounsi added a comment to T240659: BFD session alerts due to inconsistent status on cr3-knams.

Note that the above probably reset the sessions, as they are now up.

Tue, Jan 21, 7:30 PM · Operations, netops
ayounsi added a comment to T240659: BFD session alerts due to inconsistent status on cr3-knams.

From JTAC:

Been checking this issue with one of my seniors, on MX204 cr3-knams can we set on the below command:
Tue, Jan 21, 7:19 PM · Operations, netops
ayounsi added a comment to T213843: Juniper network device audit - all sites.

Setting aside the power supplies (which the CR above is for), there are still inconsistencies, but the list is progressively shrinking.
Some are fixed in one of their databases but are not yet reflected in the my.juniper.net portal.
  • cr2-esams says support missing, while the entitlement tool says support is active
  • cr3-knams says city mismatch
  • mr1-esams says missing
  • many decommissioned devices are still present in my.juniper.net (maybe we should ignore those?)
  • many FPCs, MICs, and REs report as "not present in Juniper Installed Base", but some (e.g. FPCs) are present in the entitlement tool (because they are under dedicated warranty)

Tue, Jan 21, 6:40 PM · DC-Ops, netops, Operations

Jan 17 2020

ayounsi updated the task description for T243080: Upgrade routers.
Jan 17 2020, 4:06 PM · Operations, netops
ayounsi triaged T243080: Upgrade routers as Low priority.
Jan 17 2020, 3:16 PM · Operations, netops
ayounsi added a comment to T242097: mr1-esams i2c syslog flood.

JTAC recommends upgrading to the current recommended Junos release, 18.2R3-S2.9.

Jan 17 2020, 11:59 AM · Operations, netops
ayounsi closed T243002: asw-b-codfw: fixes for openstack as Resolved.

Synced up on IRC, change pushed.

Jan 17 2020, 11:03 AM · Operations, netops, cloud-services-team (Kanban)
ayounsi closed T243002: asw-b-codfw: fixes for openstack, a subtask of T240357: cloudnet2003-dev.codfw.wmnet doesn't actually work as a network node, as Resolved.
Jan 17 2020, 11:03 AM · cloud-services-team (Kanban)
ayounsi added a comment to T243002: asw-b-codfw: fixes for openstack.
ayounsi@asw-b-codfw# show | compare 
[edit interfaces]
    interface-range vlan-private1-a-codfw { ... }
+   interface-range cloud-net-trunk {
+       member ge-5/0/42;
+       member ge-1/0/17;
+       mtu 9192;
+       unit 0 {
+           family ethernet-switching {
+               interface-mode trunk;
+               vlan {
+                   members [ cloud-instances2-b-codfw cloud-instance-transport1-b-codfw ];
+               }
+           }
+       }
+   }
-   interface-range cloud-instance-ports {
-       member ge-1/0/17;
-       unit 0 {
-           family ethernet-switching {
-               interface-mode trunk;
-               vlan {
-                   members cloud-instances1-b-codfw;
-               }
-           }
-       }
-   }                                   
[edit interfaces ge-5/0/42]
-   enable;
-   unit 0 {
-       family ethernet-switching {
-           interface-mode trunk;
-           vlan {
-               members [ cloud-instances2-b-codfw cloud-instance-transport1-b-codfw cloud-instances1-b-codfw ];
-           }
-       }
-   }
Jan 17 2020, 10:56 AM · Operations, netops, cloud-services-team (Kanban)
ayounsi changed the profile image for blog Routing knowledge.
Jan 17 2020, 10:21 AM · netops
ayounsi updated the description for Routing knowledge.
Jan 17 2020, 9:41 AM · netops
ayounsi changed the header image for blog Routing knowledge.
Jan 17 2020, 9:41 AM · netops
ayounsi changed the edit policy for Routing knowledge.
Jan 17 2020, 9:26 AM · netops
ayounsi changed the header image for blog Routing knowledge.
Jan 17 2020, 7:42 AM · netops
ayounsi changed the header image for blog Routing knowledge.
Jan 17 2020, 7:41 AM · netops
ayounsi updated the subtitle for Routing knowledge.
Jan 17 2020, 7:36 AM · netops

Jan 16 2020

ayounsi closed T220669: RPKI Validation as Resolved.

We now reject all RPKI-invalid routes, from peering and transits, without any default route.
So far everything looks good. Blogpost to follow in the next few days.

Jan 16 2020, 1:09 PM · Patch-For-Review, Operations, netops
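For readers unfamiliar with what "RPKI invalid" means in the comment above, here is a minimal Python sketch of route origin validation in the style of RFC 6811. It is not how the routers or Routinator implement it; the validated ROA payloads (VRPs) below use documentation prefixes and private-use AS numbers and are purely hypothetical.

```python
import ipaddress


def validate_origin(route_prefix, origin_asn, vrps):
    """Classify a route as "valid", "invalid" or "not-found".

    vrps is an iterable of (prefix, max_length, asn) tuples.
    """
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for prefix, max_length, asn in vrps:
        vrp_net = ipaddress.ip_network(prefix)
        if vrp_net.version != route.version or not route.subnet_of(vrp_net):
            continue  # this VRP does not cover the route
        covered = True
        if route.prefixlen <= max_length and asn == origin_asn and asn != 0:
            return "valid"
    # Covered by at least one VRP but matched by none: invalid.
    return "invalid" if covered else "not-found"


def accept(state):
    """Policy from the comment above: reject invalids, keep everything else."""
    return state != "invalid"


vrps = [("192.0.2.0/24", 24, 64496)]  # hypothetical VRP
print(validate_origin("192.0.2.0/24", 64496, vrps))     # valid
print(validate_origin("192.0.2.0/24", 64497, vrps))     # invalid (wrong origin AS)
print(validate_origin("192.0.2.0/25", 64496, vrps))     # invalid (longer than max length)
print(validate_origin("198.51.100.0/24", 64496, vrps))  # not-found
```

Only the "invalid" state is dropped; routes with no covering ROA are still accepted, which is why most of the routing table is unaffected by such a policy.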

Jan 15 2020

ayounsi closed T190090: Offload pings to dedicated server as Resolved.

Done, dashboard and doc updated.

Jan 15 2020, 9:56 AM · Patch-For-Review, netops, Traffic, Operations
ayounsi added a comment to T242828: Add POP Ganeti clusters to makevm cookbook.

Slightly related, the makevm script on the Ganeti clusters only accepts a single character in the "row" question:

Please enter the correct row. (A, B or C - gnt-group list to show)
O
Jan 15 2020, 8:27 AM · Operations
ayounsi triaged T242828: Add POP Ganeti clusters to makevm cookbook as Low priority.
Jan 15 2020, 8:00 AM · Operations

Jan 14 2020

ayounsi added a comment to T242481: d-i fails to install on servers with BRCM 2P 1G BT + 2P 10G SFP NDC.

Is that because of different cables/connectors?

Indeed, 1G switch ports are RJ45, 10G are SFP. We could try to put an SFP-T on the server side. But I don't think it will work.

Jan 14 2020, 4:02 PM · Operations, DBA, ops-codfw
ayounsi added a comment to T242481: d-i fails to install on servers with BRCM 2P 1G BT + 2P 10G SFP NDC.
  • Enable to 10G even though it will go to a 1G switch port? Is that even possible?

Not afaik.

Jan 14 2020, 3:57 PM · Operations, DBA, ops-codfw
ayounsi claimed T190090: Offload pings to dedicated server.
Jan 14 2020, 2:18 PM · Patch-For-Review, netops, Traffic, Operations
ayounsi closed T242318: Stale LibreNMS ports as Resolved.
root@cumin1001:~# for i in `mysql.py -hdb1135 -e "select table_name from information_schema.columns where column_name like 'device_id'" -BN`; do echo $i; mysql.py -hdb1135 librenms -e "delete from $i where device_id=20 limit 1;";done

Then checked that the right records got deleted.
Then deleted them all.

Jan 14 2020, 9:06 AM · netops, Operations
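The approach above (find every table with a `device_id` column, test the delete on a small scale, verify, then delete everything) can be sketched in Python. This sketch deliberately swaps in an in-memory SQLite database for MySQL and its `information_schema`, and a dry-run row count for the `limit 1` test delete; the table names and rows are hypothetical and unrelated to the real LibreNMS schema.

```python
import sqlite3


def tables_with_column(conn, column):
    """Return the names of all tables that have the given column."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    found = []
    for table in tables:
        # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        if column in columns:
            found.append(table)
    return found


def purge_device(conn, device_id, dry_run=True):
    """Count, and unless dry_run delete, rows for device_id in every matching table."""
    counts = {}
    with conn:  # commits on success, rolls back if anything raises
        for table in tables_with_column(conn, "device_id"):
            (count,) = conn.execute(
                f'SELECT COUNT(*) FROM "{table}" WHERE device_id = ?',
                (device_id,)).fetchone()
            counts[table] = count
            if not dry_run:
                conn.execute(f'DELETE FROM "{table}" WHERE device_id = ?', (device_id,))
    return counts


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ports (port_id INTEGER, device_id INTEGER);
    CREATE TABLE sensors (sensor_id INTEGER, device_id INTEGER);
    CREATE TABLE users (user_id INTEGER);
    INSERT INTO ports VALUES (1, 20), (2, 20), (3, 7);
    INSERT INTO sensors VALUES (1, 20);
""")
print(purge_device(conn, 20))                 # {'ports': 2, 'sensors': 1}
print(purge_device(conn, 20, dry_run=False))  # same counts, rows now deleted
print(purge_device(conn, 20))                 # {'ports': 0, 'sensors': 0}
```

Reviewing the dry-run counts before the real run plays the same role as checking that the single-row deletes hit the right records.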
ayounsi reassigned T190090: Offload pings to dedicated server from ayounsi to BBlack.

That sounds like a good idea to me, @BBlack for a final opinion, and I can take care of it this Q if good to go.

Jan 14 2020, 8:50 AM · Patch-For-Review, netops, Traffic, Operations
ayounsi triaged T242715: Anycast for webproxies as Medium priority.
Jan 14 2020, 8:29 AM · Operations

Jan 13 2020

ayounsi added a comment to T236744: track NIC firmware version numbers across the fleet.

This might help issues like T242481

Jan 13 2020, 9:09 AM · Patch-For-Review, Operations, Traffic
ayounsi added a comment to T240659: BFD session alerts due to inconsistent status on cr3-knams.

Removed BFD traceoptions on cr1-eqiad, keeping knams-eqdfw down for JTAC investigation.

Jan 13 2020, 7:34 AM · Operations, netops

Jan 9 2020

ayounsi added a comment to T242318: Stale LibreNMS ports.

From @Marostegui, the list of tables that have rows with device_id = 20: P10095#59005

Jan 9 2020, 8:13 AM · netops, Operations
ayounsi triaged T242318: Stale LibreNMS ports as Low priority.
Jan 9 2020, 7:32 AM · netops, Operations
ayounsi added a comment to T240906: CA App Synthetic Monitor Mail (SMTP): Connection timed out; connect(): -2.
  • Is it always the same source Watchmouse probe failing or "random" ones?
  • What does the check do exactly? (TCP, more L7 checks?)
  • Is the check configured to retry or email on the first issue?
  • As discussed during the meeting, reaching out to Watchmouse's support is probably the best next step if we have ruled out all issues on our side
  • A packet capture of a working vs. non working probing would be useful but tricky to get
Jan 9 2020, 5:38 AM · Operations, Mail

Jan 8 2020

ayounsi closed T242197: Upgrade routinator to 0.6.4 as Resolved.

Grafana dashboard updated as well to expose RRDP stats.

Jan 8 2020, 4:49 PM · Patch-For-Review, Operations, netops
ayounsi added a comment to T242197: Upgrade routinator to 0.6.4.

Confirmed that RRDP works with the proxies:
Jan 8 14:30:27 rpki2001 routinator[11771]: Response: '200 OK' for https://rpki.[...]

Jan 8 2020, 2:32 PM · Patch-For-Review, Operations, netops
ayounsi triaged T242197: Upgrade routinator to 0.6.4 as Low priority.
Jan 8 2020, 9:11 AM · Patch-For-Review, Operations, netops
ayounsi changed the status of T240817: Routinator RSYNC errors from Open to Stalled.
Jan 8 2020, 8:58 AM · Operations, netops

Jan 7 2020

ayounsi claimed T242097: mr1-esams i2c syslog flood.
Jan 7 2020, 11:22 AM · Operations, netops
ayounsi triaged T242097: mr1-esams i2c syslog flood as Medium priority.
Jan 7 2020, 11:20 AM · Operations, netops
ayounsi added a comment to T167689: Add RIPE atlas data to Prometheus.

Thanks, this is really nice!

Jan 7 2020, 8:47 AM · observability, Operations
ayounsi closed T241962: Upgrade LibreNMS to 1.59 as Resolved.

Went smoothly.

Jan 7 2020, 8:22 AM · Operations, netops

Jan 6 2020

ayounsi triaged T241965: Use check_dns_query for anycast DNS checks as Low priority.
Jan 6 2020, 7:08 AM · observability, Operations
ayounsi triaged T241962: Upgrade LibreNMS to 1.59 as Low priority.
Jan 6 2020, 5:11 AM · Operations, netops
ayounsi placed T241374: fastnetmon misreports attack type and protocol up for grabs.

Known issue: https://github.com/pavel-odintsov/fastnetmon/issues/787#issuecomment-570740316
I don't see it being solved anytime soon.

Jan 6 2020, 5:04 AM · Patch-For-Review, Operations, netops
ayounsi triaged T241961: VisualEditor was removed from Wikitech because Parsoid/PHP isn't yet compatible with how Wikitech is set up as Medium priority.
Jan 6 2020, 4:56 AM · Parsoid, wikitech.wikimedia.org, Operations, VisualEditor

Jan 3 2020

ayounsi added a comment to T228387: Bare metal cloud: management interfaces.

LGTM, a couple of bikeshed-like comments:

Jan 3 2020, 12:12 PM · Patch-For-Review, User-crusnov, Goal, SRE-tools

Jan 2 2020

ayounsi reopened T240817: Routinator RSYNC errors as "Open".

Opened https://github.com/NLnetLabs/routinator/issues/267 upstream, as rsync://localhost/repo/ has been alerting for 10 days now and there is not much we can do.

Jan 2 2020, 1:29 PM · Operations, netops
ayounsi claimed T241374: fastnetmon misreports attack type and protocol .

Opened https://github.com/pavel-odintsov/fastnetmon/issues/787

Jan 2 2020, 9:06 AM · Patch-For-Review, Operations, netops