ayounsi (Arzhel Younsi)
Network Engineer

User Details

User Since
Apr 3 2017, 6:23 PM (80 w, 1 d)
Availability
Available
IRC Nick
xionox
LDAP User
Ayounsi
MediaWiki User
AYounsi (WMF)

Recent Activity

Today

ayounsi updated the task description for T204170: Rack/setup cr2-eqord.
Wed, Oct 17, 11:10 AM · netops, Operations, ops-eqiad
ayounsi added a comment to T205985: Renumber office-DC interconnect link.

I installed Quagga in a VM to verify the commands, but there will most likely be differences with the office Quagga.
The only thing that might need to be added is an iptables update.
All steps are easy to revert if complications occur.

Wed, Oct 17, 10:43 AM · Patch-For-Review, Operations, netops
ayounsi closed T207035: relabel switch interfaces formerly saiph.frack.codfw.wmnet to frpig2001.frack.codfw.wmnet as Resolved.

Renamed.

Wed, Oct 17, 8:28 AM · netops, Operations
ayounsi closed T207035: relabel switch interfaces formerly saiph.frack.codfw.wmnet to frpig2001.frack.codfw.wmnet, a subtask of T203521: rename saiph.frack.codfw.wmnet to frpig2001.frack.codfw.wmnet and reimage with Debian Stretch, as Resolved.
Wed, Oct 17, 8:27 AM · Patch-For-Review, fundraising-tech-ops
ayounsi added a comment to T207138: Document eqsin power connections in Netbox.

Oct 14 19:05:27 asw1-eqsin craftd[1962]: Minor alarm cleared, FPC 0 PEM 0 is not powered
Oct 14 19:05:28 asw1-eqsin craftd[1962]: Minor alarm cleared, FPC 1 PEM 0 is not powered

Wed, Oct 17, 8:12 AM · Traffic, Operations
ayounsi reopened T203719: Interface errors on cr2-eqiad:xe-4/0/0 as "Open".

Re-opening as we're seeing errors again (at a lower rate, but errors nonetheless)

Wed, Oct 17, 7:52 AM · Operations, netops, ops-eqiad

Yesterday

ayounsi added a comment to T206331: Git push and pull don't complete.

@ayounsi Let me know if this is not OK with you. I swapped cobalt's IPv4/6 addresses in the analytics-in4/6 filters, as they did not seem correct to me.

Tue, Oct 16, 3:21 PM · User-Elukey, Analytics, Analytics-Wikistats
ayounsi added a comment to T122406: Consider renumbering Labs to separate address spaces.

No objections from me.

Tue, Oct 16, 2:45 PM · Cloud-Services, netops, Operations
ayounsi closed T203261: cr2-eqdfw (MX204) vhclient log noise as Resolved.

Confirmed no more noisy logs.

Tue, Oct 16, 2:36 PM · netops, Operations
ayounsi added a comment to T203261: cr2-eqdfw (MX204) vhclient log noise.

Steps for the upgrade:

  • Verify image checksum and validate:
request system software validate /var/tmp/junos-vmhost-install-mx-x86-64-17.4R2.4.tgz
  • Start upgrade process:
request system software add /var/tmp/junos-vmhost-install-mx-x86-64-17.4R2.4.tgz no-validate no-copy
  • Downtime monitoring
  • Disable external BGP sessions on cr2:
deactivate protocols bgp group IX4
deactivate protocols bgp group IX6
deactivate protocols bgp group Transit4
deactivate protocols bgp group Transit6
  • Reboot router:
request router reboot
  • Verify back online / no outstanding alerts
  • Enable external BGP:
rollback 1
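
The BGP deactivation step above repeats one configuration line per group. As a rough illustration only (the helper name `bgp_toggle_commands` is hypothetical and not part of the original comment), a small Python sketch can generate those lines from the four group names listed in the steps. The steps themselves re-enable BGP with `rollback 1`; the `activate` option here is only an assumed alternative.

```python
# Hypothetical helper for illustration; the group names come from the steps above.
EXTERNAL_BGP_GROUPS = ["IX4", "IX6", "Transit4", "Transit6"]


def bgp_toggle_commands(groups, action="deactivate"):
    """Return one Junos configuration-mode line per BGP group."""
    if action not in ("deactivate", "activate"):
        raise ValueError(f"unsupported action: {action}")
    return [f"{action} protocols bgp group {group}" for group in groups]


if __name__ == "__main__":
    # Print the lines to paste into configuration mode before the reboot.
    for line in bgp_toggle_commands(EXTERNAL_BGP_GROUPS):
        print(line)
```

Generating the lines from a single list keeps the IPv4 and IPv6 groups in sync, which is easy to get wrong when typing them by hand.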
Tue, Oct 16, 8:02 AM · netops, Operations
ayounsi added a comment to T206114: Create an Icinga check to alert on packet dropped.

What should be the runbook/actions when this alert goes off?

Tue, Oct 16, 7:21 AM · Discovery-Search (Current work), Patch-For-Review, monitoring, Operations

Mon, Oct 15

ayounsi added a comment to T133387: Enabling IGMP snooping on QFX switches breaks IPv6 (HTCP purges flood across codfw).

No real update in the past year. All switch stacks have been upgraded to a version that doesn't have this specific bug (14.1X53-D43.7) except asw2-d-eqiad (still on 14.1X53-D42.3, see T172459).
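
To check which stacks are still behind, the two release strings can be compared numerically. The sketch below is an assumption-laden illustration: `parse_x_release` is a hypothetical helper, it only understands the `14.1X53-D43.7` style seen in this comment, and it assumes that release ordering follows the numeric fields from left to right.

```python
import re

# Matches only the "major.minorXnn-Dnn.n" style used in the comment above.
_X_RELEASE = re.compile(r"^(\d+)\.(\d+)X(\d+)-D(\d+)\.(\d+)$")


def parse_x_release(version):
    """Split a release string such as '14.1X53-D43.7' into a tuple of ints."""
    match = _X_RELEASE.match(version)
    if match is None:
        raise ValueError(f"unrecognised release format: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_older(running, reference):
    """True if the running release sorts before the reference release."""
    return parse_x_release(running) < parse_x_release(reference)


if __name__ == "__main__":
    # asw2-d-eqiad compared with the release running on the other stacks.
    print(is_older("14.1X53-D42.3", "14.1X53-D43.7"))
```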

Mon, Oct 15, 4:07 PM · Patch-For-Review, netops, Operations
ayounsi added a comment to T206972: asw2-a-eqiad FPC7 faulty PEM0.

PEM is dead, RMA# R200206473 created.

Mon, Oct 15, 3:40 PM · netops, Operations, ops-eqiad
ayounsi closed T200838: v6 ND failure on puppetmaster1001/asw2-b-eqiad, a subtask of T183585: Rack/cable/configure asw2-b-eqiad switch stack, as Resolved.
Mon, Oct 15, 1:01 PM · netops, cloud-services-team, Cloud-VPS, Operations, ops-eqiad
ayounsi closed T200838: v6 ND failure on puppetmaster1001/asw2-b-eqiad as Resolved.

Addressed in T201039#4650390

Mon, Oct 15, 1:01 PM · Operations
ayounsi reopened T205829: IPv6 ping to eqiad on ripe-atlas-eqiad IPv6 noisy alert as "Open".

The "IPv6 ping to eqiad" alert keeps flapping; I downtimed it for 2 days and emailed RIPE.

Mon, Oct 15, 9:08 AM · Patch-For-Review, Operations, netops
ayounsi claimed T206778: Configure v6 OOB for ulsfo.
Mon, Oct 15, 6:56 AM · Patch-For-Review, Operations, netops
ayounsi closed T206778: Configure v6 OOB for ulsfo as Resolved.

All set.

Mon, Oct 15, 6:56 AM · Patch-For-Review, Operations, netops
ayounsi added a comment to T206861: Power incident in eqsin.

Confirmed that all the network devices are back in a healthy state. We also received a completion notice, so it should be safe to repool the site.

Mon, Oct 15, 6:39 AM · Wikimedia-Incident, Operations, Traffic
ayounsi triaged T206972: asw2-a-eqiad FPC7 faulty PEM0 as High priority.
Mon, Oct 15, 6:06 AM · netops, Operations, ops-eqiad

Fri, Oct 12

ayounsi updated the task description for T206861: Power incident in eqsin.
Fri, Oct 12, 2:33 PM · Wikimedia-Incident, Operations, Traffic
ayounsi triaged T206861: Power incident in eqsin as Normal priority.
Fri, Oct 12, 2:32 PM · Wikimedia-Incident, Operations, Traffic

Thu, Oct 11

ayounsi triaged T206778: Configure v6 OOB for ulsfo as Low priority.
Thu, Oct 11, 5:12 PM · Patch-For-Review, Operations, netops

Wed, Oct 10

ayounsi closed T206704: Enable access from icinga1001 to mgmt interfaces as Resolved.

Management firewall policies updated.

Wed, Oct 10, 9:58 PM · Operations, netops
ayounsi added a comment to T201139: Intermittent connectivity issues in eqiad's row C.

There are 2 parallel issues here.

Wed, Oct 10, 7:05 PM · netops, Operations
ayounsi triaged T206653: WMCS public range diffscan as Low priority.
Wed, Oct 10, 4:20 PM · Patch-For-Review, Cloud-Services

Tue, Oct 9

ayounsi closed T205829: IPv6 ping to eqiad on ripe-atlas-eqiad IPv6 noisy alert as Resolved.

That should be good enough to make the alerts useful by removing the "false positive".

Tue, Oct 9, 8:52 PM · Patch-For-Review, Operations, netops
ayounsi added a comment to T205829: IPv6 ping to eqiad on ripe-atlas-eqiad IPv6 noisy alert.

I took the 16 hosts unable to reach the eqsin anchor over v6 during the last measurement (https://atlas.ripe.net/measurements/11645088/) and ran traceroutes from them to the eqsin anchor ( https://atlas.ripe.net/measurements/16451446/#!tracemon ) as well as from bat5001 to some of them.

Tue, Oct 9, 6:38 PM · Patch-For-Review, Operations, netops

Mon, Oct 8

ayounsi added a comment to T201039: connectivity issues between several hosts on asw2-b-eqiad.

cp1081 and cp1079, both on asw2-b-eqiad, are also having IPv6 connectivity issues with lvs1001:

Mon, Oct 8, 8:10 PM · Patch-For-Review, Operations, netops
ayounsi added a comment to T201039: connectivity issues between several hosts on asw2-b-eqiad.

Temporarily disable IGMP snooping on the interfaces to narrow down the issue.

[edit protocols igmp-snooping vlan all]
+     interface ge-6/0/46.0 {
+         multicast-router-interface;
+     }
+     interface ge-4/0/14.0 {
+         multicast-router-interface;
+     }
Mon, Oct 8, 7:44 PM · Patch-For-Review, Operations, netops
ayounsi added a comment to T201039: connectivity issues between several hosts on asw2-b-eqiad.

Followed up with JTAC, we can see the NS packets making it into the fabric:

# run show firewall    
Filter: v6-ns-lvs1002-ge-6/0/46.0-i                            
Counters:
Name                                                Bytes              Packets
v6-ns-lvs1002-ge-6/0/46.0-i                           282                    3
Mon, Oct 8, 4:58 PM · Patch-For-Review, Operations, netops
ayounsi added a comment to T201039: connectivity issues between several hosts on asw2-b-eqiad.

Working with JTAC on this.

Mon, Oct 8, 4:27 PM · Patch-For-Review, Operations, netops

Fri, Oct 5

ayounsi added a comment to T201039: connectivity issues between several hosts on asw2-b-eqiad.

Opened Juniper case 2018-1005-0549 about the ND issue.

Fri, Oct 5, 7:10 PM · Patch-For-Review, Operations, netops

Thu, Oct 4

ayounsi added a comment to T205898: Netbox: explore NAPALM integration.

https://netbox.wikimedia.org/api/dcim/devices/1945/napalm/

NAPALM is not installed. Please see the documentation for instructions.

Thu, Oct 4, 8:31 PM · Patch-For-Review, Operations
ayounsi updated subscribers of T201039: connectivity issues between several hosts on asw2-b-eqiad.

Some post-maintenance notes:

  • New optics are needed to connect fpc8 to fpc2 (all spares have been used); @Cmjohnson to follow up
  • Some members went briefly (<30s) offline during the re-cabling, taking down servers connected to them:

17:02:04 fpc8
~17:05: fpc7 and what was temporarily single homed to it (fpc6/8)
17:09:08 fpc6
17:10:21 fpc6

  • IPv6 ND is not working between lvs1002 and phab1002 (which looks similar to the issues that led to the creation of that task)

lvs1002 has been depooled and asw2-b's switch port to lvs1002 has been bounced, with no success. The next step is to bounce phab1001's switch port.

Thu, Oct 4, 6:48 PM · Patch-For-Review, Operations, netops

Wed, Oct 3

ayounsi closed T201145: asw2-a-eqiad FPC5 gets disconnected every 10 minutes as Resolved.

This is now stable. Back to T187960 for the remaining steps.

Wed, Oct 3, 6:26 PM · Wikimedia-Incident, Operations, netops
ayounsi added a comment to T201039: connectivity issues between several hosts on asw2-b-eqiad.

Steps to migrate asw2-b-eqiad to a supported topology.

Wed, Oct 3, 5:47 PM · Patch-For-Review, Operations, netops
ayounsi closed T189552: Rack/cable/configure ulsfo MX204 as Resolved.
Wed, Oct 3, 4:33 PM · Patch-For-Review, Operations, ops-ulsfo, netops, Traffic
ayounsi closed T189552: Rack/cable/configure ulsfo MX204, a subtask of T199142: Increase network capacity (2018-19 Q1 Goal), as Resolved.
Wed, Oct 3, 4:33 PM · Operations, Goal, netops
ayounsi updated the task description for T205897: Netbox: fill network topology.
Wed, Oct 3, 4:21 PM · Operations
ayounsi added a comment to T201145: asw2-a-eqiad FPC5 gets disconnected every 10 minutes.

Optic replaced yesterday and confirmed no more issues.

Wed, Oct 3, 3:55 PM · Wikimedia-Incident, Operations, netops
ayounsi closed T204782: cr2-ulsfo crash as Resolved.

Yep.

Wed, Oct 3, 3:25 PM · Patch-For-Review, netops, Operations, ops-ulsfo
ayounsi added a comment to T201039: connectivity issues between several hosts on asw2-b-eqiad.

Still good. Here is the list of hosts currently on the new asw2-b-eqiad that will be impacted by the 2h maintenance window on Thursday the 4th at 16:00 UTC (worst case: 30 min of downtime for those hosts; best case: no impact). I will add the step-by-step changes needed shortly.

snapshot1008
wdqs1007
db1118
db1124
authdns1001
dbproxy1014
an-coord1001
cloudvirt1023 eth0
db1083
cloudvirt1023 eth1
ms-be-1022
db1112
kafka-jumbo1003
db1076
db1084
es1013
es1014
db1077
analytics1072
cp1079
cp1080
lvs1015:enp4s0f1
cloudelastic1002
db1099
ms-be1020
db1072
labvirt1015-eth0
labvirt1015-eth1
labnet1001:eth1
labnet1001:eth0
ms-be-1023
analytics1046 
analytics1047 
analytics1048 
analytics1049 
analytics1050 
analytics1051 
elastic1036
elastic1037
promethium
elastic1038
elastic1039
db1085
db1086
db1073
ms-be1031
db1104
logstash1005
maps1002
labnet1004 eth0
kafka1002
kubestage1002
iron
phab1001
labvirt1019:eth0
labvirt1021:eth0
lvs1004
lvs1005
lvs1006
ruthenium    
rhodium
elastic1049
prometheus1004
elastic1050
conf1005
labvirt1019:eth1
lvs1016:enp5s0f0 {#3931}
labvirt1021:eth1
ripe atlas
rdb1004
labnet1002:eth0
labnet1002:eth1
labnet1004 eth1
ms-be1032
ms-be1034
labweb1001
db1098
dbproxy1004
dbproxy1005
dbproxy1006
ms-be1016
ms-be1017
ms-be1018
mw1284
mw1285
mw1286
mw1287
mw1288
mw1289
mw1290       
thumbnor1001
thumbnor1002
mw1293
mw1294
mw1295
mw1296
mw1297
mw1298
mw1299
mw1300
mw1301
mw1302
mw1303
mw1304
mw1305
mw1306
elastic1046
elastic1047
elastic1028
kubernetes1002
aqs1008
wdqs1009
mc1024
mc1025
mc1026
mc1027
scb1002
lvs1001:eth1
lvs1002:eth1
lvs1003:eth1 
mw1313
mw1314
mw1315
mw1316
mw1317
mw1318
labcontrol1004
restbase-dev1005
cloudnet1003 eth0
ores1003
druid1005
labvirt1020 eth0
labvirt1020 eth1
analytics1073
labvirt1022:eth0
labvirt1022:eth1
ms-be1041
cloudnet1003 eth1
cp1081
cp1082
wtp1031
wtp1032
wtp1033
analytics1061 
analytics1062 
analytics1063 
wtp1034
wtp1035
ores1004
mwmaint1002  
cloudservices1003
db1113
db1119
notebook1003
graphite1004
dbproxy1015
an-master1002
cloudvirt1024 eth0
rdb1009
cloudvirt1024 eth1
Wed, Oct 3, 3:07 PM · Patch-For-Review, Operations, netops

Tue, Oct 2

ayounsi closed T204271: Grow frack-administration-codfw to /28 as Resolved.

An oversight prevented frbast2001 from reaching eqiad:
codfw only advertised 10.195.0.0/25 to eqiad over IPsec.
Making it a /24 fixed the issue.
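
The effect of the prefix length can be illustrated with Python's standard `ipaddress` module. The host address below is hypothetical (frbast2001's real address is not given here); it is simply picked from the upper half of the /24 to show an address that the /25 advertisement would not cover.

```python
import ipaddress

advertised_before = ipaddress.ip_network("10.195.0.0/25")
advertised_after = ipaddress.ip_network("10.195.0.0/24")

# Hypothetical address in the upper half of the /24, for illustration only.
example_host = ipaddress.ip_address("10.195.0.200")


def is_covered(address, network):
    """True if the advertised network contains the address."""
    return address in network


if __name__ == "__main__":
    print(is_covered(example_host, advertised_before))
    print(is_covered(example_host, advertised_after))
```

A /25 only covers the first 128 addresses of the range, so any host numbered above 10.195.0.127 has no matching route on the far side until the wider prefix is advertised.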

Tue, Oct 2, 9:43 PM · Patch-For-Review, Operations, fundraising-tech-ops, netops
ayounsi updated the task description for T204271: Grow frack-administration-codfw to /28.
Tue, Oct 2, 9:41 PM · Patch-For-Review, Operations, fundraising-tech-ops, netops
ayounsi updated the task description for T204271: Grow frack-administration-codfw to /28.
Tue, Oct 2, 6:48 PM · Patch-For-Review, Operations, fundraising-tech-ops, netops
ayounsi updated the task description for T204271: Grow frack-administration-codfw to /28.
Tue, Oct 2, 5:40 PM · Patch-For-Review, Operations, fundraising-tech-ops, netops
ayounsi updated the task description for T204271: Grow frack-administration-codfw to /28.
Tue, Oct 2, 5:26 PM · Patch-For-Review, Operations, fundraising-tech-ops, netops
aborrero awarded T205897: Netbox: fill network topology a Love token.
Tue, Oct 2, 5:05 PM · Operations
ayounsi updated the task description for T204271: Grow frack-administration-codfw to /28.
Tue, Oct 2, 4:49 PM · Patch-For-Review, Operations, fundraising-tech-ops, netops
ayounsi updated the task description for T205985: Renumber office-DC interconnect link.
Tue, Oct 2, 4:22 PM · Patch-For-Review, Operations, netops
ayounsi triaged T205985: Renumber office-DC interconnect link as High priority.
Tue, Oct 2, 4:15 PM · Patch-For-Review, Operations, netops

Mon, Oct 1

ayounsi triaged T205937: Interface errors on cr4-ulsfo:et-0/0/1 as Normal priority.
Mon, Oct 1, 11:11 PM · Operations, netops, ops-ulsfo
ayounsi added a comment to T201145: asw2-a-eqiad FPC5 gets disconnected every 10 minutes.

The logs mentioned during the meeting seem to be the link between a2 and a8 flapping (possibly faulty optic) and VC members re-calculating paths around the failure:

Oct  1 19:33:07  asw2-a-eqiad rpd[2040]: EVENT <UpDown> vcp-255/0/52 index 132 <Broadcast Multicast>
Oct  1 19:33:07  asw2-a-eqiad vccpd[1868]: interface vcp-255/0/52 went down
Oct  1 19:33:07  asw2-a-eqiad vccpd[1868]: Member 2, interface vcp-255/0/52.32768 went down
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfex_vc_get_rem_memb_nh:175 no valid port in vc_trunk 2
Oct  1 19:33:07  asw2-a-eqiad fpc2 no valid port in vc_trunk 2
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfex_vc_get_rem_memb_nh:175 no valid port in vc_trunk 2
Oct  1 19:33:07  asw2-a-eqiad fpc2 no valid port in vc_trunk 2
Oct  1 19:33:07  asw2-a-eqiad vccpd[1868]: JTASK_SIGNAL_UNKNOWN: Ignoring unknown signal SIGVTALRM (26)
Oct  1 19:33:07  asw2-a-eqiad fpc7 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 1 selfId 7 devrt_raw 0x20 0x2000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:07  asw2-a-eqiad fpc4 pfe_bcm_release_trunk:902 Releasing trunk 2 ref count (5)
Oct  1 19:33:07  asw2-a-eqiad fpc4 pfe_bcm_vchassis_stk_modport_trunk_set:1665 dev 0 ingress_pbm 0 1ff ffffffff ffffffff dest_mod: 8 bcm_trunk_id: 1024
Oct  1 19:33:07  asw2-a-eqiad fpc7 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 2 selfId 7 devrt_raw 0x20 0x2000000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:07  asw2-a-eqiad fpc7 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 3 selfId 7 devrt_raw 0x20 0x20000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:07  asw2-a-eqiad fpc7 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 4 selfId 7 devrt_raw 0x20 0x200000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:07  asw2-a-eqiad fpc7 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 5 selfId 7 devrt_raw 0x20 0x2000000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:07  asw2-a-eqiad fpc7 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 6 selfId 7 devrt_raw 0x22 0x0 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:07  asw2-a-eqiad fpc7 pfe_bcm_vchassis_multi_path_irt_process:3270 processing MC: map = 3f, flag=0, num_node = 8, node_map = 1fe
Oct  1 19:33:07  asw2-a-eqiad fpc7 pfe_bcm_vchassis_multi_path_irt_process:3277 tree process time: UC = 744, MC = 6390, total = 7134 us
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 1 selfId 2 devrt_raw 0x20 0x2222000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 2 selfId 2 devrt_raw 0x20 0x2222000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 3 selfId 2 devrt_raw 0x20 0x2222000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 4 selfId 2 devrt_raw 0x20 0x2222000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 5 selfId 2 devrt_raw 0x20 0x2222000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 6 selfId 2 devrt_raw 0x20 0x2222000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfe_bcm_release_trunk:902 Releasing trunk 6 ref count (1)
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfe_bcm_delete_trunk:872 Deleting trunk 6 of type 2
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfe_bcm_vchassis_stk_modport_trunk_set:1665 dev 0 ingress_pbm 0 1ff ffffffff ffffffff dest_mod: 7 bcm_trunk_id: 1030
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfe_bcm_release_trunk:902 Releasing trunk 2 ref count (1)
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfe_bcm_delete_trunk:872 Deleting trunk 2 of type 1
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfe_bcm_vchassis_stk_modport_trunk_set:1665 dev 0 ingress_pbm 0 1ff ffffffff ffffffff dest_mod: 8 bcm_trunk_id: 1030
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 8 selfId 2 devrt_raw 0x20 0x2222000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfe_bcm_vchassis_block_unused_hg_ports:485 unused hg blocked = 0, used hg = 2222000000000000
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfe_bcm_vchassis_multi_path_irt_process:3270 processing MC: map = bf, flag=0, num_node = 8, node_map = 1fe
Oct  1 19:33:07  asw2-a-eqiad fpc2 pfe_bcm_vchassis_multi_path_irt_process:3277 tree process time: UC = 19024, MC = 1668, total = 20692 us
Oct  1 19:33:08  asw2-a-eqiad fpc2 UPDN msg to kernel for ifd:vcp-255/0/52, flag:1, speed: 40000000000, duplex:2
Oct  1 19:33:08  asw2-a-eqiad mcsnoopd[2065]: EVENT <UpDown> vcp-255/0/52.32768 index 68 <Up Broadcast Multicast>
Oct  1 19:33:08  asw2-a-eqiad vccpd[1868]: VCCPD_PROTOCOL_ADJUP: New adjacency to c042.d045.7ac0 on vcp-255/0/52.32768
Oct  1 19:33:08  asw2-a-eqiad mcsnoopd[2065]: EVENT <UpDown> vcp-255/0/52 index 132 <Up Broadcast Multicast>
Oct  1 19:33:08  asw2-a-eqiad rpd[2040]: EVENT <UpDown> vcp-255/0/52.32768 index 68 <Up Broadcast Multicast>
Oct  1 19:33:08  asw2-a-eqiad rpd[2040]: EVENT <UpDown> vcp-255/0/52 index 132 <Up Broadcast Multicast>
Oct  1 19:33:08  asw2-a-eqiad vccpd[1868]: interface vcp-255/0/52 came up
Oct  1 19:33:08  asw2-a-eqiad vccpd[1868]: Member 2, interface vcp-255/0/52.32768 came up
Oct  1 19:33:08  asw2-a-eqiad fpc7 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 1 selfId 7 devrt_raw 0x0 0x2000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:08  asw2-a-eqiad vccpd[1868]: JTASK_SIGNAL_UNKNOWN: Ignoring unknown signal SIGVTALRM (26)
Oct  1 19:33:08  asw2-a-eqiad fpc7 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 2 selfId 7 devrt_raw 0x0 0x2000000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:08  asw2-a-eqiad fpc7 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 3 selfId 7 devrt_raw 0x0 0x20000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:08  asw2-a-eqiad fpc7 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 4 selfId 7 devrt_raw 0x0 0x200000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:08  asw2-a-eqiad fpc7 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 5 selfId 7 devrt_raw 0x0 0x2000000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:08  asw2-a-eqiad fpc7 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 6 selfId 7 devrt_raw 0x2 0x0 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:08  asw2-a-eqiad fpc7 pfe_bcm_vchassis_multi_path_irt_process:3270 processing MC: map = 3f, flag=0, num_node = 8, node_map = 1fe
Oct  1 19:33:08  asw2-a-eqiad fpc7 pfe_bcm_vchassis_multi_path_irt_process:3277 tree process time: UC = 770, MC = 1702, total = 2472 us
Oct  1 19:33:08  asw2-a-eqiad fpc2 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 1 selfId 2 devrt_raw 0x22 0x2222000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:08  asw2-a-eqiad fpc2 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 2 selfId 2 devrt_raw 0x22 0x2222000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:08  asw2-a-eqiad fpc2 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 3 selfId 2 devrt_raw 0x22 0x2222000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:08  asw2-a-eqiad fpc2 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 4 selfId 2 devrt_raw 0x22 0x2222000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:08  asw2-a-eqiad fpc2 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 5 selfId 2 devrt_raw 0x22 0x2222000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:08  asw2-a-eqiad fpc2 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 6 selfId 2 devrt_raw 0x22 0x2222000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:08  asw2-a-eqiad fpc2 pfe_bcm_release_trunk:902 Releasing trunk 6 ref count (2)
Oct  1 19:33:08  asw2-a-eqiad fpc2 pfe_bcm_vchassis_stk_modport_trunk_set:1665 dev 0 ingress_pbm 0 1ff ffffffff ffffffff dest_mod: 7 bcm_trunk_id: 1026
Oct  1 19:33:08  asw2-a-eqiad fpc2 pfe_bcm_vchassis_block_unused_hg_ports:485 unused hg blocked = 0, used hg = 2222000000000000
Oct  1 19:33:08  asw2-a-eqiad fpc2 pfe_bcm_release_trunk:902 Releasing trunk 6 ref count (1)
Oct  1 19:33:08  asw2-a-eqiad fpc2 pfe_bcm_delete_trunk:872 Deleting trunk 6 of type 2
Oct  1 19:33:08  asw2-a-eqiad fpc2 pfe_bcm_vchassis_stk_modport_trunk_set:1665 dev 0 ingress_pbm 0 1ff ffffffff ffffffff dest_mod: 8 bcm_trunk_id: 1030
Oct  1 19:33:08  asw2-a-eqiad fpc2 pfe_bcm_vchassis_port_modid_egress_set:565 egress set: vc_tree_mode 1, memb 8 selfId 2 devrt_raw 0x22 0x2222000000000000 vcp_mask 0x22, 0x2222000000000000
Oct  1 19:33:08  asw2-a-eqiad fpc2 pfe_bcm_vchassis_multi_path_irt_process:3270 processing MC: map = bf, flag=0, num_node = 8, node_map = 1fe
Oct  1 19:33:08  asw2-a-eqiad fpc2 pfe_bcm_vchassis_multi_path_irt_process:3277 tree process time: UC = 19669, MC = 1728, total = 21397 us
Oct  1 19:33:08  asw2-a-eqiad fpc4 pfe_bcm_release_trunk:902 Releasing trunk 0 ref count (2)
Oct  1 19:33:08  asw2-a-eqiad fpc4 pfe_bcm_vchassis_stk_modport_trunk_set:1665 dev 0 ingress_pbm 0 1ff ffffffff ffffffff dest_mod: 8 bcm_trunk_id: 1026

@Cmjohnson, can you replace the optic on asw2-a2-eqiad:et-0/0/52, then on asw2-a8-eqiad:et-0/1/1 if still ongoing?
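
For reference, the length of each flap can be pulled out of log lines like the ones above with a short script. This is only a sketch: the function name `flap_durations` is hypothetical, the regular expression targets just the two `vccpd` "went down" / "came up" messages in this excerpt, and the year is an assumption because the syslog timestamps do not carry one.

```python
import re
from datetime import datetime

# Matches only the vccpd "interface ... went down / came up" lines from the excerpt.
_VCCPD_EVENT = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+vccpd\[\d+\]: "
    r"interface (?P<port>\S+) (?P<state>went down|came up)$"
)


def flap_durations(lines, year=2018):
    """Return (port, seconds_down) for each down/up pair found in the lines."""
    down_since = {}
    flaps = []
    for line in lines:
        match = _VCCPD_EVENT.match(line.strip())
        if match is None:
            continue
        # Collapse the padded day field ("Oct  1") before parsing.
        stamp = " ".join(match["stamp"].split())
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        port = match["port"]
        if match["state"] == "went down":
            down_since[port] = when
        elif port in down_since:
            seconds = (when - down_since.pop(port)).total_seconds()
            flaps.append((port, seconds))
    return flaps


SAMPLE = [
    "Oct  1 19:33:07  asw2-a-eqiad vccpd[1868]: interface vcp-255/0/52 went down",
    "Oct  1 19:33:08  asw2-a-eqiad vccpd[1868]: interface vcp-255/0/52 came up",
]

if __name__ == "__main__":
    print(flap_durations(SAMPLE))
```

Syslog timestamps here have one-second resolution, so the computed duration is only accurate to the nearest second.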

Mon, Oct 1, 9:17 PM · Wikimedia-Incident, Operations, netops
ayounsi claimed T205898: Netbox: explore NAPALM integration.
Mon, Oct 1, 9:02 PM · Patch-For-Review, Operations
ayounsi claimed T205897: Netbox: fill network topology.
Mon, Oct 1, 9:02 PM · Operations

Thu, Sep 27

ayounsi closed T205513: Enable cumin1001 in router ACLs as Resolved.

All filters updated, let me know if any issues.

Thu, Sep 27, 7:33 PM · Operations, netops
ayounsi added a comment to T201139: Intermittent connectivity issues in eqiad's row C.

Thanks for the update, note that es1014 is in row B (issues tracked in T201039)
Rough timeline is to get row B fixed next week, no ETA yet for row C (but I'm not aware of ongoing issues with row C).

Thu, Sep 27, 6:24 PM · netops, Operations
ayounsi updated the task description for T189552: Rack/cable/configure ulsfo MX204.
Thu, Sep 27, 12:24 AM · Patch-For-Review, Operations, ops-ulsfo, netops, Traffic

Wed, Sep 26

ayounsi updated the task description for T189552: Rack/cable/configure ulsfo MX204.
Wed, Sep 26, 11:28 PM · Patch-For-Review, Operations, ops-ulsfo, netops, Traffic
ayounsi updated the task description for T189552: Rack/cable/configure ulsfo MX204.
Wed, Sep 26, 10:01 PM · Patch-For-Review, Operations, ops-ulsfo, netops, Traffic

Mon, Sep 24

ayounsi closed T205340: analytics1-a VLAN has no DNS for gateway addresses to match other analytics VLANs as Resolved.

Fixed:

ayounsi@bast1002:~$ host 10.64.5.1
1.5.64.10.in-addr.arpa domain name pointer vrrp-gw-1030.eqiad.wmnet.
ayounsi@bast1002:~$ host 10.64.5.2
2.5.64.10.in-addr.arpa domain name pointer ae1-1030.cr1-eqiad.wikimedia.org.
ayounsi@bast1002:~$ host 10.64.5.3
3.5.64.10.in-addr.arpa domain name pointer ae1-1030.cr2-eqiad.wikimedia.org.
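
The `in-addr.arpa` names in the output above are the gateway addresses with their octets reversed. A minimal sketch using Python's standard `ipaddress` module (the helper name `ptr_name` is hypothetical) builds the same names without querying DNS:

```python
import ipaddress


def ptr_name(address):
    """Return the reverse-DNS (PTR) record name for an IP address."""
    return ipaddress.ip_address(address).reverse_pointer


if __name__ == "__main__":
    for addr in ("10.64.5.1", "10.64.5.2", "10.64.5.3"):
        print(addr, ptr_name(addr))
```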
Mon, Sep 24, 11:28 PM · Patch-For-Review, Operations, netops
ayounsi claimed T205340: analytics1-a VLAN has no DNS for gateway addresses to match other analytics VLANs.
Mon, Sep 24, 11:20 PM · Patch-For-Review, Operations, netops
ayounsi updated the task description for T189552: Rack/cable/configure ulsfo MX204.
Mon, Sep 24, 10:55 PM · Patch-For-Review, Operations, ops-ulsfo, netops, Traffic

Thu, Sep 20

ayounsi updated the task description for T204170: Rack/setup cr2-eqord.
Thu, Sep 20, 5:14 PM · netops, Operations, ops-eqiad

Wed, Sep 19

ayounsi updated the task description for T189552: Rack/cable/configure ulsfo MX204.
Wed, Sep 19, 8:08 PM · Patch-For-Review, Operations, ops-ulsfo, netops, Traffic
ayounsi updated the task description for T189552: Rack/cable/configure ulsfo MX204.
Wed, Sep 19, 8:06 PM · Patch-For-Review, Operations, ops-ulsfo, netops, Traffic
ayounsi updated the task description for T189552: Rack/cable/configure ulsfo MX204.
Wed, Sep 19, 5:53 PM · Patch-For-Review, Operations, ops-ulsfo, netops, Traffic
ayounsi updated the task description for T189552: Rack/cable/configure ulsfo MX204.
Wed, Sep 19, 4:53 PM · Patch-For-Review, Operations, ops-ulsfo, netops, Traffic

Tue, Sep 18

ayounsi updated the task description for T204782: cr2-ulsfo crash.
Tue, Sep 18, 11:33 PM · Patch-For-Review, netops, Operations, ops-ulsfo
ayounsi triaged T204782: cr2-ulsfo crash as High priority.
Tue, Sep 18, 10:24 PM · Patch-For-Review, netops, Operations, ops-ulsfo
ayounsi triaged T204743: Ensure scs-c1-eqiad:eth1 is not connected as Low priority.
Tue, Sep 18, 5:57 PM · netops, Operations, ops-eqiad
ayounsi closed T204730: Enable cumin2001 in router ACLs as Resolved.

Done!

Tue, Sep 18, 4:23 PM · Operations, netops

Mon, Sep 17

ayounsi updated the task description for T204170: Rack/setup cr2-eqord.
Mon, Sep 17, 8:30 PM · netops, Operations, ops-eqiad
ayounsi updated the task description for T204170: Rack/setup cr2-eqord.
Mon, Sep 17, 7:40 PM · netops, Operations, ops-eqiad
ayounsi reassigned T204377: LDAP Acess request for Margeigh Novotny from ayounsi to Nuria.

Back to Nuria for a wikitech username. Once I have one I can add it to the wmf group.

Mon, Sep 17, 6:52 PM · Patch-For-Review, LDAP-Access-Requests
ayounsi closed T204382: wmf group access for SBassett as Resolved.

You should be good to go, please reopen if not.

Mon, Sep 17, 6:46 PM · Patch-For-Review, Security-Team, LDAP-Access-Requests
ayounsi claimed T204377: LDAP Acess request for Margeigh Novotny.
Mon, Sep 17, 3:28 PM · Patch-For-Review, LDAP-Access-Requests
ayounsi claimed T204382: wmf group access for SBassett.
Mon, Sep 17, 3:28 PM · Patch-For-Review, Security-Team, LDAP-Access-Requests

Sep 14 2018

ayounsi added a comment to T202486: Requesting access to restricted production access and analytics-privatedata-users for Kalliope Tsouroupidou.

Update, @Kalliope let us know if you're all set.

Sep 14 2018, 5:57 PM · Patch-For-Review, Operations, SRE-Access-Requests

Sep 13 2018

ayounsi triaged T204281: Stop prioritizing peering over transit as Normal priority.
Sep 13 2018, 9:32 PM · Performance-Team (Radar), netops, Operations
ayounsi triaged T204271: Grow frack-administration-codfw to /28 as Normal priority.
Sep 13 2018, 6:27 PM · Patch-For-Review, Operations, fundraising-tech-ops, netops
ayounsi closed T203719: Interface errors on cr2-eqiad:xe-4/0/0 as Resolved.

A lot better, thanks!

Sep 13 2018, 5:44 PM · Operations, netops, ops-eqiad
ayounsi closed T203847: Requesting access to researchers for kharlan as Resolved.

notebook1003:~$ id kharlan
uid=19582(kharlan) gid=500(wikidev) groups=500(wikidev),714(researchers)
Sep 13 2018, 5:21 PM · Patch-For-Review, Operations, SRE-Access-Requests
ayounsi added a comment to T204079: move/setup/install frauth2001.frack.codfw.wmnet.

10.195.0.73 is the router IP, it's missing from DNS, I'll add it (and the other ones).
And indeed, it can't be extended to a /28.

Sep 13 2018, 12:20 AM · Patch-For-Review, ops-codfw, fundraising-tech-ops, Operations

Sep 12 2018

ayounsi updated the task description for T203847: Requesting access to researchers for kharlan.
Sep 12 2018, 11:58 PM · Patch-For-Review, Operations, SRE-Access-Requests
ayounsi moved T203847: Requesting access to researchers for kharlan from Manager/NDA Approval/Confimation to 3 Business Day Wait on the SRE-Access-Requests board.
Sep 12 2018, 11:56 PM · Patch-For-Review, Operations, SRE-Access-Requests
ayounsi updated the task description for T203847: Requesting access to researchers for kharlan.
Sep 12 2018, 11:53 PM · Patch-For-Review, Operations, SRE-Access-Requests
ayounsi claimed T203847: Requesting access to researchers for kharlan.

@kostajh could you please sign https://phabricator.wikimedia.org/L3 ?
Edit: never mind, I see that you signed it.
The next step is the 3 business day wait period, so I will merge the patch tomorrow if there are no complaints.

Sep 12 2018, 11:51 PM · Patch-For-Review, Operations, SRE-Access-Requests
ayounsi triaged T204170: Rack/setup cr2-eqord as Normal priority.
Sep 12 2018, 10:24 PM · netops, ops-eqiad, Operations

Sep 11 2018

ayounsi updated the task description for T204079: move/setup/install frauth2001.frack.codfw.wmnet.
Sep 11 2018, 6:18 PM · Patch-For-Review, ops-codfw, fundraising-tech-ops, Operations
ayounsi updated the task description for T204079: move/setup/install frauth2001.frack.codfw.wmnet.
Sep 11 2018, 6:12 PM · Patch-For-Review, ops-codfw, fundraising-tech-ops, Operations
ayounsi awarded T197873: how to structure wiki pages for Icinga reaction play books a Like token.
Sep 11 2018, 4:53 PM · Patch-For-Review, monitoring, Operations

Sep 10 2018

ayounsi added a comment to T201139: Intermittent connectivity issues in eqiad's row C.

I looked at it some time ago: the spike of DDOS_PROTOCOL_VIOLATION matches spikes of broadcast/multicast traffic we observed on asw2-a.

Sep 10 2018, 8:45 PM · netops, Operations
ayounsi reassigned T202700: unrack/decom cr1-eqdfw from ayounsi to Papaul.
Sep 10 2018, 6:00 PM · ops-eqdfw, Operations
ayounsi added a comment to T202636: Allow routing between eqiad and eqiad1 regions.

(Please, confirm v1102 is [labs|cloud]-instances1-b-eqiad with addressing 10.68.16.0/21).

Sep 10 2018, 5:40 PM · Patch-For-Review, Cloud-VPS, cloud-services-team

Sep 7 2018

ayounsi added a comment to T201145: asw2-a-eqiad FPC5 gets disconnected every 10 minutes.

Cabling was done out of order, but the end result is there (minus the 7m DAC).

Sep 7 2018, 7:37 PM · Wikimedia-Incident, Operations, netops
ayounsi claimed T190424: modify labs-hosts1-vlans for http load of installer kernel.
Sep 7 2018, 4:45 PM · Patch-For-Review, cloud-services-team, Operations, netops
ayounsi closed T190424: modify labs-hosts1-vlans for http load of installer kernel as Resolved.

Thanks, this has been useful, especially running a packet capture on the working vs. non-working flows.

Sep 7 2018, 4:45 PM · Patch-For-Review, cloud-services-team, Operations, netops
ayounsi added a comment to T201097: Add virtual chassis port status alerting.

Putting the script here until I send a Gerrit CR.
It uses snimpy, and the required MIBs can be obtained from https://apps.juniper.net/mib-explorer/index.jsp

Sep 7 2018, 4:06 PM · Patch-For-Review, monitoring, Operations, netops