Telia ulsfo-eqord transport link down
Closed, Resolved, Public

Description

https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=cr2-eqord&service=Router+interfaces

CRITICAL: host '208.80.154.198', interfaces up: 44, down: 1, dormant: 0, excluded: 0, unused: 0:
xe-0/1/3: down -> Transport: cr3-ulsfo:xe-0/1/1 (Telia,xxx, 51ms 10Gbps wave) {#11372};
https://wikitech.wikimedia.org/wiki/Network_monitoring#Router_interface_down

Not a planned maintenance.

Event Timeline

ayounsi triaged this task as High priority. Aug 11 2022, 7:57 AM
ayounsi created this task.
Restricted Application added a subscriber: Aklapper.
Automated diagnostic for Netbox interface ID cr3-ulsfo:xe-0/1/1

Interface cr3-ulsfo:xe-0/1/1

  • admin-status: up
  • ⚠️ oper-status: down
  • interface-flapped: 2022-08-11 06:49:13 UTC (01:08:37 ago)
  • ⚠️ errors: {'carrier-transitions': 372}
  • laser-output-power: 0.7030
  • laser-output-power-dbm: -1.53
  • rx-signal-avg-optical-power: 0.0001
  • ⚠️ rx-signal-avg-optical-power-dbm: -40.00
Logs for cr3-ulsfo.wikimedia.org:xe-0/1/1
Aug 11 06:49:13  cr3-ulsfo rpd[16292]: krt unsolic client: Received IPv6 address fe80::f24b:3aff:feef:7e44 on ifl xe-0/1/1.0. Flag:2.
Aug 11 06:49:13  cr3-ulsfo rpd[16292]: STP handler: IFD=xe-0/1/1, op=change, state=Discarding, Topo change generation=0
Aug 11 06:49:13  cr3-ulsfo rpd[16292]: RPD_OSPF_NBRDOWN: OSPF neighbor 198.35.26.209 (realm ospf-v2 xe-0/1/1.0 area 0.0.0.0) state changed from Full to Down due to KillNbr (event reason: interface went down)
Aug 11 06:49:13  cr3-ulsfo rpd[16292]: RPD_OSPF_NBRDOWN: OSPF neighbor fe80::ee38:73ff:fee7:bc66 (realm ipv6-unicast xe-0/1/1.0 area 0.0.0.0) state changed from Full to Down due to KillNbr (event reason: interface went down)
Aug 11 06:49:13  cr3-ulsfo bfdd[15985]: BFDD_TRAP_SHOP_STATE_DOWN: local discriminator: 246, new state: down, interface: xe-0/1/1.0, peer addr: 198.35.26.209
Aug 11 06:49:13  cr3-ulsfo bfdd[15985]: BFDD_TRAP_SHOP_STATE_DOWN: local discriminator: 247, new state: down, interface: xe-0/1/1.0, peer addr: fe80::ee38:73ff:fee7:bc66
Aug 11 06:49:13  cr3-ulsfo rpd[16292]: STP handler: IFD=xe-0/1/1, op=change, state=Discarding, Topo change generation=0
Aug 11 06:49:13  cr3-ulsfo mib2d[15968]: SNMP_TRAP_LINK_DOWN: ifIndex 535, ifAdminStatus up(1), ifOperStatus down(2), ifName xe-0/1/1
Aug 11 06:51:10  cr3-ulsfo l2cpd[16315]: LLDP_NEIGHBOR_DOWN: A neighbor of interface xe-0/1/1 has gone down. Now, this interface has 0 neighbor/s.

Interface cr2-eqord:xe-0/1/3

  • admin-status: up
  • ⚠️ oper-status: down
  • interface-flapped: 2022-08-11 06:49:13 UTC (01:08:44 ago)
  • ⚠️ errors: {'carrier-transitions': 110}
  • laser-output-power: 0.7090
  • laser-output-power-dbm: -1.49
  • rx-signal-avg-optical-power: 0.3745
  • rx-signal-avg-optical-power-dbm: -4.27
Logs for cr2-eqord.wikimedia.org:xe-0/1/3
Aug 11 06:49:13  cr2-eqord rpd[13269]: STP handler: IFD=xe-0/1/3, op=change, state=Discarding, Topo change generation=0
Aug 11 06:49:13  cr2-eqord rpd[13269]: RPD_OSPF_NBRDOWN: OSPF neighbor 198.35.26.208 (realm ospf-v2 xe-0/1/3.0 area 0.0.0.0) state changed from Full to Down due to KillNbr (event reason: interface went down)
Aug 11 06:49:13  cr2-eqord rpd[13269]: RPD_OSPF_NBRDOWN: OSPF neighbor fe80::f24b:3aff:feef:7e44 (realm ipv6-unicast xe-0/1/3.0 area 0.0.0.0) state changed from Full to Down due to KillNbr (event reason: interface went down)
Aug 11 06:49:13  cr2-eqord bfdd[13177]: BFDD_TRAP_SHOP_STATE_DOWN: local discriminator: 125, new state: down, interface: xe-0/1/3.0, peer addr: 198.35.26.208
Aug 11 06:49:13  cr2-eqord bfdd[13177]: BFDD_TRAP_SHOP_STATE_DOWN: local discriminator: 126, new state: down, interface: xe-0/1/3.0, peer addr: fe80::f24b:3aff:feef:7e44
Aug 11 06:49:13  cr2-eqord mib2d[13166]: SNMP_TRAP_LINK_DOWN: ifIndex 536, ifAdminStatus up(1), ifOperStatus down(2), ifName xe-0/1/3
Aug 11 06:49:13  cr2-eqord rpd[13269]: STP handler: IFD=xe-0/1/3, op=change, state=Discarding, Topo change generation=0
Aug 11 06:51:01  cr2-eqord l2cpd[13292]: LLDP_NEIGHBOR_DOWN: A neighbor of interface xe-0/1/3 has gone down. Now, this interface has 0 neighbor/s.
Aug 11 07:42:36  cr2-eqord fpc0 smic_phy_dfe_tuning_state: xe-0/1/3 - DFE coarse/fine tuning completed (took 5007 ms); enabling DFE adaptive tuning.
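
The diagnostic above reports each optical power twice, once as a raw value and once in dBm. The sketch below shows how the two relate, assuming the unsuffixed values are in milliwatts (the numbers are consistent with that). The function name `mw_to_dbm` is hypothetical and not part of the diagnostic tooling.

```python
import math


def mw_to_dbm(power_mw: float) -> float:
    """Convert optical power in milliwatts to dBm (decibels relative to 1 mW)."""
    if power_mw <= 0:
        raise ValueError("power must be positive to take a logarithm")
    return 10 * math.log10(power_mw)


# Readings copied from the diagnostic output above (assumed to be in mW).
readings = [
    ("cr3-ulsfo xe-0/1/1 tx", 0.7030),
    ("cr3-ulsfo xe-0/1/1 rx", 0.0001),
    ("cr2-eqord xe-0/1/3 rx", 0.3745),
]

for label, mw in readings:
    print(f"{label}: {mw_to_dbm(mw):.2f} dBm")
```

The rx reading of 0.0001 on cr3-ulsfo works out to -40 dBm, while cr2-eqord was still receiving about -4.27 dBm at the time of the diagnostic.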
Automated diagnostic for Netbox interface ID cr3-ulsfo:xe-0/1/1

Interface cr3-ulsfo:xe-0/1/1

  • admin-status: up
  • oper-status: up
  • interface-flapped: 2022-08-11 07:58:25 UTC (00:07:45 ago)
  • ⚠️ errors: {'carrier-transitions': 1}
  • laser-output-power: 0.7030
  • laser-output-power-dbm: -1.53
  • rx-signal-avg-optical-power: 0.3805
  • rx-signal-avg-optical-power-dbm: -4.20
Logs for cr3-ulsfo.wikimedia.org:xe-0/1/1
Aug 11 07:58:25  cr3-ulsfo rpd[16292]: RPD_OSPF_NBRUP: OSPF neighbor 198.35.26.209 (realm ospf-v2 xe-0/1/1.0 area 0.0.0.0) state changed from Init to ExStart due to 2WayRcvd (event reason: neighbor detected this router)
Aug 11 07:58:25  cr3-ulsfo rpd[16292]: RPD_OSPF_NBRUP: OSPF neighbor 198.35.26.209 (realm ospf-v2 xe-0/1/1.0 area 0.0.0.0) state changed from Exchange to Full due to ExchangeDone (event reason: DBD exchange of slave completed)
Aug 11 07:58:34  cr3-ulsfo bfdd[15985]: BFDD_TRAP_SHOP_STATE_UP: local discriminator: 248, new state: up, interface: xe-0/1/1.0, peer addr: 198.35.26.209
Aug 11 07:58:36  cr3-ulsfo rpd[16292]: krt unsolic client: Received IPv6 address 2620:0:863:fe02::1 on ifl xe-0/1/1.0. Flag:0.
Aug 11 07:58:36  cr3-ulsfo rpd[16292]: krt unsolic client: Received IPv6 address fe80::f24b:3aff:feef:7e44 on ifl xe-0/1/1.0. Flag:0.
Aug 11 07:58:36  cr3-ulsfo rpd[16292]: RPD_OSPF_NBRUP: OSPF neighbor fe80::ee38:73ff:fee7:bc66 (realm ipv6-unicast xe-0/1/1.0 area 0.0.0.0) state changed from Init to ExStart due to 2WayRcvd (event reason: neighbor detected this router)
Aug 11 07:58:41  cr3-ulsfo rpd[16292]: RPD_OSPF_NBRUP: OSPF neighbor fe80::ee38:73ff:fee7:bc66 (realm ipv6-unicast xe-0/1/1.0 area 0.0.0.0) state changed from Loading to Full due to LoadDone (event reason: OSPF loading completed)
Aug 11 07:58:43  cr3-ulsfo bfdd[15985]: BFDD_TRAP_SHOP_STATE_UP: local discriminator: 249, new state: up, interface: xe-0/1/1.0, peer addr: fe80::ee38:73ff:fee7:bc66
Aug 11 07:58:53  cr3-ulsfo l2cpd[16315]: LLDP_NEIGHBOR_UP: A neighbor has come up for interface xe-0/1/1. Now, this interface has 1 neighbor/s .

Interface cr2-eqord:xe-0/1/3

  • admin-status: up
  • oper-status: up
  • interface-flapped: 2022-08-11 07:58:22 UTC (00:07:54 ago)
  • ⚠️ errors: {'carrier-transitions': 3}
  • laser-output-power: 0.7120
  • laser-output-power-dbm: -1.48
  • rx-signal-avg-optical-power: 0.3652
  • rx-signal-avg-optical-power-dbm: -4.37
Logs for cr2-eqord.wikimedia.org:xe-0/1/3
Aug 11 07:58:25  cr2-eqord l2cpd[13292]: LLDP_NEIGHBOR_UP: A neighbor has come up for interface xe-0/1/3. Now, this interface has 1 neighbor/s .
Aug 11 07:58:25  cr2-eqord rpd[13269]: RPD_OSPF_NBRUP: OSPF neighbor 198.35.26.208 (realm ospf-v2 xe-0/1/3.0 area 0.0.0.0) state changed from Init to ExStart due to 2WayRcvd (event reason: neighbor detected this router)
Aug 11 07:58:25  cr2-eqord rpd[13269]: RPD_OSPF_NBRUP: OSPF neighbor 198.35.26.208 (realm ospf-v2 xe-0/1/3.0 area 0.0.0.0) state changed from Exchange to Full due to ExchangeDone (event reason: DBD exchange of master completed)
Aug 11 07:58:32  cr2-eqord bfdd[13177]: BFDD_TRAP_SHOP_STATE_UP: local discriminator: 127, new state: up, interface: xe-0/1/3.0, peer addr: 198.35.26.208
Aug 11 07:58:33  cr2-eqord rpd[13269]: krt unsolic client: Received IPv6 address 2620:0:863:fe02::2 on ifl xe-0/1/3.0. Flag:0.
Aug 11 07:58:33  cr2-eqord rpd[13269]: krt unsolic client: Received IPv6 address fe80::ee38:73ff:fee7:bc66 on ifl xe-0/1/3.0. Flag:0.
Aug 11 07:58:36  cr2-eqord rpd[13269]: RPD_OSPF_NBRUP: OSPF neighbor fe80::f24b:3aff:feef:7e44 (realm ipv6-unicast xe-0/1/3.0 area 0.0.0.0) state changed from Init to ExStart due to 2WayRcvd (event reason: neighbor detected this router)
Aug 11 07:58:41  cr2-eqord rpd[13269]: RPD_OSPF_NBRUP: OSPF neighbor fe80::f24b:3aff:feef:7e44 (realm ipv6-unicast xe-0/1/3.0 area 0.0.0.0) state changed from Loading to Full due to LoadDone (event reason: OSPF loading completed)
Aug 11 07:58:43  cr2-eqord bfdd[13177]: BFDD_TRAP_SHOP_STATE_UP: local discriminator: 128, new state: up, interface: xe-0/1/3.0, peer addr: fe80::f24b:3aff:feef:7e44
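
For reference, the outage duration on the cr3-ulsfo side can be worked out from the two timestamps in the logs above: link down at 06:49:13 and OSPF neighbor up at 07:58:25. This is only a sketch. The helper `parse_syslog_ts` is hypothetical, and the year is supplied by hand because the syslog stamps omit it.

```python
from datetime import datetime


def parse_syslog_ts(stamp: str, year: int = 2022) -> datetime:
    """Parse a syslog-style stamp such as 'Aug 11 06:49:13'; the year is assumed."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


# Timestamps copied from the cr3-ulsfo logs above.
link_down = parse_syslog_ts("Aug 11 06:49:13")
link_up = parse_syslog_ts("Aug 11 07:58:25")

outage = link_up - link_down
print(f"outage duration: {outage} ({int(outage.total_seconds())} s)")
```

This gives 1:09:12, or 4152 seconds, between the interface going down and OSPF coming back up.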

And of course it went back up as I'm sending the email.

Also got a quick reply from Telia:

Please be informed that your circuit is affected by a Major Disturbance being tracked under REF 01439389 - The root cause is under investigation, more to follow once available.

ayounsi claimed this task.