About 14 hours ago, these two interfaces went down:
xe-5/3/0 down down Core: cr2-codfw:xe-5/3/0
xe-5/3/1 down down Transit: Telia
Looking at the almost rolled-over logs, I was able to find:
cr1-codfw> show log messages.8.gz | match xe-5/3/1
Jun 1 16:24:27 re0.cr1-codfw MGMT:rpd[5491]: EVENT <UpDown> xe-5/3/1 index 216 <Broadcast Multicast> address #0 64.87.88.f2.73.69
Jun 1 16:24:27 re0.cr1-codfw MGMT:rpd[5491]: STP handler: IFD=xe-5/3/1, op=change, state=Discarding, Topo change generation=0
Jun 1 16:24:27 re0.cr1-codfw fpc5 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-5/3/1 216
Jun 1 16:24:27 re0.cr1-codfw MGMT:rpd[5491]: STP handler: IFD=xe-5/3/1, op=change, state=Discarding, Topo change generation=0
Jun 1 16:24:27 re0.cr1-codfw rpd[5452]: EVENT <UpDown> xe-5/3/1.0 index 389 <Broadcast Multicast> address #0 64.87.88.f2.73.69
Jun 1 16:24:27 re0.cr1-codfw rpd[5452]: EVENT UpDown xe-5/3/1.0 index 389 80.239.192.102/30 -> 80.239.192.103 <Broadcast Multicast Localup>
Jun 1 16:24:27 re0.cr1-codfw rpd[5452]: EVENT UpDown xe-5/3/1.0 index 389 2001:2000:3080:af4::2/64 -> zero-len <Broadcast Multicast Localup>
Jun 1 16:24:27 re0.cr1-codfw rpd[5452]: EVENT UpDown xe-5/3/1.0 index 389 fe80::6687:88ff:fef2:7369/64 -> zero-len <Broadcast Multicast Localup>
Jun 1 16:24:27 re0.cr1-codfw rpd[5452]: EVENT <UpDown> xe-5/3/1 index 216 <Broadcast Multicast> address #0 64.87.88.f2.73.69
Jun 1 16:24:27 re0.cr1-codfw rpd[5452]: krt unsolic client: Received IPv6 address 2001:2000:3080:af4::2 on ifl xe-5/3/1.0. Flag:2.
Jun 1 16:24:27 re0.cr1-codfw rpd[5452]: krt unsolic client: Received IPv6 address fe80::6687:88ff:fef2:7369 on ifl xe-5/3/1.0. Flag:2.
Jun 1 16:24:27 re0.cr1-codfw rpd[5452]: STP handler: IFD=xe-5/3/1, op=change, state=Discarding, Topo change generation=0
Jun 1 16:24:27 re0.cr1-codfw mib2d[4968]: SNMP_TRAP_LINK_DOWN: ifIndex 537, ifAdminStatus up(1), ifOperStatus down(2), ifName xe-5/3/1
Jun 1 16:24:27 re0.cr1-codfw kernel: if_msg_ifd_cmd_tlv_decode ifd xe-5/3/1 #216 down with ASIC Error
Jun 1 16:24:30 re0.cr1-codfw fpc5 IFFPC: IFD(xe-5/3/1, 216) ASIC error notification
re0.cr1-codfw> show log messages.8.gz | match fpc5
Jun 1 16:23:59 re0.cr1-codfw fpc5 MQCHIP(3) WI upoh flow control exception
Jun 1 16:24:12 re0.cr1-codfw fpc5 jnh_update_ifd_standby_state IFD: 215, Enable: False
Jun 1 16:24:14 re0.cr1-codfw fpc5 Error (0x10409), module: TOE-MQ-3:0:0, type: MQ_TOE TX Blocked Major error
Jun 1 16:24:27 re0.cr1-codfw fpc5 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-5/3/0 215
Jun 1 16:24:27 re0.cr1-codfw fpc5 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-5/3/1 216
Jun 1 16:24:27 re0.cr1-codfw fpc5 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-5/3/2 217
Jun 1 16:24:27 re0.cr1-codfw fpc5 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-5/3/3 218
Jun 1 16:24:30 re0.cr1-codfw fpc5 Cmerror Op Set: TOE-MQ-3:0:0: TOE MQ.3.0.0 : SetErr - ** WEDGE DETECTED IN PFE 3 stream 0 TOE host packet transfer: TOE toAsic path blocked (code 0x9)
Jun 1 16:24:30 re0.cr1-codfw fpc5 IFFPC: IFD(xe-5/3/0, 215) ASIC error notification
Jun 1 16:24:30 re0.cr1-codfw fpc5 IFFPC: IFD(xe-5/3/1, 216) ASIC error notification
Jun 1 16:24:30 re0.cr1-codfw fpc5 IFFPC: IFD(xe-5/3/2, 217) ASIC error notification
Jun 1 16:24:30 re0.cr1-codfw fpc5 IFFPC: IFD(xe-5/3/3, 218) ASIC error notification
Jun 1 16:24:31 re0.cr1-codfw fpc5 MIC(5/3) link 0 SFP laser bias current low alarm set
Jun 1 16:24:31 re0.cr1-codfw fpc5 MIC(5/3) link 0 SFP output power low alarm set
Jun 1 16:24:31 re0.cr1-codfw fpc5 MIC(5/3) link 0 SFP laser bias current low warning set
Jun 1 16:24:31 re0.cr1-codfw fpc5 MIC(5/3) link 0 SFP output power low warning set
Jun 1 16:24:31 re0.cr1-codfw fpc5 MIC(5/3) link 1 SFP laser bias current low alarm set
Jun 1 16:24:31 re0.cr1-codfw fpc5 MIC(5/3) link 1 SFP output power low alarm set
Jun 1 16:24:31 re0.cr1-codfw fpc5 MIC(5/3) link 1 SFP laser bias current low warning set
Jun 1 16:24:31 re0.cr1-codfw fpc5 MIC(5/3) link 1 SFP output power low warning set
Jun 1 16:24:32 re0.cr1-codfw fpc5 Error (0x20004), module: Host Loopback, type: Host Loopback Path Id 3
Jun 1 16:24:35 re0.cr1-codfw fpc5 Cmerror Op Set: Host Loopback: HOST LOOPBACK WEDGE DETECTED IN PATH ID 3
Jun 1 16:24:36 re0.cr1-codfw fpc5 PFE[3] Liveness Thread Stopped, interval = 0
Jun 1 16:24:37 re0.cr1-codfw fpc5 PFE[3] CC[0] Fabric Probe Stopped, interval = 50 ms
And indeed:
cr1-codfw> show system alarms
2 alarms currently active
Alarm time               Class  Description
2020-06-01 17:47:54 UTC  Major  FPC 0 Hard errors
2020-06-01 16:24:14 UTC  Major  FPC 5 Major Errors - TOE Error code: 0x10409
This went unnoticed because the alerts had already been ACKed for the FPC0 issue.
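
One way to avoid this kind of masking would be to track acknowledgements per alarm description rather than per device. The following is only a rough Python sketch under stated assumptions: it parses text shaped like the `show system alarms` output above and lists alarms whose description has not been acknowledged. The names `parse_alarms` and `unacked`, the regular expression, and the set of acknowledged descriptions are all hypothetical and do not correspond to any existing monitoring check.

```python
import re

# Sample text shaped like the `show system alarms` output in this comment.
ALARMS_OUTPUT = """\
2 alarms currently active
Alarm time               Class  Description
2020-06-01 17:47:54 UTC  Major  FPC 0 Hard errors
2020-06-01 16:24:14 UTC  Major  FPC 5 Major Errors - TOE Error code: 0x10409
"""

# Assumed line layout: "<date> <time> <tz>  <class>  <description>".
ALARM_RE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \w+)\s+"
    r"(?P<cls>Major|Minor)\s+(?P<desc>.+)$"
)


def parse_alarms(text):
    """Return one dict per alarm line; header lines are skipped."""
    alarms = []
    for line in text.splitlines():
        match = ALARM_RE.match(line.strip())
        if match:
            alarms.append(match.groupdict())
    return alarms


def unacked(alarms, acked_descriptions):
    """Keep only alarms whose description was not acknowledged."""
    return [a for a in alarms if a["desc"] not in acked_descriptions]


alarms = parse_alarms(ALARMS_OUTPUT)
# Hypothetical state: only the FPC 0 alarm was acknowledged.
new = unacked(alarms, {"FPC 0 Hard errors"})
for alarm in new:
    print(alarm["time"], alarm["cls"], alarm["desc"])
```

With only the FPC 0 alarm acknowledged, the FPC 5 TOE error would still be reported as outstanding.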