Inbound interface errors
Closed, Resolved (Public)

Description

Common information

  • description: Rule: Inbound interface errors Faults: #1: et-0/0/50 - Core: lsw1-e3-eqiad:et-0/0/55 {#G2108180567000814}

https://wikitech.wikimedia.org/wiki/Network_monitoring#LibreNMS_alerts

  • summary: Alert for device lsw1-f1-eqiad.mgmt.eqiad.wmnet - Inbound interface errors
  • timestamp: 2022-10-12 13:42:04
  • alertname: Inbound interface errors
  • instance: lsw1-f1-eqiad.mgmt.eqiad.wmnet
  • scope: global
  • severity: task
  • source: librenms
  • team: dcops

Firing alerts


  • description: Rule: Inbound interface errors Faults: #1: et-0/0/50 - Core: lsw1-e3-eqiad:et-0/0/55 {#G2108180567000814}

https://wikitech.wikimedia.org/wiki/Network_monitoring#LibreNMS_alerts

  • summary: Alert for device lsw1-f1-eqiad.mgmt.eqiad.wmnet - Inbound interface errors
  • timestamp: 2022-10-12 13:42:04
  • alertname: Inbound interface errors
  • instance: lsw1-f1-eqiad.mgmt.eqiad.wmnet
  • scope: global
  • severity: task
  • source: librenms
  • team: dcops

Event Timeline

ayounsi triaged this task as High priority. Oct 13 2022, 5:21 AM

See https://librenms.wikimedia.org/graphs/to=1665638100/id=22639/type=port_errors/from=1665551700/

@Cmjohnson @Jclark-ctr please sync up with @cmooney or myself to proceed with the optic replacement.

@cmooney we can use this opportunity of having to disable the interface to change the MTU as well (cf. T315838)

@ayounsi thanks. And yes a good opportunity to change the MTU.

For now I've raised the OSPF metric on both sides to drain traffic off the link until we can swap out the optic.

cmooney@lsw1-e3-eqiad> show configuration protocols ospf | display set | match metri 
set protocols ospf area 0.0.0.0 interface et-0/0/55.0 metric 1000
cmooney@lsw1-f1-eqiad> show configuration protocols ospf | display set | match metri 
set protocols ospf area 0.0.0.0 interface et-0/0/50.0 metric 1000
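
Why raising the metric drains the link can be shown with a small shortest-path sketch. This is only an illustration: the "spine" node, the detour, and every metric other than 1000 are made-up assumptions, not the real eqiad topology. OSPF prefers the lowest total path cost, so once the direct link costs 1000 any cheaper detour wins.

```python
import heapq


def shortest_path(graph, src, dst):
    """Dijkstra over {node: {neighbor: metric}}; returns (total_cost, path) or None."""
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for neighbor, metric in graph[node].items():
            if neighbor not in seen:
                heapq.heappush(queue, (cost + metric, neighbor, path + [neighbor]))
    return None


def build_graph(direct_metric):
    # Hypothetical topology: the direct e3 <-> f1 link plus a detour through
    # an invented "spine" node. The metric of 100 is an assumption.
    links = [
        ("lsw1-e3", "lsw1-f1", direct_metric),
        ("lsw1-e3", "spine", 100),
        ("spine", "lsw1-f1", 100),
    ]
    graph = {}
    for a, b, metric in links:
        # The metric is applied in both directions, matching the change on either side.
        graph.setdefault(a, {})[b] = metric
        graph.setdefault(b, {})[a] = metric
    return graph


before = shortest_path(build_graph(100), "lsw1-e3", "lsw1-f1")
after = shortest_path(build_graph(1000), "lsw1-e3", "lsw1-f1")
print(before)  # (100, ['lsw1-e3', 'lsw1-f1'])
print(after)   # (200, ['lsw1-e3', 'spine', 'lsw1-f1'])
```

The link stays up and usable as a last resort, which is why a metric change is a gentler way to drain than disabling the interface outright.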

@cmooney I will be available in 1 hour if you are still online

@Jclark-ctr yes I'm around, I'll drop you a line on IRC too, thanks.

The interface has been brought back up and traffic put back on it. It has been error-free for ~30 mins, so closing out the task.
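
The "error-free for ~30 mins" check can be sketched as a comparison of cumulative input-error counter samples over a window. This is a minimal sketch under assumptions: the function name, the 5-minute polling interval, and the counter values below are all hypothetical, and it is not how LibreNMS itself evaluates the alert rule.

```python
def is_error_free(samples, window_seconds=30 * 60):
    """Return True if the cumulative error counter did not change over the
    last `window_seconds`.

    samples: list of (unix_timestamp, cumulative_input_errors), sorted by time.
    """
    if not samples:
        return False
    last_ts = samples[-1][0]
    start = last_ts - window_seconds
    if samples[0][0] > start:
        # Not enough history to cover the whole window.
        return False
    # Baseline is the latest sample taken at or before the window start.
    baseline = [count for ts, count in samples if ts <= start][-1]
    return all(count == baseline for ts, count in samples if ts >= start)


# Hypothetical samples, one every 5 minutes for 30 minutes.
clean = [(t, 42) for t in range(0, 1801, 300)]
noisy = [(t, 42 if t < 900 else 43) for t in range(0, 1801, 300)]
print(is_error_free(clean))  # True
print(is_error_free(noisy))  # False
```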

Thanks @Jclark-ctr for getting it sorted out quickly :)