
Inbound errors on interface lsw1-d4-eqiad:ethernet-1/19 (an-worker1230 {#5330})
Closed, Resolved (Public)

Description

Common information

  • alertname: InboundInterfaceErrors
  • instance: lsw1-d4-eqiad:9804
  • interface_description: an-worker1230 {#5330}
  • interface_name: ethernet-1/19
  • job: gnmi
  • prometheus: ops
  • scope: network
  • severity: task
  • site: eqiad
  • source: prometheus
  • team: dcops

Firing alerts


Event Timeline

VRiley-WMF added a subscriber: BTullis.

@BTullis Hey Ben, we can replace this cable to clear up this error. Could you please let us know when the best time to swap it would be?

Hi @VRiley-WMF - Yes, please feel free to swap this cable any time. If it's a short outage, we can just let Hadoop deal with the network errors.

If you think the outage will last longer than a few minutes, we can make a patch to exclude this host from the cluster.

This cable has been swapped out.

It'll take some time to see whether the swap fixed it. We'll keep watching to see whether the error comes back.