
investigate ethernet errors: asw2-a5-eqiad port xe-0/0/36
Closed, Resolved · Public

Description

asw2-a5-eqiad xe-0/0/36 has some input errors. It's part of ae0, which is the 4-port aggregate back to asw-a-eqiad, and is the only one of the 4 showing the errors. None of the 4 links (or the ae) is showing errors on the other side at asw-a-eqiad. The errors have been there since at least Sep 2013 in librenms's history.

This is possibly related to the pattern of inter-row elevated nutcracker timeout rates shown in: https://phabricator.wikimedia.org/T102199#1499358

Event Timeline

BBlack created this task. Jul 31 2015, 10:35 PM
BBlack raised the priority of this task from to High.
BBlack updated the task description.
BBlack added projects: acl*sre-team, netops.
BBlack added subscribers: BBlack, faidon, mark and 2 others.
Restricted Application added subscribers: Matanya, Aklapper. Jul 31 2015, 10:35 PM
Mbch331 added a subscriber: Mbch331. Aug 1 2015, 5:59 AM
BBlack added a comment. Aug 3 2015, 3:17 PM

I've disabled the link (1 of the 4 in the aggregate) on both sides:

{master:0}[edit]
bblack@asw2-a5-eqiad# show|compare
[edit interfaces xe-0/0/36]
+   disable;

{master:0}[edit]
bblack@asw2-a5-eqiad# commit

{master:8}[edit]
bblack@asw-a-eqiad# show|compare
[edit interfaces xe-6/1/0]
+   disable;

{master:8}[edit]
bblack@asw-a-eqiad# commit

librenms graphs show that the ae0 errors have dropped back to zero.
Leaving this ticket open to resolve the actual link issue and turn the link back on.
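
For reference, a minimal sketch of what turning the link back on would presumably look like, assuming it is simply the inverse of the two changes above. These exact commands were not recorded in this ticket; they would be entered in configuration mode on each switch.

```
# on asw2-a5-eqiad (assumed inverse of the change above)
delete interfaces xe-0/0/36 disable
commit

# on asw-a-eqiad (assumed inverse of the change above)
delete interfaces xe-6/1/0 disable
commit
```

Running `show | compare` before each commit should then show `-   disable;` under the corresponding interface.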

faidon set Security to None.

I replaced the fiber. Let's turn it up and see if it's any better. The next step would be to replace the optics.

I turned it up, but it seems there is no link on it now:

xe-0/0/36       up    down Core: << asw-a-eqiad:xe-6/1/0 {#2169}

Swapped both SFPs and the fiber.

Cmjohnson closed this task as Resolved. Sep 2 2015, 7:17 PM
Cmjohnson claimed this task.

New fiber # is 3908.... @faidon verified in IRC that all looks good.