Add VCP stats monitoring
Closed, Resolved · Public · 0 Estimated Story Points


To get better visibility on issues like T228823.

On the CLI this info can be seen with:
asw2-a-eqiad> show virtual-chassis vc-port statistics extensive

On SNMP this is exposed via a dedicated MIB:

The ideal would be to implement that in LibreNMS to have graphing and alerting, on both link usage and errors.

A quick workaround would be to extend the Icinga check to also look at the error counters and alert if any of them is increasing.
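
A minimal sketch of what that workaround could look like, assuming the check keeps the error counters from the previous poll and compares them with the current ones. The function names, the dictionary layout and the counter names are hypothetical and are not taken from the existing Icinga check; only the exit codes (0 = OK, 2 = CRITICAL) follow the usual Nagios/Icinga plugin convention.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: {port_name: {counter_name: value}}
Counters = Dict[str, Dict[str, int]]


def increasing_error_counters(previous: Counters, current: Counters) -> List[Tuple[str, str, int]]:
    """Return (port, counter, delta) for every error counter that grew since the last poll."""
    alerts = []
    for port, counters in sorted(current.items()):
        old = previous.get(port, {})
        for name, value in sorted(counters.items()):
            if name not in old:
                # No baseline yet for this counter, so there is nothing to compare against.
                continue
            delta = value - old[name]
            # A negative delta means the counter was reset (e.g. after a reboot); ignore it.
            if delta > 0:
                alerts.append((port, name, delta))
    return alerts


def icinga_status(alerts: List[Tuple[str, str, int]]) -> Tuple[int, str]:
    """Map the alert list to a plugin exit code and message (0 = OK, 2 = CRITICAL)."""
    if not alerts:
        return 0, "OK: no VCP error counter is increasing"
    details = ", ".join(f"{port} {name} +{delta}" for port, name, delta in alerts)
    return 2, f"CRITICAL: VCP errors increasing: {details}"
```

How the counters are collected (CLI scraping or SNMP) is left open here; the sketch only covers the "alert if any is increasing" comparison.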

Event Timeline

ayounsi triaged this task as Medium priority.Jul 24 2019, 2:12 AM
ayounsi created this task.
Restricted Application added a subscriber: Aklapper.

Good news, this is already implemented with:

Bad news: for reasons still unknown, the switches don't expose the proper interface data.
For example, from netmon1002:
`/usr/bin/snmpbulkwalk -v2c -c <community> -OQUs -m JUNIPER-VIRTUALCHASSIS-MIB -M /srv/deployment/librenms/librenms/mibs:/srv/deployment/librenms/librenms/mibs/junos udp:asw2-b-eqiad.mgmt.eqiad.wmnet:161 jnxVirtualChassisPortTable`
Returns only OIDs like:
jnxVirtualChassisPortInPkts.2."vcp-255/0/48.32768" = 0
The 32768 is an internal sub-interface, with no value. There is no jnxVirtualChassisPortInPkts.2."vcp-255/0/48"
Running show snmp mib walk jnxVirtualChassisPortTable doesn't show clear interface names, but all counters are at 0 or 1.
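
To check quickly whether a switch only returns the internal sub-interface rows, the walk output can be split into two groups. This is a rough sketch, assuming lines shaped like the example above; the regex and function names are hypothetical, the first index is presumably the VC member (FPC) ID, and the .32768 suffix is simply the one seen in the output above.

```python
import re
from typing import Dict, Iterable, Optional, Tuple

# Matches lines such as: jnxVirtualChassisPortInPkts.2."vcp-255/0/48.32768" = 0
LINE_RE = re.compile(r'^(?P<oid>\w+)\.(?P<index>\d+)\."(?P<port>[^"]+)" = (?P<value>\d+)$')

Row = Tuple[str, int, str]  # (object name, first index, port name)


def parse_line(line: str) -> Optional[Tuple[str, int, str, int]]:
    """Parse one line of walk output; return None if it does not match the expected shape."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return m["oid"], int(m["index"]), m["port"], int(m["value"])


def split_ports(lines: Iterable[str]) -> Tuple[Dict[Row, int], Dict[Row, int]]:
    """Separate regular VCP rows from internal sub-interface rows (port name ending in .32768)."""
    physical: Dict[Row, int] = {}
    internal: Dict[Row, int] = {}
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip anything that is not a counter line
        oid, index, port, value = parsed
        target = internal if port.endswith(".32768") else physical
        target[(oid, index, port)] = value
    return physical, internal
```

If `physical` comes back empty while `internal` does not, the switch is in the state described above.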

I tried asw2-a-eqiad and asw2-ulsfo.
A trace is available with asw2-ulsfo# run file show /var/log/snmptrace.log if needed, but I couldn't find anything in it.
I'll follow up with JTAC...

Juniper also has a hack for older Junos releases that predate jnxVirtualChassisPortTable: probably not something we want to do.

Service Request ID 2019-0801-0611 has been created.

Mentioned in SAL (#wikimedia-operations) [2019-08-07T23:03:26Z] <XioNoX> set virtual-chassis vcp-snmp-statistics on asw-a-codfw - T228824

Mentioned in SAL (#wikimedia-operations) [2019-08-07T23:08:29Z] <XioNoX> set virtual-chassis vcp-snmp-statistics on asw2-ulsfo - T228824

This is working!
Why is that behind a configuration option and not enabled by default? I have no idea.
I'll let those two sit overnight and roll it out to the whole fleet if all is good.

Mentioned in SAL (#wikimedia-operations) [2019-08-08T15:49:09Z] <XioNoX> set virtual-chassis vcp-snmp-statistics to all VC - T228824

ayounsi claimed this task.

We now have visibility on all VCPs.
They also benefit from the same alerting as regular ports for saturation and errors.