
Juniper monitoring
Closed, Resolved (Public)

Description

Over at T83978 we discovered that asw-ulsfo has had an outstanding chassis alarm for about 4 months. This is just one example of a recurring problem caused by the lack of monitoring for our Juniper routers and switches.

We should create Icinga checks (or something equivalent) for:

  • "show chassis alarms"
  • (critical) BGP peerings
  • critical interfaces being down (e.g. all router interfaces)
  • VRRP
  • virtual-chassis NotPrsnt (or similar)
  • BFD sessions
  • OSPF/OSPFv3 sessions

Event Timeline

rtimport raised the priority of this task from to Medium. Dec 18 2014, 1:56 AM
rtimport added a project: ops-core.
rtimport set Reference to rt7654.
faidon created this task. Jun 10 2014, 5:04 AM
Dzahn added a comment. Jun 10 2014, 3:18 PM

Reference to ticket #7642 added by dzahn

Dzahn added a comment. Jun 10 2014, 4:46 PM

http://exchange.nagios.org/directory/Plugins/Hardware/Network-Gear/Cisco/Check-various-hardware-environmental-sensors/details

Status changed from 'new' to 'open' by RT_System

faidon updated the task description. Feb 11 2015, 10:16 AM
faidon added a project: observability.
faidon changed the visibility from "WMF-NDA (Project)" to "Public (No Login Required)".
faidon changed the edit policy from "WMF-NDA (Project)" to "All Users".
faidon set Security to None.
hashar added a subscriber: hashar. Feb 11 2015, 10:24 AM

Random possibility: Logstash has a plugin that acts as an SNMP trap receiver: http://logstash.net/docs/1.4.2/inputs/snmptrap

For BGP peerings, there must be a Nagios plugin that handles them. IIRC there is a standard MIB that lists remote peers and their status.
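
As a rough sketch of the standard-MIB idea: BGP4-MIB (RFC 4273) exposes a `bgpPeerState` value per peer, where 6 means established. The Python below shows only the evaluation step and assumes the per-peer states have already been fetched over SNMP. The function name `check_bgp_peers` and the sample addresses are hypothetical, and this is not an existing plugin.

```python
# bgpPeerState values as defined in BGP4-MIB (RFC 4273).
BGP_PEER_STATES = {
    1: "idle",
    2: "connect",
    3: "active",
    4: "opensent",
    5: "openconfirm",
    6: "established",
}
ESTABLISHED = 6

# Standard Nagios/Icinga plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_bgp_peers(peer_states):
    """Evaluate BGP peers (hypothetical helper, not an existing plugin).

    peer_states maps a peer address to its bgpPeerState integer.
    Returns a (return_code, message) tuple.
    """
    if not peer_states:
        return UNKNOWN, "UNKNOWN: no BGP peers returned"

    # Collect every peer that is not in the established state.
    down = {
        peer: BGP_PEER_STATES.get(state, "unknown(%d)" % state)
        for peer, state in sorted(peer_states.items())
        if state != ESTABLISHED
    }
    if down:
        details = ", ".join("%s is %s" % (p, s) for p, s in down.items())
        return CRITICAL, "CRITICAL: %d of %d peers not established: %s" % (
            len(down), len(peer_states), details)

    return OK, "OK: all %d peers established" % len(peer_states)
```

For example, `check_bgp_peers({"192.0.2.1": 6, "192.0.2.2": 3})` returns a CRITICAL status that names 192.0.2.2 as active. Treating every non-established peer as critical is a simplification; a real check would probably need to skip peers that are administratively disabled, and to distinguish critical peerings from the rest.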

For chassis alarms, Juniper has a private MIB: mib-jnx-chassis.txt

You can poll the Juniper Alarm MIB. The basic shell script at http://exchange.nagios.org/directory/Plugins/Hardware/Network-Gear/Juniper/check_juniperalarm-2Esh/details looks at the yellow/red alarm counts. It could be a good start.
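
The alarm-count approach could look roughly like the following Python sketch. It assumes the yellow and red alarm counts have already been read from the Juniper Alarm MIB over SNMP, and only shows how they might map to the standard Nagios/Icinga return codes. The function name `evaluate_alarms` and the yellow-to-WARNING, red-to-CRITICAL mapping are assumptions for illustration; this is neither the linked shell script nor the check_jnx_alarms plugin added later in this task.

```python
# Standard Nagios/Icinga plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_alarms(yellow_count, red_count):
    """Map alarm counts to a (return_code, message) tuple.

    Hypothetical helper: the counts are assumed to come from an SNMP
    query of the Juniper Alarm MIB performed elsewhere.
    """
    if yellow_count < 0 or red_count < 0:
        return UNKNOWN, "UNKNOWN: invalid alarm counts"

    summary = "%d red, %d yellow alarm(s)" % (red_count, yellow_count)
    # Red alarms take precedence over yellow ones.
    if red_count > 0:
        return CRITICAL, "CRITICAL: " + summary
    if yellow_count > 0:
        return WARNING, "WARNING: " + summary
    return OK, "OK: no chassis alarms"
```

A plugin wrapper would print the message and exit with the return code, which is how Icinga picks up the state.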

Restricted Application added a subscriber: Matanya. Sep 10 2015, 8:06 PM
faidon updated the task description. Nov 27 2015, 1:13 PM
faidon updated the task description.
faidon updated the task description. Nov 27 2015, 9:29 PM

Change 281467 had a related patch set uploaded (by Faidon Liambotis):
Add check_jnx_alarms to check Juniper chassis alarms

https://gerrit.wikimedia.org/r/281467

hashar removed a subscriber: hashar. Apr 4 2016, 6:42 PM

Change 281467 merged by Faidon Liambotis:
Add check_jnx_alarms to check Juniper chassis alarms

https://gerrit.wikimedia.org/r/281467

faidon updated the task description. Apr 4 2016, 7:28 PM

Change 281495 had a related patch set uploaded (by Faidon Liambotis):
netops: monitor all asw/msw/psw as well

https://gerrit.wikimedia.org/r/281495

Change 281495 merged by Faidon Liambotis:
netops: monitor all asw/msw/psw as well

https://gerrit.wikimedia.org/r/281495

faidon updated the task description. Apr 4 2016, 8:25 PM
faidon updated the task description. Jan 9 2017, 1:19 AM

A new check has been added to LibreNMS to monitor "show system alarms" (yellow and red), as well as all the moving parts (PSUs, fans, etc.).

ayounsi moved this task from Backlog to Monitoring on the netops board. Jun 27 2017, 2:47 PM

Change 369710 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] Icinga: add check_bfd check (part 1)

https://gerrit.wikimedia.org/r/369710

Change 369710 merged by Ayounsi:
[operations/puppet@production] Icinga: add check_bfd check (part 1)

https://gerrit.wikimedia.org/r/369710

Change 370103 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] Icinga: add check_bfd check (part 1)

https://gerrit.wikimedia.org/r/370103

Change 370103 merged by Ayounsi:
[operations/puppet@production] Icinga: add check_bfd check (part 1)

https://gerrit.wikimedia.org/r/370103

Change 461498 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] Icinga, assign bfd check to routers

https://gerrit.wikimedia.org/r/461498

Change 461503 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] SNMP: set snmp-mibs-downloader BASEDIR to Debian 9 standard

https://gerrit.wikimedia.org/r/461503

Change 461503 merged by Ayounsi:
[operations/puppet@production] SNMP: set snmp-mibs-downloader BASEDIR to Debian 9 standard

https://gerrit.wikimedia.org/r/461503

Change 461498 merged by Ayounsi:
[operations/puppet@production] Icinga, assign bfd check to routers

https://gerrit.wikimedia.org/r/461498

ayounsi updated the task description. Mar 14 2019, 7:44 PM

Change 496873 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] Icinga: Add OSPF check to routers

https://gerrit.wikimedia.org/r/496873

ayounsi claimed this task. Mar 15 2019, 7:45 PM

Change 496873 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] Icinga: Add OSPF check to routers

https://gerrit.wikimedia.org/r/496873

Change 496873 merged by Ayounsi:
[operations/puppet@production] Icinga: Add OSPF check to routers

https://gerrit.wikimedia.org/r/496873