librenms page didn't auto-resolve in VO
Closed, Invalid (Public)

Description

A librenms alert caused a page to go out via VO, and the incident got ack'd. However, the ack expired 24h later and the incident paged again. The alert in icinga looks like it recovered, so the related incident should have been resolved automatically in VO.

You have 1 incident.

Incident: 491
State:    Critical
Service:  icinga1001/LibreNMS has a  alert #page
Message:  Notification Type: RECOVERY

Service: LibreNMS has a critical alert #page
Host: icinga1001
Address: 208.80.154.84
State: OK

Date/Time: Fri Sept 18 12:46:13 UTC 2020

Notes URLs: https://bit.ly/wmf-librenms

Acknowledged by :

Additional Info:

OK: zero critical LibreNMS alerts
Link:     https://portal.victorops.com/client/wikimedia#/incident/491/incidentTimeline

Event Timeline

I don't think we've seen a recurrence of this problem, and we fixed the host-related recoveries in T264016: Host page did not auto-resolve in VO.