We've recently been running into an issue where a few console servers have their CPU at 100% and the related librenms alerts (and thus AM notifications) flap at regular intervals, e.g.:
05:07 -jinxer-wm:#wikimedia-operations- (Processor usage over 85%) firing: Processor usage over 85% - https://alerts.wikimedia.org
05:12 -jinxer-wm:#wikimedia-operations- (Processor usage over 85%) resolved: Processor usage over 85% - https://alerts.wikimedia.org
07:37 -jinxer-wm:#wikimedia-operations- (Processor usage over 85%) firing: Processor usage over 85% - https://alerts.wikimedia.org
07:42 -jinxer-wm:#wikimedia-operations- (Processor usage over 85%) resolved: Processor usage over 85% - https://alerts.wikimedia.org
07:47 -jinxer-wm:#wikimedia-operations- (Processor usage over 85%) firing: Processor usage over 85% - https://alerts.wikimedia.org
07:57 -jinxer-wm:#wikimedia-operations- (Processor usage over 85%) firing: Processor usage over 85% - https://alerts.wikimedia.org
08:02 -jinxer-wm:#wikimedia-operations- (Processor usage over 85%) resolved: Processor usage over 85% - https://alerts.wikimedia.org
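This is due to the interaction between librenms' interval for re-sending the alerts (e.g. 3h for the alert above) and the fact that AM expects clients to keep sending notifications while the alerts are active. When the re-sends don't arrive in time, AM considers the alert resolved, and it fires again on the next re-send.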
AFAICT AM expects at most ~5 minutes between notifications from clients, while the librenms poller interval we're using ATM is also 5 minutes.
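
To illustrate the interaction, here is a minimal Python sketch of a simplified model: an alert stays active only as long as the client re-sends it within a timeout window. This is not AM's actual implementation; the `simulate` function, the fixed 5 minute window and the example timestamps are assumptions made for illustration only.

```python
from datetime import datetime, timedelta


def simulate(send_times, resolve_timeout):
    """Return (time, state) transitions for an alert kept alive only by re-sends.

    Simplified model (an assumption, not AM's real logic): each send extends
    the alert's lifetime to send_time + resolve_timeout. If the next send
    arrives after that deadline, the alert resolved in between and fires again.
    """
    events = []
    expires_at = None
    for t in sorted(send_times):
        if expires_at is None or t > expires_at:
            if expires_at is not None:
                # The previous lifetime ran out before this send arrived.
                events.append((expires_at, "resolved"))
            events.append((t, "firing"))
        expires_at = t + resolve_timeout
    if expires_at is not None:
        # No further sends: the alert resolves once the last deadline passes.
        events.append((expires_at, "resolved"))
    return events


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 5, 7)  # hypothetical start time
    timeout = timedelta(minutes=5)      # assumed ~5 minute window

    # Re-sending every 3h: every send looks like a brand new alert.
    sparse = [start + timedelta(hours=3 * i) for i in range(3)]
    for when, state in simulate(sparse, timeout):
        print(when.strftime("%H:%M"), state)
```

In this model, re-sending strictly more often than the timeout yields a single firing/resolved pair, while a 3h re-send interval yields one pair per re-send, which is the flapping pattern seen above.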