Some feedback after using Alert Manager as transport for LibreNMS:
When ACKing an alert in LibreNMS, Alert Manager sends an email titled:
[FIRING:1] Access port utilisation over 80% for 1h global (asw2-b-eqiad.mgmt.eqiad.wmnet warning librenms netops)
Only at the very bottom does it say:
title = Alert for device asw2-b-eqiad.mgmt.eqiad.wmnet - Access port utilisation over 80% for 1h got acknowledged
Same for IRC alerts:
jinxer-wm> (Access port utilisation over 80% for 1h) firing: Access port utilisation over 80% for 1h - https://alerts.wikimedia.org
This is confusing, as it's an ACK and not a newly firing alert.
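To illustrate what I'd expect, here is a rough Python sketch of picking the notification tag from the LibreNMS title. It assumes ACK titles always end with "got acknowledged", as in the example above; the function name status_tag is made up for illustration and is not part of LibreNMS or Alert Manager.

```python
def status_tag(librenms_title: str, alertmanager_status: str) -> str:
    """Return the tag to show in the email subject or IRC line."""
    # Assumption: LibreNMS ACK titles end with "got acknowledged",
    # as in the sample email above.
    if librenms_title.strip().lower().endswith("got acknowledged"):
        return "ACK"
    # Otherwise fall back to the status reported by Alert Manager.
    return alertmanager_status.upper()


ack_title = (
    "Alert for device asw2-b-eqiad.mgmt.eqiad.wmnet - "
    "Access port utilisation over 80% for 1h got acknowledged"
)
print(status_tag(ack_title, "firing"))
```

With something like this, the ACK above would be tagged ACK instead of FIRING.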
The LibreNMS IRC bot had colors, but the Alert Manager one does not; that's a major regression.
In the email body, all the LibreNMS details are crammed into the "Annotations" section with no line breaks, which makes it difficult to parse.
The content is also duplicated. For example:
alertname = Access port utilisation over 80% for 1h
Rule: Access port utilisation over 80% for 1h
summary = Access port utilisation over 80% for 1h
instance = asw2-b-eqiad.mgmt.eqiad.wmnet
Device Name: asw2-b-eqiad.mgmt.eqiad.wmnet
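As a sketch of the de-duplication I have in mind (Python, purely illustrative): drop any LibreNMS detail whose value is already shown as a label. The function name drop_duplicate_details and the "Example Field" entry are hypothetical; the other keys and values are taken from the email above.

```python
def drop_duplicate_details(labels: dict, details: dict) -> dict:
    """Keep only the details whose value is not already displayed."""
    seen = set(labels.values())
    kept = {}
    for key, value in details.items():
        if value in seen:
            continue  # already shown as a label or an earlier detail
        seen.add(value)
        kept[key] = value
    return kept


labels = {
    "alertname": "Access port utilisation over 80% for 1h",
    "instance": "asw2-b-eqiad.mgmt.eqiad.wmnet",
}
details = {
    "Rule": "Access port utilisation over 80% for 1h",
    "summary": "Access port utilisation over 80% for 1h",
    "Device Name": "asw2-b-eqiad.mgmt.eqiad.wmnet",
    # Hypothetical non-duplicate field, only here to show it is kept.
    "Example Field": "example value",
}
print(drop_duplicate_details(labels, details))
```

Only the field that adds new information survives; the three repeated lines are dropped.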
The email title says [FIRING:1]; I'm not sure if it's needed or what the :1 means.
[FIRING:1] Inbound interface errors global (asw2-b-eqiad.mgmt.eqiad.wmnet warning librenms netops)
There is no need for the severity, scope, or team in the email title; that's precious real estate.
Something like:
[Alert] asw2-b-eqiad.mgmt.eqiad.wmnet: Inbound interface errors
would be clearer.
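In Python terms, the title I'm suggesting would be built roughly like this. It is a sketch only: it assumes the alert carries the instance and alertname labels seen in the email body, and email_subject is a made-up helper, not an Alert Manager function.

```python
def email_subject(labels: dict) -> str:
    """Build a short subject from the device and alert name only."""
    # Severity, scope and team are deliberately left out of the subject.
    return f"[Alert] {labels['instance']}: {labels['alertname']}"


labels = {
    "alertname": "Inbound interface errors",
    "instance": "asw2-b-eqiad.mgmt.eqiad.wmnet",
    "severity": "warning",
    "team": "netops",
}
print(email_subject(labels))
```

The device name comes first so that it stays visible even when a mail client truncates the subject.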
The "source" link is broken; for example, it links to "http://device/device=175".