Since creating the VarnishKafkaNoMessages alert, we have been dogged by false positives.
Improvements have been made in this commit, but there are still occasions when the current logic produces unwelcome alerts.
Such occasions may include:
- Rolling restarts of varnish servers
- Pooling and depooling of data centres
The Traffic team is aware of this behaviour and sometimes notifies the Data-Engineering team when this work happens.
However, we should tune the alerting rules to avoid alert fatigue and to reduce the risk of overlooking genuine incidents.