For the past week or so, the Varnish traffic drop alert has been noisy, specifically for codfw: https://logstash.wikimedia.org/goto/1ec92a9c4ab13292b83f21403e7052d1
This seems to correlate with some odd minute-to-minute spikiness in codfw's traffic flow (https://w.wiki/ChG), which should perhaps be investigated as well.
One of the things I think we should do is to add an absolute minimum traffic level required to alert, since a simple ratio will always be subject to this kind of noise. Here's a plot of one way we could express that in PromQL: https://w.wiki/ChE
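To make the idea concrete, here is a minimal sketch of the gating logic in Python rather than PromQL. It is not the expression from the linked plot. It assumes one possible reading of "absolute minimum": the ratio check is only evaluated when the baseline traffic is above a floor. The function name, parameter names, and threshold values are all hypothetical.

```python
def should_alert(current_rps: float,
                 baseline_rps: float,
                 max_ratio: float = 0.7,
                 min_baseline_rps: float = 1000.0) -> bool:
    """Return True if a traffic drop alert should fire.

    All names and default thresholds here are illustrative assumptions,
    not values taken from the existing alert.
    """
    # Below the floor (or with no baseline at all), the ratio is dominated
    # by noise, so the alert is suppressed regardless of the ratio.
    if baseline_rps <= 0 or baseline_rps < min_baseline_rps:
        return False
    # Above the floor, fall back to the plain ratio comparison.
    return current_rps / baseline_rps < max_ratio
```

With these assumed thresholds, a drop from 2000 to 500 requests per second would alert, while the same proportional drop from 200 to 50 would not, because the baseline is under the floor.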
(We might also want to base the traffic drop alerts on ATS metrics rather than the Varnish frontend ones?)