It looks like shinken has been failing to determine puppet status for several days. It might be that I'm misunderstanding the console though.
@Phamhi, shinken is watching checks like
It looks like most of those pages are missing. Is this a result of some of your recent labmon/graphite work? (I could also believe that those tests were just removed entirely in favor of some Prometheus thing that shinken doesn't know about.)
@Andrew this was almost certainly broken by me with https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/549241/ which replaced the diamond state tracking with Prometheus for T210993: Deprecate Diamond collectors in Cloud VPS.
I'm still a bit confused about what Prometheus does and doesn't do. Some Prometheus docs mention that Prometheus can alert, which has me wondering: do we need another alerting tool, or can we just use Prometheus directly?
I don't see it running, actually. It's pretty simple to set up if you have an email server for it to talk to (or PagerDuty and friends). But if we want the alerts coming through shinken or icinga, we'd need plugins and other services, AFAIK.
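For a rough idea of what such a plugin might involve, here is a minimal sketch of the evaluation half of a Nagios-style check, which shinken and icinga can both run. It takes the JSON body returned by Prometheus's `/api/v1/query` endpoint and turns it into an exit code and message. The metric name in `QUERY`, the one-hour threshold, and the `evaluate` helper are all assumptions for illustration, not what is deployed; fetching the response over HTTP is left out.

```python
import json

# Nagios-style plugin exit codes, which shinken and icinga both understand.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Hypothetical query: the real metric name depends on the exporter in use.
QUERY = "time() - puppet_agent_last_run_timestamp"


def evaluate(response_body, max_age_seconds=3600):
    """Map a Prometheus instant-vector response to (exit_code, message)."""
    try:
        payload = json.loads(response_body)
    except ValueError:
        return UNKNOWN, "UNKNOWN: response is not valid JSON"
    if not isinstance(payload, dict) or payload.get("status") != "success":
        return UNKNOWN, "UNKNOWN: query failed"

    samples = payload.get("data", {}).get("result", [])
    if not samples:
        # No data is treated as UNKNOWN rather than OK, so a missing
        # metric does not silently look healthy.
        return UNKNOWN, "UNKNOWN: no samples returned"

    # Each sample is {"metric": {labels}, "value": [timestamp, "string"]}.
    stale = sorted(
        s["metric"].get("instance", "?")
        for s in samples
        if float(s["value"][1]) > max_age_seconds
    )
    if stale:
        return CRITICAL, "CRITICAL: stale puppet runs on " + ", ".join(stale)
    return OK, "OK: %d instances have recent puppet runs" % len(samples)
```

A wrapper script would fetch the query result, print the message, and exit with the returned code. Returning UNKNOWN when no samples come back is a deliberate choice here, since a check that keeps reporting OK after its metric disappears is the kind of failure described above.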