Alert on coreDNS misbehaving
Closed, Declined (Public)

Description

We recently added CoreDNS to our Kubernetes cluster, and we already have a nice dashboard in Grafana[1]. We should alert on the various things that may go awry at the CoreDNS level, as they can easily cause issues for services running in the clusters.

[1] https://grafana.wikimedia.org/d/-sq5te5Wk/kubernetes-dns?orgId=1

Event Timeline

MLechvien-WMF lowered the priority of this task from High to Low.
MLechvien-WMF added subscribers: JMeybohm, MLechvien-WMF.

Lowering the priority, as this has not been actioned for a very long time.

@JMeybohm do you know if the description is still accurate and if there is anything actionable for us?

It's been a while since we added CoreDNS, I can't recall a major outage, and we have not revisited this since, so I'm boldly closing it.