During the latest switch maintenance (T329073), prometheus1005 was depooled from LVS but not "depooled" from Alertmanager: the host kept firing alerts (from its point of view, anyway). We should be more proactive and make sure a depooled host is also effectively prevented from sending alerts during maintenance.
Description
Related Objects
- Mentioned Here
- T329073: eqiad row A switches upgrade
Event Timeline
Change 900238 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):
[operations/puppet@production] DNM: test alertmanager depool for prometheus1006
Setting alertmanagers: [] for the host in question is enough to remove its Alertmanager (AM) configuration; see also this PCC run: https://puppet-compiler.wmflabs.org/output/900238/40160/prometheus1006.eqiad.wmnet/index.html
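To illustrate why an empty list is sufficient, here is a minimal sketch in Python of how a per-host list of Alertmanager targets could map onto the `alerting` section of a Prometheus configuration. The function name `alerting_section` and the example target are hypothetical and are not taken from the puppet patch; the actual template in operations/puppet may render this differently. The only assumption about Prometheus itself is its documented `alerting.alertmanagers[].static_configs[].targets` layout.

```python
def alerting_section(alertmanagers):
    """Build the `alerting` part of a Prometheus config (hypothetical helper).

    `alertmanagers` is a list of "host:port" strings, standing in for the
    per-host setting discussed above.
    """
    if not alertmanagers:
        # Empty list: no Alertmanager targets are configured, so this
        # Prometheus instance has nowhere to send the alerts it evaluates.
        return {"alerting": {"alertmanagers": []}}
    return {
        "alerting": {
            "alertmanagers": [
                {"static_configs": [{"targets": list(alertmanagers)}]}
            ]
        }
    }


# A pooled host keeps its targets (example hostname is made up).
pooled = alerting_section(["alertmanager.example.org:9093"])

# A host under maintenance gets an empty list and therefore no targets.
depooled = alerting_section([])
```

With this shape, "depooling" a host from Alertmanager is a one-line data change for that host rather than a change to the alerting rules themselves.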