
icinga really needs to check puppet run success of passive icinga hosts
Open, Normal, Public

Description

Today we attempted an Icinga failover (T214760).

We hit our first snag when we discovered that, due to a pretty silly bug in the ircecho module, puppet hadn't run successfully on icinga2001 in quite some time.

Anyway, we fixed the bug in ircecho: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/489897/
I also worked out that the last successful run on icinga2001 was sometime before Jan 24 15:13:57, the time of the first failed run.

I'm not sure what caused the change, but it doesn't really matter; we should have caught the failure when it first happened.
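
For illustration only, here is a rough sketch of the kind of staleness check this task is asking for. It is not the existing Wikimedia check and not tied to any particular Icinga or Puppet API: the function name, the thresholds, and the idea of passing in the last successful run time as an epoch timestamp are all assumptions made for the example. The only convention relied on is the standard Nagios/Icinga plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
from typing import Optional, Tuple

# Standard Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_puppet_freshness(
    last_success_epoch: Optional[float],
    now_epoch: float,
    warn_age: float = 6 * 3600,   # hypothetical threshold: 6 hours
    crit_age: float = 24 * 3600,  # hypothetical threshold: 24 hours
) -> Tuple[int, str]:
    """Return (exit_code, message) based on the age of the last successful run.

    How last_success_epoch is obtained (for example from the agent's own
    run report) is deliberately left out of this sketch.
    """
    if last_success_epoch is None:
        return UNKNOWN, "UNKNOWN: no successful Puppet run recorded"

    age = now_epoch - last_success_epoch
    if age < 0:
        # A timestamp in the future usually means clock skew or bad data.
        return UNKNOWN, "UNKNOWN: last successful run is in the future"

    hours = age / 3600
    if age >= crit_age:
        return CRITICAL, f"CRITICAL: last successful Puppet run {hours:.1f}h ago"
    if age >= warn_age:
        return WARNING, f"WARNING: last successful Puppet run {hours:.1f}h ago"
    return OK, f"OK: last successful Puppet run {hours:.1f}h ago"
```

The important part for this task is that such a check would need to be evaluated for the passive Icinga host as well as the active one, so that a host whose last good run was weeks ago shows up as CRITICAL instead of going unnoticed.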

Event Timeline

CDanis created this task. Feb 11 2019, 11:35 PM
Restricted Application added a subscriber: Aklapper. Feb 11 2019, 11:35 PM
colewhite triaged this task as Normal priority. Feb 13 2019, 2:38 AM