
Host deployment-puppetdb01 is DOWN: CRITICAL - Host Unreachable (10.68.23.76)
Closed, Resolved · Public

Description

Spotted just a few minutes ago on RelEng:

<shinken-wm> PROBLEM - Host deployment-puppetdb01 is DOWN: CRITICAL - Host Unreachable (10.68.23.76)

Can somebody please do some maintenance here? These kinds of errors are so frequent that they defeat the purpose of the Beta Cluster being a production-like environment. Thanks!

Event Timeline

Restricted Application added a subscriber: Aklapper. · Feb 19 2018, 3:36 PM
MarcoAurelio triaged this task as High priority. · Feb 19 2018, 3:36 PM

PROBLEM - Puppet errors on deployment-secureredirexperiment is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]

PROBLEM - Puppet errors on deployment-mx02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]

I think this was just deleted but was never removed from shinken.

> I think this was just deleted but was never removed from shinken.

Does it just need to be removed from shinken, or from somewhere else as well? There are a lot of other hosts "down" there too. Thanks.

According to OpenStack Browser, @Krenair created that instance. Can you tell whether this instance is still needed and, if so, why it is shut down? Thanks.

Yes, this instance's presence will not be optional in future; it will be needed for things like T194962 (and also it should've been being used for things like SSH host key gathering, what happened to that?)

> (and also it should've been being used for things like SSH host key gathering, what happened to that?)

Are you sure you really want to know? https://tools.wmflabs.org/sal/log/AWK-dPxoCdtJF0893Y-h

Krenair added a subscriber: Joe. · May 25 2018, 6:17 PM

According to the audit log, @Joe shut it down on 10 Jan 2018 at 2:51 p.m.

Krenair closed this task as Resolved. · May 26 2018, 1:30 PM
Krenair claimed this task.

It's back and Puppet is behaving.