contint2001.mgmt disappeared from Icinga
Closed, Resolved (Public)

Description

We had an Icinga probe for contint2001.mgmt until fairly recently (T283582), but it no longer exists.

The fault seems to be in modules/monitoring/manifests/host.pp. The catalogue does have a monitoring::exported_nagios_host resource for the host itself, but for some reason it does not create one for the management interface. Maybe due to a lack of IPMI facts?

monitoring::host was refactored in December.

Event Timeline

This is happening because $facts['ipmi_lan'] is not populated. A closer look suggests the iDRAC is unresponsive, both via bmc_config and via SSH. I think you will need to have dcops take a look at the iDRAC port.
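To illustrate the failure mode described above, here is a minimal Python sketch of the kind of guard that would skip the management host when the IPMI fact is absent. It is not the actual logic in modules/monitoring/manifests/host.pp: the function name `mgmt_host_is_monitorable` and the `ipaddress` key inside `ipmi_lan` are assumptions made for illustration only.

```python
def mgmt_host_is_monitorable(facts):
    """Return True when the facts carry a usable IPMI LAN address.

    Hypothetical helper: the 'ipaddress' key is an assumed layout of the
    'ipmi_lan' fact, not something confirmed by this task.
    """
    ipmi_lan = facts.get("ipmi_lan")
    # A missing or non-dict fact means the BMC could not be queried,
    # so no management-interface check could be generated.
    if not isinstance(ipmi_lan, dict):
        return False
    # An empty address is treated the same as a missing fact.
    return bool(ipmi_lan.get("ipaddress"))
```

With an unresponsive iDRAC the fact would be missing entirely, so a guard like this returns False and the .mgmt host silently drops out of monitoring, which matches what was observed here.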

added both T283582 (firmware upgrade) and T294276 (hardware upgrade) as parent tasks

cc: Release-Engineering-Team (Radar)

The iDRAC on this server needs a reset. Please coordinate a day and time that works best for taking this server offline.

Thanks.

hashar changed the task status from Open to Stalled. Jan 13 2022, 8:52 PM

I am marking this one stalled, since its purpose was to investigate why the host vanished from Icinga. Let's follow up on the iDRAC reset / upgrade on the parent task, which is about that topic: T283582

hashar assigned this task to jbond.

The DRAC on contint2001.wikimedia.org has been upgraded (T283582) and contint2001.mgmt is back in Icinga.

Thank you!