The upgrade_openstack_node cookbook doesn't silence everything that needs silencing
Open, Needs Triage, Public

Description

During the upgrade and reboot of cloudcontrol nodes today, quite a few alerts managed to fire despite the cookbook attempting to downtime things. Here are most of them:

  • Failure in check of cinder snapshots: unable to read output
  • NRPE failures:
    • cloudinfra project instance distribution
    • Disk space
    • Puppet last run
    • dhclient process
    • Check for snapshots leaked by cinder backup agent
  • Ensure NFS exports are maintained for new instances with NFS
  • CRITICAL - Expecting active but unit nfs-exportd is activating