[neutron] Add alert for neutron-rpc-server service being down on cloudcontrols (and neutron agents)
Closed, Resolved (Public)

Description

This service is critical for reporting the status of the neutron agents. If it is not running, the agents are marked as down, which prevents the creation of any VM that uses any kind of network.

This task is to:

  • Create the new alert
  • Create the runbook for it

Alerting on neutron agents themselves being down will also help catch some errors early.
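
As a rough illustration of the agent check described above, here is a minimal Python sketch. It is not the rule merged for this task; the `AgentStatus` shape, its field names, and the host names in the usage example are assumptions made for illustration only. It flags agents that are administratively enabled but not reporting as alive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentStatus:
    """One neutron agent (hypothetical, simplified shape; not the real API schema)."""
    host: str
    binary: str
    alive: bool
    admin_state_up: bool = True


def agents_down(agents):
    """Return the agents that are enabled but not reporting as alive."""
    return [a for a in agents if a.admin_state_up and not a.alive]


def alert_message(agents):
    """Build a one-line alert summary, or None when every enabled agent is alive."""
    down = sorted(agents_down(agents), key=lambda a: (a.host, a.binary))
    if not down:
        return None
    names = ", ".join(f"{a.binary}@{a.host}" for a in down)
    return f"{len(down)} neutron agent(s) down: {names}"


if __name__ == "__main__":
    # Hypothetical sample data; hosts and states are made up.
    sample = [
        AgentStatus("cloudnet1", "neutron-l3-agent", alive=True),
        AgentStatus("cloudvirt1", "neutron-linuxbridge-agent", alive=False),
    ]
    print(alert_message(sample))
```

Agents that were disabled on purpose are skipped so that planned maintenance does not trigger the alert. Whether that is the desired behaviour here is a design choice and not something stated in the task.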

Event Timeline

dcaro triaged this task as High priority. Feb 23 2022, 9:55 AM
dcaro created this task.
dcaro updated the task description.
aborrero renamed this task from "[neutron] Add alert for neutron-rpc-server service being down on coludcontrols" to "[neutron] Add alert for neutron-rpc-server service being down on cloudcontrols (and neutron agents)". Mar 8 2022, 4:08 PM
aborrero updated the task description.

Change 802442 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/alerts@master] wmcs: Add alert for Neutron agents being down

https://gerrit.wikimedia.org/r/802442

Change 802442 merged by jenkins-bot:

[operations/alerts@master] wmcs: Add alert for Neutron agents being down

https://gerrit.wikimedia.org/r/802442

dcaro moved this task from To refine to Done on the User-dcaro board.