I just now restarted nova-compute on virt1002... it seems to have just quietly stopped working :(
Description
Details
Project | Branch | Lines +/- | Subject |
---|---|---|---|
operations/puppet | production | +5 -0 | Icinga monitoring for nova-compute process. |
Event Timeline
Change 198249 had a related patch set uploaded (by Andrew Bogott):
Icinga monitoring for nova-compute process.
In addition to process monitoring, something should probably be running 'nova service list' on virt1000 and checking the status there. In theory that status is updated via queue messages, so it would verify that the services are actually responding rather than just locked up and occupying process space.
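A rough sketch of what such a check might look like, as a Nagios/Icinga-style plugin. Assumptions, none of which come from this task: the legacy client command is spelled `nova service-list`, it prints a prettytable-style ASCII table with `Binary`, `Host`, `Status` and `State` columns, and credentials are already present in the environment. The function names are made up for illustration.

```python
import subprocess

# Standard Nagios/Icinga plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def parse_table(text):
    """Parse a prettytable-style ASCII table into a list of dicts.

    Only lines starting with '|' carry data; '+---+' border lines are skipped.
    The first data line is treated as the header.
    """
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.splitlines()
        if line.strip().startswith("|")
    ]
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


def check(text):
    """Return (exit_code, message) for the given service-list output."""
    services = parse_table(text)
    if not services:
        return UNKNOWN, "UNKNOWN: no services parsed from output"
    # Assumed column names: a service is a problem if it is enabled
    # but its State is anything other than "up".
    down = [
        s for s in services
        if s.get("Status") == "enabled" and s.get("State") != "up"
    ]
    if down:
        names = ", ".join(f"{s.get('Binary')}@{s.get('Host')}" for s in down)
        return CRITICAL, f"CRITICAL: not reporting up: {names}"
    return OK, f"OK: {len(services)} services listed, none down"


def main():
    """Run the (assumed) CLI command and evaluate its output."""
    try:
        result = subprocess.run(
            ["nova", "service-list"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        print(f"CRITICAL: could not query nova: {exc}")
        return CRITICAL
    code, message = check(result.stdout)
    print(message)
    return code
```

To use it as a plugin, a script entry point would call `sys.exit(main())`. Keeping the parsing in `check()` separate from the subprocess call means the logic can be exercised against captured output without a live controller.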
Today, the nova-api process was running but API calls were timing out. So that's another thing to watch for.
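One way to catch the "process is up but calls hang" case is a check that issues a real request with a hard deadline and treats a timeout as a failure. The sketch below is only an illustration: `check_responds` is a hypothetical helper, and the command to pass in (for example some cheap read-only client call) is left to whoever wires it up.

```python
import subprocess

# Standard Nagios/Icinga plugin exit codes.
OK, CRITICAL = 0, 2


def check_responds(cmd, timeout_s=10.0):
    """Run cmd and report CRITICAL if it hangs, fails to start, or exits non-zero."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run before this is raised.
        return CRITICAL, f"CRITICAL: no response within {timeout_s}s"
    except OSError as exc:
        return CRITICAL, f"CRITICAL: could not run command: {exc}"
    if result.returncode != 0:
        return CRITICAL, f"CRITICAL: command exited with status {result.returncode}"
    return OK, "OK: command responded in time"
```

A process-existence check would have passed in the situation described above, whereas this style of check fails once the call exceeds the deadline.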
This is modestly different, but needs to be retitled. T42022 is about public HTTP APIs; this is about internal services, which can break even while the public APIs are functioning.