I just now restarted nova-compute on virt1002... it seems to have just quietly stopped working :(
- Mentioned In
- T132422: cronspam from labscontrol1001, labstore1001, labnet1002.eqiad.wmnet, labsdb1003.eqiad.wmnet
rOPUPebb072f33cd4: Add process monitoring for nova services.
rOPUP202303489379: Icinga monitoring for nova-compute process.
- Mentioned Here
- T42022: Add icinga checks for all nova, glance, and keystone related services
In addition to process monitoring, something should probably be running 'nova service list' on virt1000 and checking the status there. In theory that status is updated via queue messages, so the check would verify that the services are actually responding rather than just locked up and occupying process space.
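A rough sketch of what such a check could look like, as an Icinga/Nagios-style plugin in Python. Everything here is an assumption rather than what is actually deployed: I'm assuming the command is spelled `nova service-list` in the installed client, that it prints the usual ASCII table with Binary, Host, Status and State columns, and that credentials are already in the environment. The function names (`parse_table`, `check_services`, `main`) are made up for illustration.

```python
import subprocess

# Standard Nagios/Icinga plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def parse_table(text):
    """Parse an ASCII table (| a | b | rows, +---+ borders) into dicts.

    The first |-delimited row is treated as the header; column names
    are lowercased so lookups do not depend on the client's casing.
    """
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue  # skip borders, blank lines and stray output
        cells = [c.strip() for c in line[1:-1].split("|")]
        if header is None:
            header = [c.lower() for c in cells]
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def check_services(rows):
    """Return (exit_code, message) for the parsed service rows.

    A service counts as a problem if it is enabled but its state,
    as last reported to the controller, is anything other than 'up'.
    """
    if not rows:
        return UNKNOWN, "UNKNOWN: no services listed"
    down = [
        "%s@%s" % (r.get("binary", "?"), r.get("host", "?"))
        for r in rows
        if r.get("status") == "enabled" and r.get("state") != "up"
    ]
    if down:
        return CRITICAL, "CRITICAL: not responding: " + ", ".join(down)
    return OK, "OK: %d services listed, all enabled ones up" % len(rows)


def main():
    """Run the client, print a one-line status, return the exit code."""
    try:
        # Assumed command spelling; adjust to whatever the client accepts.
        result = subprocess.run(
            ["nova", "service-list"],
            capture_output=True, text=True, check=True, timeout=30,
        )
    except (OSError, subprocess.SubprocessError) as exc:
        print("UNKNOWN: could not run nova client: %s" % exc)
        return UNKNOWN
    code, message = check_services(parse_table(result.stdout))
    print(message)
    return code
```

To use it as a plugin, the script's entry point would call `sys.exit(main())`. Disabled services are deliberately ignored so that a host taken out of rotation on purpose does not page anyone; whether that is the right call here is a judgment for whoever owns the alerts.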