@dcaro has been chasing a weird missing-heartbeats issue with rabbitmq @ openstack.
Hints collected so far point to an incompatibility with the eventlet monkey patching.
Some references:
- logs in rabbitmq:
2023-09-21 09:21:37.933201+00:00 [info] <0.621.918> connection <0.621.918> (10.64.20.44:47856 -> 208.80.154.73:5671 - nova-compute:2065772:90d3aeb5-cd29-413c-8821-d254c0884316): user 'nova' authenticated and granted access to vhost '/'
- oslo.messaging documentation for the heartbeat_in_pthread option:
heartbeat_in_pthread
Type: boolean
Default: False
Run the health check heartbeat thread through a native python thread by default. If this option is equal to False then the health check heartbeat will inherit the execution model from the parent process. For example if the parent process has monkey patched the stdlib by using eventlet/greenlet then the heartbeat will be run through a green thread. This option should be set to True only for the wsgi services.
- logs in cloudcontrol1005:
2023-09-21 09:28:37.019735+00:00 [error] <0.2781.1416> closing AMQP connection <0.2781.1416> (10.64.151.3:57868 -> 208.80.155.102:5671 - uwsgi:3699517:e622f610-c3cc-4145-9a54-84ac5a46bad9):
2023-09-21 09:28:37.019735+00:00 [error] <0.2781.1416> missed heartbeats from client, timeout: 60s
- logs in cloudcontrol1005:
Sep 21 09:02:18 cloudcontrol1005 nova-scheduler[3699380]: Modules with known eventlet monkey patching issues were imported prior to eventlet monkey patching: urllib3. This warning can usually be ignored if the caller is only importing and not executing nova code.
- logs in some hypervisor:
Sep 21 09:33:39 cloudvirt-wdqs1001 nova-compute[2069944]: Modules with known eventlet monkey patching issues were imported prior to eventlet monkey patching: urllib3. This warning can usually be ignored if the caller is only importing and not executing nova code.
- similar upstream bug: https://bugzilla.redhat.com/show_bug.cgi?id=1711794 [OSP15][deployment] AMQP heartbeat thread missing heartbeats when running under nova_api
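
To see which clients are affected (for example uwsgi-based API services versus nova-compute), the "missed heartbeats" errors can be tallied per client program name. The following is only a rough sketch: the regexes are inferred from the two rabbitmq error lines quoted above, and the name count_missed_heartbeats is made up for illustration, so it may need adjusting for other log formats.

```python
import re
from collections import Counter

# Assumed format, inferred from the log lines quoted in this task:
# "... [error] <pid> closing AMQP connection <pid> (src -> dst - program:pid:uuid):"
CLOSING_RE = re.compile(
    r"\[error\] (?P<pid><[\d.]+>) closing AMQP connection <[\d.]+> "
    r"\(\S+ -> \S+ - (?P<program>[^:]+):\d+:[0-9a-f-]+\)"
)
# "... [error] <pid> missed heartbeats from client, timeout: 60s"
MISSED_RE = re.compile(
    r"\[error\] (?P<pid><[\d.]+>) missed heartbeats from client"
)


def count_missed_heartbeats(lines):
    """Count connections closed for missed heartbeats, per client program."""
    closing = {}  # Erlang pid -> program name of the connection being closed
    counts = Counter()
    for line in lines:
        match = CLOSING_RE.search(line)
        if match:
            closing[match.group("pid")] = match.group("program")
        # Checked separately (not elif) so that a line carrying both
        # messages fused together is still counted.
        match = MISSED_RE.search(line)
        if match:
            counts[closing.pop(match.group("pid"), "unknown")] += 1
    return counts


sample = [
    "2023-09-21 09:28:37.019735+00:00 [error] <0.2781.1416> closing AMQP connection <0.2781.1416> (10.64.151.3:57868 -> 208.80.155.102:5671 - uwsgi:3699517:e622f610-c3cc-4145-9a54-84ac5a46bad9):",
    "2023-09-21 09:28:37.019735+00:00 [error] <0.2781.1416> missed heartbeats from client, timeout: 60s",
]
print(count_missed_heartbeats(sample))
```

If the counts turn out to be dominated by uwsgi clients, that would be consistent with the heartbeat_in_pthread note above, which recommends a native thread only for the wsgi services.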