
OpenStack Wallaby on Debian 11 Bullseye: problems caused by eventlet and dnspython
Closed, Resolved (Public)

Description

After upgrading our OpenStack install at codfw1dev to OpenStack Wallaby on Debian 11 Bullseye, neutron-rpc-server refused to start, and every Neutron agent was reported as not alive (xxx):

root@cloudcontrol2001-dev:~# source novaenv.sh 
root@cloudcontrol2001-dev:~# neutron agent-list
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+--------------------+-------------------+-------------------+-------+----------------+---------------------------+
| id                                   | agent_type         | host              | availability_zone | alive | admin_state_up | binary                    |
+--------------------------------------+--------------------+-------------------+-------------------+-------+----------------+---------------------------+
| 228b6925-6b3e-464f-9d23-70e250b928f2 | Linux bridge agent | cloudnet2004-dev  |                   | xxx   | True           | neutron-linuxbridge-agent |
| 2f9bd1b1-e51f-47d4-b527-ccfd6b062f8b | DHCP agent         | cloudnet2004-dev  | nova              | xxx   | True           | neutron-dhcp-agent        |
| 46573e30-a4f0-4424-84c5-e18d7a1d0902 | Linux bridge agent | cloudvirt2003-dev |                   | xxx   | True           | neutron-linuxbridge-agent |
| 4a0e32d8-f231-4e50-9636-414b3e44cd53 | L3 agent           | cloudnet2002-dev  | nova              | xxx   | True           | neutron-l3-agent          |
| 5584e5f9-1e37-430c-b1cd-a3be0a1f1c5b | L3 agent           | cloudnet2004-dev  | nova              | xxx   | True           | neutron-l3-agent          |
| 6be877da-0221-4d44-813a-7e77868a2364 | Metadata agent     | cloudnet2002-dev  |                   | xxx   | True           | neutron-metadata-agent    |
| 73206678-6394-4d0e-9668-2c6cdf28b595 | Linux bridge agent | cloudvirt2002-dev |                   | xxx   | True           | neutron-linuxbridge-agent |
| 865072bb-941d-4d89-bb39-282df7fe7110 | DHCP agent         | cloudnet2002-dev  | nova              | xxx   | True           | neutron-dhcp-agent        |
| 98f75540-ec40-4b32-be19-33dd3c24c5b5 | Linux bridge agent | cloudvirt2001-dev |                   | xxx   | True           | neutron-linuxbridge-agent |
| cf504178-7bfe-4972-b2c6-0872cb829f2a | Metadata agent     | cloudnet2004-dev  |                   | xxx   | True           | neutron-metadata-agent    |
| e4828358-0291-4d00-a493-a866183689ee | Linux bridge agent | cloudnet2002-dev  |                   | xxx   | True           | neutron-linuxbridge-agent |
+--------------------------------------+--------------------+-------------------+-------------------+-------+----------------+---------------------------+

Logs:

root@cloudcontrol2001-dev:~# tail -1 /var/log/neutron/neutron-rpc-server.log
2022-03-31 11:33:35.335 2446864 WARNING oslo_db.sqlalchemy.engines [req-f619729c-d589-475e-957d-24025061c418 - - - - -] SQL connection failed. 10 attempts left.: oslo_db.exception.DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on 'openstack.codfw1dev.wikimediacloud.org' ([Errno -3] Lookup timed out)")
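The `[Errno -3]` in that log line matches `EAI_AGAIN` on Linux, i.e. a temporary name resolution failure rather than a MySQL problem. A minimal sketch for checking the same hostname outside of the neutron process follows. The `check_lookup` helper and its injectable `resolver` parameter are made up for illustration and are not part of neutron or eventlet. Running it with the plain stdlib resolver shows whether the system resolver itself works; if it does, the green (eventlet-patched) resolver becomes the prime suspect.

```python
import socket


def check_lookup(hostname, port=None, resolver=socket.getaddrinfo):
    """Return (ok, detail) for a single name lookup.

    `resolver` is injectable (hypothetical design choice) so the same check
    can be run against the stdlib resolver or a monkey-patched one.
    """
    try:
        results = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        # exc.errno carries the EAI_* code; EAI_AGAIN means "try again later".
        transient = exc.errno == socket.EAI_AGAIN
        return False, f"errno={exc.errno} transient={transient} msg={exc.strerror}"
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the resolved IP address.
    addresses = sorted({item[4][0] for item in results})
    return True, ", ".join(addresses)
```

Usage on an affected host would be something like `check_lookup("openstack.codfw1dev.wikimediacloud.org")` from a plain `python3` shell, without any eventlet monkey patching applied.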

@dcaro pointed to the combination of dnspython and eventlet as the likely culprit.

root@cloudcontrol2001-dev:~# apt-cache policy python3-eventlet python3-dnspython
python3-eventlet:
  Installed: 0.30.2-1
  Candidate: 0.30.2-1
  Version table:
 *** 0.30.2-1 1002
        500 http://mirrors.wikimedia.org/osbpo bullseye-wallaby-backports-nochange/main amd64 Packages
        100 /var/lib/dpkg/status
     0.26.1-8~wmf1 1001
       1001 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/main amd64 Packages
     0.26.1-7+deb11u1 500
        500 http://mirrors.wikimedia.org/debian bullseye/main amd64 Packages
python3-dnspython:
  Installed: 2.0.0-1
  Candidate: 2.0.0-1
  Version table:
 *** 2.0.0-1 500
        500 http://mirrors.wikimedia.org/debian bullseye/main amd64 Packages
        100 /var/lib/dpkg/status
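To make the policy output above easier to compare across hosts, here is a rough sketch that parses it into installed/candidate versions plus the pin priority of each entry in the version table. `parse_policy` and `highest_priority` are hypothetical helpers, and the parser only handles the layout shown in this task. `highest_priority` is also a simplification of what apt really does: it ignores the rules about downgrades and ties.

```python
import re

# Sample input: the python3-eventlet stanza pasted in this task.
SAMPLE = """\
python3-eventlet:
  Installed: 0.30.2-1
  Candidate: 0.30.2-1
  Version table:
 *** 0.30.2-1 1002
        500 http://mirrors.wikimedia.org/osbpo bullseye-wallaby-backports-nochange/main amd64 Packages
        100 /var/lib/dpkg/status
     0.26.1-8~wmf1 1001
       1001 http://apt.wikimedia.org/wikimedia bullseye-wikimedia/main amd64 Packages
     0.26.1-7+deb11u1 500
        500 http://mirrors.wikimedia.org/debian bullseye/main amd64 Packages
"""


def parse_policy(text):
    """Parse `apt-cache policy` output into a dict keyed by package name."""
    packages = {}
    current = None
    for line in text.splitlines():
        # A package stanza starts with an unindented "name:" line.
        header = re.match(r"^(\S+):$", line)
        if header:
            current = {"installed": None, "candidate": None, "versions": []}
            packages[header.group(1)] = current
            continue
        if current is None:
            continue
        stripped = line.strip()
        if stripped.startswith("Installed:"):
            current["installed"] = stripped.split(":", 1)[1].strip()
        elif stripped.startswith("Candidate:"):
            current["candidate"] = stripped.split(":", 1)[1].strip()
        else:
            # Version entries look like "*** 0.30.2-1 1002" or "0.26.1-8~wmf1 1001";
            # source lines (priority followed by a URL or path) do not match.
            entry = re.match(r"^(?:\*\*\*\s+)?(\S+)\s+(\d+)$", stripped)
            if entry:
                current["versions"].append((entry.group(1), int(entry.group(2))))
    return packages


def highest_priority(versions):
    """Return the version with the highest pin priority (simplified rule)."""
    return max(versions, key=lambda item: item[1])[0]
```

With the sample above, the entry with priority 1002 (0.30.2-1 from the osbpo repo) is the one selected, which agrees with the `Candidate:` line.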

Supported by online comments:

A potential solution is to try different versions of eventlet and dnspython.

Event Timeline

Tried an older dnspython on cloudcontrol2001-dev:

ii  python3-dnspython                    1.16.0-1+deb10u1                          all          DNS toolkit for Python 3

Still getting the following error on server creation, though:

Apr 04 19:00:01 cloudcontrol2001-dev neutron-api[776687]: 2022-04-04 19:00:01.326 776687 ERROR neutron.plugins.ml2.managers [req-07e90c1c-f90d-4cab-8899-7456f03b8cad novaadmin admin - default default] Failed to bind port e78015af-490a-4f12-a1b4-82f7678e4444 on host cloudvirt2001-dev for vnic_type normal using segments [{'id': 'db9be46f-154e-4979-9198-f6399f9b9bdc', 'network_type': 'flat', 'physical_network': 'cloudinstances2b', 'segmentation_id': None, 'network_id': '05a5494a-184f-4d5c-9e98-77ae61c56daa'}]

Mentioned in SAL (#wikimedia-cloud) [2022-04-06T08:20:33Z] <arturo> [codfw1dev] trying with python3-dnspython 2.2.0-2 installed by hand on cloudcontrol servers (T305157)

Mentioned in SAL (#wikimedia-cloud) [2022-04-06T08:24:13Z] <arturo> [codfw1dev] trying with python3-dnspython 2.2.0-2 installed by hand on cloudvirt2003-dev (T305157)

According to the changelog at https://tracker.debian.org/media/packages/p/python-eventlet/changelog-0.30.2-5, the DNS fixes mentioned above are included in python3-eventlet >= 0.30.2-3.
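Whether a given package version satisfies ">= 0.30.2-3" is not obvious for strings such as 0.30.2-5~bpo11+1, because Debian sorts `~` before everything else. On a Debian host `dpkg --compare-versions` is the authoritative check. The sketch below is only an approximation of the same ordering rules in pure Python, assuming ASCII version strings. `compare_versions` and `has_dns_fix` are hypothetical names, and the 0.30.2-3 threshold is taken from the changelog reference above.

```python
def _is_digit(ch):
    return "0" <= ch <= "9"


def _order(ch):
    # '~' sorts before everything (even end of string, which counts as 0);
    # letters sort before all other non-digit characters.
    if ch == "~":
        return -1
    if ch.isalpha():
        return ord(ch)
    return ord(ch) + 256


def _compare_fragment(a, b):
    """Compare two upstream-version or revision strings; return -1, 0 or 1."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Non-digit run: compare character by character.
        while (i < len(a) and not _is_digit(a[i])) or (j < len(b) and not _is_digit(b[j])):
            ca = _order(a[i]) if i < len(a) and not _is_digit(a[i]) else 0
            cb = _order(b[j]) if j < len(b) and not _is_digit(b[j]) else 0
            if ca != cb:
                return -1 if ca < cb else 1
            i += 1
            j += 1
        # Digit run: compare numerically, an empty run counts as 0.
        start_i, start_j = i, j
        while i < len(a) and _is_digit(a[i]):
            i += 1
        while j < len(b) and _is_digit(b[j]):
            j += 1
        na, nb = int(a[start_i:i] or 0), int(b[start_j:j] or 0)
        if na != nb:
            return -1 if na < nb else 1
    return 0


def _split(version):
    """Split '[epoch:]upstream[-revision]' into (epoch, upstream, revision)."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a, b):
    """Return -1, 0 or 1 if version a is lower than, equal to, or higher than b."""
    epoch_a, upstream_a, revision_a = _split(a)
    epoch_b, upstream_b, revision_b = _split(b)
    if epoch_a != epoch_b:
        return -1 if epoch_a < epoch_b else 1
    return _compare_fragment(upstream_a, upstream_b) or _compare_fragment(revision_a, revision_b)


def has_dns_fix(version, minimum="0.30.2-3"):
    """True if `version` is at least the first revision listed as carrying the fix."""
    return compare_versions(version, minimum) >= 0
```

Under these rules the originally installed 0.30.2-1 is below the threshold, while 0.30.2-5 and its backport 0.30.2-5~bpo11+1 are above it. The backport still sorts lower than the plain 0.30.2-5 because of the `~`.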

Mentioned in SAL (#wikimedia-cloud) [2022-04-06T08:42:23Z] <arturo> [codfw1dev] trying with python3-eventlet 0.30.2-5 installed by hand on cloudcontrol servers (T305157)

Mentioned in SAL (#wikimedia-cloud) [2022-04-06T08:45:18Z] <arturo> [codfw1dev] trying with python3-eventlet 0.30.2-5 installed by hand on cloudvirt2003-dev (T305157)

Mentioned in SAL (#wikimedia-operations) [2022-04-06T09:04:38Z] <arturo> force-started update-openstack-mirror.service on mirror1001 for python3-eventlet (T305157)

Asked fellow Debian Developers to put a newer version of python3-eventlet in the bullseye-wallaby repo.

python3-eventlet version 0.30.2-5~bpo11+1 is now available in the bullseye-wallaby repo; upgrading codfw1dev with it.

Mentioned in SAL (#wikimedia-cloud) [2022-04-06T09:12:16Z] <arturo> [codf1dev] installing python3-eventlet 0.30.2-5~bpo11+1 on all required servers (cloudvirt, cloudnet, cloudcontrol) (T305157)

aborrero claimed this task.

All agents are back online:

+--------------------------------------+--------------------+-------------------+-------------------+-------+----------------+---------------------------+
| id                                   | agent_type         | host              | availability_zone | alive | admin_state_up | binary                    |
+--------------------------------------+--------------------+-------------------+-------------------+-------+----------------+---------------------------+
| 228b6925-6b3e-464f-9d23-70e250b928f2 | Linux bridge agent | cloudnet2004-dev  |                   | :-)   | True           | neutron-linuxbridge-agent |
| 2f9bd1b1-e51f-47d4-b527-ccfd6b062f8b | DHCP agent         | cloudnet2004-dev  | nova              | :-)   | True           | neutron-dhcp-agent        |
| 46573e30-a4f0-4424-84c5-e18d7a1d0902 | Linux bridge agent | cloudvirt2003-dev |                   | :-)   | True           | neutron-linuxbridge-agent |
| 4a0e32d8-f231-4e50-9636-414b3e44cd53 | L3 agent           | cloudnet2002-dev  | nova              | :-)   | True           | neutron-l3-agent          |
| 5584e5f9-1e37-430c-b1cd-a3be0a1f1c5b | L3 agent           | cloudnet2004-dev  | nova              | :-)   | True           | neutron-l3-agent          |
| 6be877da-0221-4d44-813a-7e77868a2364 | Metadata agent     | cloudnet2002-dev  |                   | :-)   | True           | neutron-metadata-agent    |
| 73206678-6394-4d0e-9668-2c6cdf28b595 | Linux bridge agent | cloudvirt2002-dev |                   | :-)   | True           | neutron-linuxbridge-agent |
| 865072bb-941d-4d89-bb39-282df7fe7110 | DHCP agent         | cloudnet2002-dev  | nova              | :-)   | True           | neutron-dhcp-agent        |
| 98f75540-ec40-4b32-be19-33dd3c24c5b5 | Linux bridge agent | cloudvirt2001-dev |                   | :-)   | True           | neutron-linuxbridge-agent |
| cf504178-7bfe-4972-b2c6-0872cb829f2a | Metadata agent     | cloudnet2004-dev  |                   | :-)   | True           | neutron-metadata-agent    |
| e4828358-0291-4d00-a493-a866183689ee | Linux bridge agent | cloudnet2002-dev  |                   | :-)   | True           | neutron-linuxbridge-agent |
+--------------------------------------+--------------------+-------------------+-------------------+-------+----------------+---------------------------+

python3-eventlet version 0.30.2-5~bpo11+1 is now available in the mirror and has the higher priority in the apt cache, so no further action is needed when reimaging or upgrading servers to Bullseye with Wallaby.