neutron: clarify why DNS extension is not enabled
Closed, Resolved (Public)

Description

Per the upstream docs https://docs.openstack.org/designate/latest/user/neutron-integration.html

One of these extensions must be enabled to allow Neutron and, via Neutron, Nova to automatically create DNS recordsets in Designate:
    dns-integration
    dns-domain-ports (includes dns-integration)
    subnet-dns-publish-fixed-ip (includes dns-integration and dns-domain-ports)
    dns-integration-domain-keywords (includes all others)

However:

aborrero@cloudcontrol1005:~$ sudo wmcs-openstack extension list --network -f value -c Alias | grep -i dns
[ .. nothing .. ]

I wonder how we do this.

Event Timeline

Change #1082736 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] openstack: neutron: ml2: template extension_drivers config option

https://gerrit.wikimedia.org/r/1082736

aborrero changed the task status from Open to In Progress. Oct 24 2024, 11:00 AM
aborrero raised the priority of this task from Low to Medium.
aborrero moved this task from Backlog to Doing on the User-aborrero board.

In codfw1dev, I will do the following to test the upstream extension:

  • in cloudcontrol nodes, disable puppet
  • in cloudcontrol nodes, disable the custom designate WMF sink configuration options
  • in cloudnet nodes, disable puppet
  • in cloudnet nodes, enable the dns-integration-domain-keywords extension

in cloudcontrol nodes:

/etc/designate/designate.conf:

#[service:sink]
# List of notification handlers to enable, configuration of these needs to
# correspond to a [handler:my_driver] section below or else in the config
# Can be one or more of : nova_fixed, neutron_floatingip
#enabled_notification_handlers = nova_fixed_multi, wmf_sink

/etc/neutron/plugins/ml2/ml2_conf.ini:

[ml2]
# ... more stuff
extension_drivers = port_security, dns_domain_keywords

also, in cloudcontrol nodes:

/etc/neutron/neutron.conf:

[DEFAULT]
external_dns_driver = designate

# ... more stuff

[designate]
url = https://openstack.codfw1dev.wikimediacloud.org:29001
auth_type = password
auth_url = https://openstack.codfw1dev.wikimediacloud.org:25357
username = novaadmin
password = <<redacted>>
project_name = cloudinfra-codf1wdev
project_domain_name = default
user_domain_name = default
allow_reverse_dns_lookup = True
ipv4_ptr_zone_prefix_size = 24
ipv6_ptr_zone_prefix_size = 64 
ptr_zone_email = root@wmcloud.org

after the file changes above, I see this:

aborrero@cloudcontrol2006-dev:~$ sudo wmcs-openstack extension list --network -f value -c Alias | grep -i dns
dns-integration
dns-domain-ports
dns-integration-domain-keywords
subnet-dns-publish-fixed-ip

Which indicates the extensions were correctly loaded.
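
As a rough illustration of that check, here is a minimal Python sketch that compares the aliases printed by the command above with the inclusion rules quoted from the upstream docs in the description. The function names are hypothetical and the sketch only parses text; it does not talk to neutron.

```python
# Inclusion relations as quoted from the upstream Designate docs in the
# task description: each alias maps to the aliases it includes.
DNS_EXTENSION_INCLUDES = {
    "dns-integration": set(),
    "dns-domain-ports": {"dns-integration"},
    "subnet-dns-publish-fixed-ip": {"dns-integration", "dns-domain-ports"},
    "dns-integration-domain-keywords": {
        "dns-integration",
        "dns-domain-ports",
        "subnet-dns-publish-fixed-ip",
    },
}


def expected_aliases(enabled):
    """Return every alias that should be listed once `enabled` is loaded."""
    return {enabled} | DNS_EXTENSION_INCLUDES[enabled]


def missing_aliases(enabled, cli_output):
    """Return expected aliases absent from the `extension list` output."""
    listed = {line.strip() for line in cli_output.splitlines() if line.strip()}
    return expected_aliases(enabled) - listed
```

Fed the four lines printed above, `missing_aliases("dns-integration-domain-keywords", output)` returns an empty set; fed the empty output from the task description, it returns all four aliases.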

On the first test, it seems neutron expects a literal zone <projectname>.eqiad1.wikimedia.cloud to exist before it can create the A/AAAA records.

This seems reasonable to me.

So far, we have had designate sink create all the records in the parent eqiad1.wikimedia.cloud zone.

I will experiment with this.
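
To make the expectation concrete, here is a small Python sketch of how a per-project zone name could be derived and checked. It is not neutron's code: the `<project_name>`-style placeholders are the keywords I understand the dns-integration-domain-keywords extension to accept in a network's dns_domain, the template value is an assumption (the task does not show it), and both function names are hypothetical.

```python
# Placeholder names assumed to be supported by the keywords extension.
KEYWORDS = ("project_id", "project_name", "user_id", "user_name")


def expand_dns_domain(template, **values):
    """Replace <keyword> placeholders in a dns_domain template."""
    domain = template
    for keyword in KEYWORDS:
        if keyword in values:
            domain = domain.replace(f"<{keyword}>", values[keyword])
    return domain


def zone_for_port(template, existing_zones, **values):
    """Return the expanded zone name, or fail if that zone does not exist."""
    domain = expand_dns_domain(template, **values)
    if domain not in existing_zones:
        # Mirrors the observed behaviour: the zone has to exist beforehand.
        raise LookupError(f"zone {domain} must be created first")
    return domain
```

With the assumed template `<project_name>.codfw1dev.wikimedia.cloud.` and project `cloudinfra-codfw1dev`, the expanded name is `cloudinfra-codfw1dev.codfw1dev.wikimedia.cloud.`. If only the parent zone exists, as with the sink-based setup, the lookup fails.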

aborrero opened https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/102

codfw1dev: add dedicated VM zones for admin-monitoring and cloudinfra-codfw1dev

aborrero merged https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/102

codfw1dev: add dedicated VM zones for admin-monitoring and cloudinfra-codfw1dev

got a traceback:

2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration [req-f7516c6a-f20c-48a4-b570-105f7969a0a5 req-35dc96ef-6d04-4faa-a9c0-5c29d18cace7 aborrero cloudinfra-codfw1dev - - default default] Error publishing port data in external DNS service. Name: 'neutron-dns-ext-test'. Domain: 'cloudinfra-codfw1dev.codfw1dev.wikimedia.cloud.'. DNS service driver message 'Domain cloudinfra-codfw1dev.codfw1dev.wikimedia.cloud. not found in the external DNS service': neutron_lib.exceptions.dns.DNSDomainNotFound: Domain cloudinfra-codfw1dev.codfw1dev.wikimedia.cloud. not found in the external DNS service
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration Traceback (most recent call last):
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration   File "/usr/lib/python3/dist-packages/neutron/services/externaldns/drivers/designate/driver.py", line 66, in create_record_set
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration     designate.recordsets.create(dns_domain, dns_name, 'A', v4)
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration   File "/usr/lib/python3/dist-packages/designateclient/v2/recordsets.py", line 45, in create
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration     name, zone_info = self._canonicalize_record_name(zone, name)
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration   File "/usr/lib/python3/dist-packages/designateclient/v2/recordsets.py", line 29, in _canonicalize_record_name
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration     zone_info = self.client.zones.get(zone)
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration   File "/usr/lib/python3/dist-packages/designateclient/v2/zones.py", line 54, in get
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration     zone = v2_utils.resolve_by_name(self.list, zone)
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration   File "/usr/lib/python3/dist-packages/designateclient/v2/utils.py", line 32, in resolve_by_name
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration     results = func(criterion={"name": f"{name}"}, *args)
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration   File "/usr/lib/python3/dist-packages/designateclient/v2/zones.py", line 51, in list
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration     return self._get(url, response_key='zones')
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration   File "/usr/lib/python3/dist-packages/designateclient/v2/base.py", line 30, in _get
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration     resp, body = self.client.session.get(url, **kwargs)
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration   File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 393, in get
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration     return self.request(url, 'GET', **kwargs)
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration   File "/usr/lib/python3/dist-packages/designateclient/v2/client.py", line 109, in request
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration     raise exceptions.NotFound(**response_payload)
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration designateclient.exceptions.NotFound: NotFound
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration 
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration During handling of the above exception, another exception occurred:
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration 
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration Traceback (most recent call last):
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration   File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/extensions/dns_integration.py", line 489, in _send_data_to_external_dns_service
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration     dns_driver.create_record_set(context, dns_domain, dns_name, records)
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration   File "/usr/lib/python3/dist-packages/neutron/services/externaldns/drivers/designate/driver.py", line 70, in create_record_set
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration     raise dns_exc.DNSDomainNotFound(dns_domain=dns_domain)
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration neutron_lib.exceptions.dns.DNSDomainNotFound: Domain cloudinfra-codfw1dev.codfw1dev.wikimedia.cloud. not found in the external DNS service
2024-10-24 15:34:54.073 1987217 ERROR neutron.plugins.ml2.extensions.dns_integration

However, the zone exists:

aborrero@cloudcontrol2006-dev:~ $ sudo wmcs-openstack zone show --all-projects 6d1f66ec-e5ac-4189-8862-e71f57924b06
+----------------+--------------------------------------------------------------------------------+
| Field          | Value                                                                          |
+----------------+--------------------------------------------------------------------------------+
| action         | NONE                                                                           |
| attributes     |                                                                                |
| created_at     | 2024-10-24T15:24:06.000000                                                     |
| description    | DNS domain for VMs in the cloudinfra-codfw1dev project - managed by tofu-infra |
| email          | root@wmcloud.org                                                               |
| id             | 6d1f66ec-e5ac-4189-8862-e71f57924b06                                           |
| masters        |                                                                                |
| name           | cloudinfra-codfw1dev.codfw1dev.wikimedia.cloud.                                |
| pool_id        | 794ccc2c-d751-44fe-b57f-8894c9f5c842                                           |
| project_id     | cloudinfra-codfw1dev                                                           |
| serial         | 1729783446                                                                     |
| shared         | False                                                                          |
| status         | ACTIVE                                                                         |
| transferred_at | None                                                                           |
| ttl            | 3600                                                                           |
| type           | PRIMARY                                                                        |
| updated_at     | 2024-10-24T15:24:12.000000                                                     |
| version        | 2                                                                              |
+----------------+--------------------------------------------------------------------------------+
aborrero@cloudcontrol2006-dev:~ $ sudo wmcs-openstack recordset list --all-projects 6d1f66ec-e5ac-4189-8862-e71f57924b06
+--------------------------------------+----------------------+-------------------------------------------------+------+----------------------------------------------------------------------------------------------+--------+--------+
| id                                   | project_id           | name                                            | type | records                                                                                      | status | action |
+--------------------------------------+----------------------+-------------------------------------------------+------+----------------------------------------------------------------------------------------------+--------+--------+
| 42f3201a-77ba-4db3-83a6-ef6d4c8e0d16 | cloudinfra-codfw1dev | cloudinfra-codfw1dev.codfw1dev.wikimedia.cloud. | SOA  | ns0.openstack.codfw1dev.wikimediacloud.org. root.wmcloud.org. 1729783446 3500 600 86400 3600 | ACTIVE | NONE   |
| c8f57f55-b9ed-4bf1-85b5-e47a4c0b9d13 | cloudinfra-codfw1dev | cloudinfra-codfw1dev.codfw1dev.wikimedia.cloud. | NS   | ns0.openstack.codfw1dev.wikimediacloud.org.                                                  | ACTIVE | NONE   |
|                                      |                      |                                                 |      | ns1.openstack.codfw1dev.wikimediacloud.org.                                                  |        |        |
+--------------------------------------+----------------------+-------------------------------------------------+------+----------------------------------------------------------------------------------------------+--------+--------+

I believe I found the reason for the traceback.

The designate auth used by neutron is not capable of operating on DNS zones outside of the scope of the current session. This is a common problem with the designate auth.

see code:

They use get_all_projects_client() in the cleanup routine, to try to prevent leaks caused by lack of auth.

A potential solution is to force all_projects=True also in the creation routine.
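
The scoping theory can be pictured with a toy model in Python. This is only an illustration of the hypothesis, not of designateclient or the Designate API: `Zone`, `FakeDesignate` and the project names are hypothetical, and the `all_projects` flag merely imitates the effect the real option is expected to have.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str
    project_id: str


class FakeDesignate:
    """Toy stand-in for a zone lookup that is scoped to one project."""

    def __init__(self, zones):
        self._zones = list(zones)

    def find_zone(self, name, *, session_project, all_projects=False):
        for zone in self._zones:
            # Without all_projects, zones of other projects are invisible.
            visible = all_projects or zone.project_id == session_project
            if visible and zone.name == name:
                return zone
        raise LookupError(f"zone {name} not found")


service = FakeDesignate([Zone("vms.example.", "project-a")])
```

In this model, `service.find_zone("vms.example.", session_project="project-b")` raises `LookupError` although the zone exists, while the same call with `all_projects=True` returns it. That is the kind of "not found" the theory predicts for a session scoped to a different project.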

as a way to test this theory, I will be using this patch in cloudcontrol2004-dev:

@@ -43,10 +43,10 @@
             CONF, 'designate')
 
     auth = token_endpoint.Token(CONF.designate.url, context.auth_token)
-    client = d_client.Client(session=_SESSION, auth=auth)
+    client = d_client.Client(session=_SESSION, auth=auth, all_projects=True)
     admin_auth = loading.load_auth_from_conf_options(CONF, 'designate')
     admin_client = d_client.Client(session=_SESSION, auth=admin_auth,
-                                   endpoint_override=CONF.designate.url)
+                                   endpoint_override=CONF.designate.url, all_projects=True)
     return client, admin_client

I could not make it work with that diff. My next theory is that neutron is not even reaching out to the designate API for some reason. I cannot find any trace of the request from neutron to designate.

I am running out of ideas for debugging this further.

I will undo all the changes for now.

aborrero changed the task status from In Progress to Stalled. Oct 28 2024, 4:57 PM
aborrero moved this task from Doing to Blocked/waiting on the User-aborrero board.

aborrero opened https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/108

Revert "codfw1dev: use neutron extension to publish DNS records for ports"

aborrero merged https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/108

Revert "codfw1dev: use neutron extension to publish DNS records for ports"

aborrero opened https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/111

codfw1dev: drop dedicated DNS zone cloudinfra-codfw1dev.codfw1dev.wikimedia.cloud

aborrero merged https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/111

codfw1dev: drop dedicated DNS zone cloudinfra-codfw1dev.codfw1dev.wikimedia.cloud

I'm glad you looked at this. In theory the neutron/dns extension is superior to using designate sink, since it makes updates synchronously, and VM creation/deletion would fail if the dns entry fails.

Every time I look at moving from sink to neutron/dns, though, I get bogged down in our use case being different from (and broader than) the single case that neutron supports. Sink is explicitly pluggable, whereas neutron dns is not, so we'd need to do some live patching of neutron, which I don't love.

Maybe the correct solution is to /make/ neutron pluggable?