
[openstack] cloudservices + Designate are using different source addresses for local vs. remote updates
Closed, Resolved, Public

Description

(follow-up from T346385)

The problem

On cloudservices hosts, /etc/designate/pools.yaml contains several configuration values for OpenStack Designate. These values dictate what Designate writes to the pdns MariaDB database, which runs on the same cloudservices hosts.

Since the new "private" vlan was added, cloudservices hosts use different source addresses for local vs. remote updates, and pools.yaml cannot describe that correctly.

This means that when a cloudservices host is added/reimaged, the database must be edited manually to set the right values (details documented here).

More details

In eqiad1, we have the following setup:

Host               Allowed Masters
cloudservices1005  185.15.56.162 (itself), 172.20.1.5 (cloudservices1006)
cloudservices1006  185.15.56.163 (itself), 172.20.2.4 (cloudservices1005)

The destination addresses for updates will be 185.15.56.162 (ns0 / 1005) and 185.15.56.163 (ns1 / 1006). As those addresses are in the 185.15.56.0/24 network, the hosts will use their cloud-private interface to get there, hence the 172.20.x addressing rather than 10.x.

Ideally we could just have 185.15.56.162 and 185.15.56.163 on both, covering the local and remote system in either case. But instead we need a different pair of IPs on each, as the systems are using different source addresses for local vs. remote updates. We could include all 4 IPs on both, but that doesn't seem to work because pdns then expects to see updates coming from both IPs and complains that "I'm not getting updates from xxx.xxx.xxx.xxxx".
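As an illustration only, the table above can be reproduced with a small Python sketch. The addresses come from the table; the data model, the function names, and the rule "a local update arrives from the host's own service address, a remote update from the sender's cloud-private address" are assumptions made for this example, not something read from pools.yaml or the pdns database.

```python
# Addresses are taken from the "Allowed Masters" table in this task.
# The dict layout and key names are hypothetical, chosen for this sketch.
HOSTS = {
    "cloudservices1005": {"service": "185.15.56.162", "private": "172.20.2.4"},
    "cloudservices1006": {"service": "185.15.56.163", "private": "172.20.1.5"},
}


def update_source(sender: str, receiver: str) -> str:
    """Source address that pdns on `receiver` sees for an update from `sender`."""
    if sender == receiver:
        # Local update: listed as "(itself)" in the table.
        return HOSTS[sender]["service"]
    # Remote update: sent over the cloud-private interface (172.20.x).
    return HOSTS[sender]["private"]


def allowed_masters(receiver: str) -> list[str]:
    """Masters list that would have to be configured for `receiver`."""
    return [update_source(sender, receiver) for sender in sorted(HOSTS)]


for host in sorted(HOSTS):
    print(host, allowed_masters(host))
```

The two hosts end up with different lists, which is the mismatch a single shared masters list cannot express.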

Event Timeline

Andrew renamed this task from [openstack] cloudservices are using different source addresses for local vs. remote updates to [openstack] cloudservices + Designate are using different source addresses for local vs. remote updates. Feb 2 2024, 11:08 PM

This is silly, but I think the solution to this is moving Designate services onto cloudcontrol nodes. If we keep pdns on cloudservices nodes then all the traffic governed by pools.yaml will be between hosts rather than local to one host, and we can use the private 172.x addresses everywhere.

I don't think there's a compelling reason for Designate to be the one OpenStack service that lives elsewhere anyway, so this change might make our setup slightly more comprehensible. It will definitely simplify arturo's concern about galera grants in T340446.

Change 995369 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] OpenStack Designate: move from cloudservices to cloudcontrols in codfw1dev

https://gerrit.wikimedia.org/r/995369

Change 995369 merged by Andrew Bogott:

[operations/puppet@production] OpenStack Designate: move from cloudservices to cloudcontrols in codfw1dev

https://gerrit.wikimedia.org/r/995369

Change 997539 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Designate: replace mcrouter as part of designate services

https://gerrit.wikimedia.org/r/997539

Change 997539 merged by Andrew Bogott:

[operations/puppet@production] Designate: replace mcrouter as part of designate services

https://gerrit.wikimedia.org/r/997539

Change 997553 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Further attempt to fix memcached port for designate

https://gerrit.wikimedia.org/r/997553

Change 997554 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Remove memcached cruft from codfw1dev cloudservice nodes

https://gerrit.wikimedia.org/r/997554

Change 997553 merged by Andrew Bogott:

[operations/puppet@production] Further attempt to fix memcached port for designate

https://gerrit.wikimedia.org/r/997553

Change 997561 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Yet further attempt to fix memcached port for designate

https://gerrit.wikimedia.org/r/997561

Change 997561 merged by Andrew Bogott:

[operations/puppet@production] Yet further attempt to fix memcached port for designate

https://gerrit.wikimedia.org/r/997561

Change 997576 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] designate pools.yaml: better distinguish between designate and pdns hosts

https://gerrit.wikimedia.org/r/997576

Change 997576 merged by Andrew Bogott:

[operations/puppet@production] designate pools.yaml: better distinguish between designate and pdns hosts

https://gerrit.wikimedia.org/r/997576

Change 997597 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Allow pdns to query designate-mdns on private interfaces

https://gerrit.wikimedia.org/r/997597

Change 997597 merged by Andrew Bogott:

[operations/puppet@production] Allow pdns to query designate-mdns on private interfaces

https://gerrit.wikimedia.org/r/997597

Change 997554 merged by Andrew Bogott:

[operations/puppet@production] Remove memcached cruft from codfw1dev cloudservice nodes

https://gerrit.wikimedia.org/r/997554

Change 997965 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] OpenStack Designate: move from cloudservices to cloudcontrols in eqiad

https://gerrit.wikimedia.org/r/997965

Change 997965 merged by Andrew Bogott:

[operations/puppet@production] OpenStack Designate: move from cloudservices to cloudcontrols in eqiad

https://gerrit.wikimedia.org/r/997965

Andrew claimed this task.

I've moved Designate to cloudcontrol nodes, with pdns services still running on cloudservices nodes. This means we now have consistent addressing everywhere.
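A rough sketch of why the move gives consistent addressing, under stated assumptions: the cloudservices addresses are the ones from the table in the description, while the cloudcontrol names and address strings below are placeholders (the real hosts and their 172.x addresses are not listed in this task). The rule that a co-located sender is seen by its service address and any other sender by its private address is likewise an assumption for illustration.

```python
def masters_seen_by(pdns_host, designate_hosts):
    """Source addresses pdns on `pdns_host` would see, one per Designate host."""
    masters = []
    for name in sorted(designate_hosts):
        # Co-located sender: local update. Any other sender: private network.
        key = "service" if name == pdns_host else "private"
        masters.append(designate_hosts[name][key])
    return masters


def is_consistent(pdns_hosts, designate_hosts):
    """True if every pdns host can share one identical masters list."""
    lists = [masters_seen_by(host, designate_hosts) for host in pdns_hosts]
    return all(entry == lists[0] for entry in lists)


PDNS_HOSTS = ["cloudservices1005", "cloudservices1006"]

# Old layout: Designate co-located with pdns (addresses from the description).
BEFORE = {
    "cloudservices1005": {"service": "185.15.56.162", "private": "172.20.2.4"},
    "cloudservices1006": {"service": "185.15.56.163", "private": "172.20.1.5"},
}

# New layout: Designate on cloudcontrol nodes. Names and values are placeholders.
AFTER = {
    "cloudcontrol-a": {"private": "<cloudcontrol-a private address>"},
    "cloudcontrol-b": {"private": "<cloudcontrol-b private address>"},
}

print(is_consistent(PDNS_HOSTS, BEFORE))  # False
print(is_consistent(PDNS_HOSTS, AFTER))   # True
```

With no Designate service on a pdns host, the local case never occurs, so one masters list fits every pdns host.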

Change 1007292 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:openstack: rabbitmq: remove designate_hosts entirely

https://gerrit.wikimedia.org/r/1007292

Change 1007292 merged by Majavah:

[operations/puppet@production] P:openstack: rabbitmq: remove designate_hosts entirely

https://gerrit.wikimedia.org/r/1007292