
Cloud DNS: fix inconsistent ownership of reverse domains for openstack floating ip networks
Closed, Resolved (Public)

Description

We have 1 floating IP network (CIDR) in each openstack deployment:

  • 185.15.56.0/25 -- eqiad1. Zone 56.15.185.in-addr.arpa. owned by designate @ eqiad1 (wmflabsdotorg project)
  • 185.15.57.0/29 -- codfw1dev. Zone 57.15.185.in-addr.arpa owned by prod DNS servers.

In markmonitor:

arturo@endurance:~ $ whois 57.15.185.in-addr.arpa | grep "nserver"
nserver:        ns0.wikimedia.org
nserver:        ns1.wikimedia.org
nserver:        ns2.wikimedia.org
arturo@endurance:~ $ whois 56.15.185.in-addr.arpa | grep "nserver"
nserver:        cloud-ns0.wikimedia.org
nserver:        cloud-ns1.wikimedia.org

This is inconsistent. Probably the right thing to do is to delegate the /29 in codfw to designate @ codfw1dev and update markmonitor.
The nserver entries also need to be updated to ns[0-1].openstack.<deployment>.wikimediacloud.org, similar to what is happening in T247971: Cloud DNS: update markmonitor entries.

Project ownership should probably be set to cloudinfra (instead of wmflabsdotorg) to be consistent with https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/DNS
This may or may not require changes in other scripts or mechanisms (do the proxies generate PTR records?)

Event Timeline

aborrero triaged this task as Medium priority. Mar 18 2020, 1:35 PM
aborrero updated the task description.
aborrero moved this task from Inbox to Soon! on the cloud-services-team (Kanban) board.

@Andrew ping me if you prefer me to handle this.

gotcha to be aware of:
the /25 you list for eqiad1 is accurate in that it looks like that's what neutron will allocate from (based on modules/openstack/templates/bootstrap/neutron/neutron_seed.sh.erb), but actually the whole /24 is reserved for the purpose. so this existing thing is okay.
The 57.15.185.in-addr.arpa zone is for the whole of 185.15.57.0/24 - note the /24, not /29. I'm not sure what's been reserved, this seems to be missing from modules/network/data/data.yaml? (and possibly other places?)
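
To illustrate the /24 vs /29 point, here is a minimal Python sketch. The helper name `enclosing_reverse_zone` is made up for this example, and it assumes IPv4 only. It shows that both floating IP ranges fall inside a reverse zone that covers a whole /24, because in-addr.arpa zone names are built from whole octets.

```python
import ipaddress


def enclosing_reverse_zone(cidr: str) -> str:
    """Return the octet-aligned in-addr.arpa zone that contains an IPv4 CIDR.

    Hypothetical helper for illustration; not part of any Wikimedia tooling.
    """
    net = ipaddress.ip_network(cidr)
    if net.version != 4:
        raise ValueError("this sketch only handles IPv4")
    # Only whole octets of the prefix can appear as labels in the zone name.
    whole_octets = net.prefixlen // 8
    octets = str(net.network_address).split(".")[:whole_octets]
    return ".".join(reversed(octets)) + ".in-addr.arpa."


for cidr in ("185.15.56.0/25", "185.15.57.0/29"):
    net = ipaddress.ip_network(cidr)
    print(cidr, "addresses:", net.num_addresses, "zone:", enclosing_reverse_zone(cidr))
```

The /29 only uses 8 of the 256 addresses that the 57.15.185.in-addr.arpa zone can describe, which is why handing the whole zone to one party is a coarser step than delegating just the allocated range.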

I'm not sure if we have anything updating that zone other than wmcs-dns-floating-ip-updater.py by the way

The whole /24 is marked in Netbox as cloud-codfw: https://netbox.wikimedia.org/ipam/prefixes/5/

Please @ayounsi confirm this range is allocated entirely for us.

It is reserved for you, and allocated as your needs grow (so far 185.15.57.0/29), as we're trying to be extra careful with v4 IPs.

I updated Netbox so it matches its eqiad equivalent: https://netbox.wikimedia.org/ipam/prefixes/1/prefixes/

OK thanks for confirming! I will:

  • drop the 57.15.185.in-addr.arpa zone from production DNS
  • add that reverse zone to designate / cloudinfra-codfw1dev

Change 586409 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/dns@master] 57.15.185.in-addr.arpa: drop zone

https://gerrit.wikimedia.org/r/586409

Mentioned in SAL (#wikimedia-cloud) [2020-04-06T17:39:07Z] <arturo> [codfw1dev] openstack zone create --email root@wmflabs.org --type PRIMARY --ttl 3600 --description "floating IPs subnet" 57.15.185.in-addr.arpa. (T247972)

Mentioned in SAL (#wikimedia-cloud) [2020-04-06T17:42:18Z] <arturo> [codfw1dev] transferred DNS zone 57.15.185.in-addr.arpa. to the cloudinfra-codfw1dev project (T247972)

Change 586409 merged by Arturo Borrero Gonzalez:
[operations/dns@master] 57.15.185.in-addr.arpa: drop zone

https://gerrit.wikimedia.org/r/586409

This seems OK now! thanks everyone!

Andrew reopened this task as Open. Edited Apr 6 2020, 8:00 PM

I still see:

andrew@buster:~$ whois 56.15.185.in-addr.arpa | grep nserver
nserver: cloud-ns0.wikimedia.org
nserver: cloud-ns1.wikimedia.org
andrew@buster:~$ whois 57.15.185.in-addr.arpa | grep nserver
nserver: ns0.wikimedia.org
nserver: ns1.wikimedia.org
nserver: ns2.wikimedia.org

I briefly thought that we needed to update markmonitor for those, but @Krenair pointed out that it's a RIPE thing. @ayounsi, is that something you can update with them? It should be

57.15.185.in-addr.arpa:

ns0.openstack.codfw1dev.wikimediacloud.org 208.80.153.76

56.15.185.in-addr.arpa:

ns0.openstack.eqiad1.wikimediacloud.org 208.80.154.135

ns1.openstack.eqiad1.wikimediacloud.org 208.80.154.11

I chatted with Faidon about this a bit and he's going to check in with Arzhel about 57.15.185.in-addr.arpa

I started to look into that:

Trying to create the following domain object:

domain:         0-7.57.15.185.in-addr.arpa
descr:          Wikimedia_cloud_codfw
admin-c:        FAID1-RIPE
admin-c:        MBE96-RIPE
tech-c:         FAID1-RIPE
tech-c:         MBE96-RIPE
tech-c:         AY3199-RIPE
zone-c:         WMF-RIPE
nserver:        ns0.openstack.codfw1dev.wikimediacloud.org
mnt-by:         WIKIMEDIA-MNT
source:         RIPE

Returns a create error:

No response from nameserver for zone 57.15.185.in-addr.arpa when trying to fetch glue.
Not enough data about 0-7.57.15.185.in-addr.arpa was found to be able to run tests.

Can you double check that the NS is properly configured?

As for 56.15.185.in-addr.arpa, is it OK to update it anytime, or do I need to sync up with you?

I fixed at least one thing with the 57.15.185.in-addr.arpa zone (the SOA was pointing to a currently broken resolver).

The 56. domain can be updated whenever, but I'd appreciate a ping when you do it.

Thanks!

Mentioned in SAL (#wikimedia-operations) [2020-05-18T17:14:17Z] <XioNoX> update domain object for 56.15.185.in-addr.arpa - T247972

eqiad
domain:         56.15.185.in-addr.arpa
descr:          Wikimedia_cloud_eqiad
admin-c:        FAID1-RIPE
admin-c:        MBE96-RIPE
tech-c:         FAID1-RIPE
tech-c:         MBE96-RIPE
tech-c:         AY3199-RIPE
zone-c:         WMF-RIPE
nserver:        ns0.openstack.eqiad1.wikimediacloud.org
nserver:        ns1.openstack.eqiad1.wikimediacloud.org
mnt-by:         WIKIMEDIA-MNT
source:         RIPE

The following object(s) were processed SUCCESSFULLY:
Modify SUCCEEDED: [domain] 56.15.185.in-addr.arpa

codfw still fails with the same error.

I spent some time digging through the RIPE doc, but can't find any clear answer for T247972#6130041.
@jbond do you have any idea?

Otherwise I'll reach out to RIPE support.

Discussed it with John, so 57.15.185.in-addr.arpa is configured to have ns0/1/2.wikimedia.org as NS, which is correct.

The next step is to delegate 0-7.57.15.185.in-addr.arpa to ns0.openstack.codfw1dev.wikimediacloud.org
Examples are in the RFC https://tools.ietf.org/html/rfc2317
Unfortunately I don't know enough about DNS to help with the exact config needed though.
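
For reference, a rough sketch of what an RFC 2317 style delegation looks like in the parent zone, written as a small Python generator. The function name `rfc2317_records` and the record formatting are assumptions made for illustration, not the actual operations/dns template. The child label (here `0-7`, as named above) is a free choice under RFC 2317, and whether to include the first and last address of the range is also a choice; this sketch includes all of them.

```python
import ipaddress


def rfc2317_records(cidr, label, nameservers):
    """Build parent-zone records that delegate a sub-/24 range (RFC 2317 style).

    Hypothetical helper: emits one NS record per nameserver for the child
    zone, plus one CNAME per address pointing into that child zone.
    """
    net = ipaddress.ip_network(cidr)
    if net.version != 4 or not 24 < net.prefixlen <= 32:
        raise ValueError("expected an IPv4 prefix longer than /24")
    octets = str(net.network_address).split(".")
    parent = ".".join(reversed(octets[:3])) + ".in-addr.arpa."
    child = f"{label}.{parent}"
    lines = [f"$ORIGIN {parent}"]
    for ns in nameservers:
        # Delegate the child zone to the cloud nameserver(s).
        lines.append(f"{label} IN NS {ns}")
    for addr in net:
        # Redirect each PTR lookup into the delegated child zone.
        last = str(addr).split(".")[3]
        lines.append(f"{last} IN CNAME {last}.{child}")
    return lines


for line in rfc2317_records(
    "185.15.57.0/29",
    "0-7",
    ["ns0.openstack.codfw1dev.wikimediacloud.org."],
):
    print(line)
```

The child zone itself (with the real PTR records) then has to exist on the delegated nameserver under exactly the same name as the label used in the parent.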

Reassigning to arturo in case he knows how to proceed and/or thinks that we don't actually need to do this :)

Change 597514 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/dns@master] templates: add reverse zone for 185.12.57.0/24 including cloud delegation

https://gerrit.wikimedia.org/r/597514

Change 597514 merged by Arturo Borrero Gonzalez:
[operations/dns@master] templates: add reverse zone for 185.12.57.0/24 including cloud delegation

https://gerrit.wikimedia.org/r/597514

TODO: verify that we can actually resolve stuff like dig 1.57.15.185.in-addr.arpa. Will probably need to take a look at designate @ codfw1dev

I noticed that the ns0.openstack.codfw1dev.wikimediacloud.org. servers are configured as name servers for the 57.15.185.in-addr.arpa. zone. However, you actually need to configure the 0/29.57.15.185.in-addr.arpa. zone.

mmmm

root@cloudcontrol2001-dev:~# openstack zone create --email root@wmflabs.org --type PRIMARY --ttl 3600 --description "floating IPs subnet" 0/29.57.15.185.in-addr.arpa.
Provided object is not valid. Got a ValueError error with message Domain 0/29.57.15.185.in-addr.arpa. is not match

investigating why designate complains about this.

Apparently the string gets evaluated against this regex: RE_ZONENAME = r'^(?!.{255,})(?:(?!\-)[A-Za-z0-9_\-]{1,63}(?<!\-)\.)+\Z' which doesn't allow the 0/29.
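
A quick way to see this outside of designate is to run candidate zone names through that same regex with Python's `re` module. This is only a sketch: the regex is copied from the comment above, while the wrapper `is_valid_zone_name` is a made-up name and not designate's API.

```python
import re

# Regex as quoted above; each label allows only letters, digits, "_" and "-".
RE_ZONENAME = r'^(?!.{255,})(?:(?!\-)[A-Za-z0-9_\-]{1,63}(?<!\-)\.)+\Z'


def is_valid_zone_name(name: str) -> bool:
    """Hypothetical wrapper: True if the name matches the quoted regex."""
    return re.match(RE_ZONENAME, name) is not None


for candidate in (
    "57.15.185.in-addr.arpa.",
    "0/29.57.15.185.in-addr.arpa.",
    "0-29.57.15.185.in-addr.arpa.",
):
    print(candidate, is_valid_zone_name(candidate))
```

The `/` is simply not in the allowed character class, whereas a `-` in the middle of a label is, so a name like 0-29.57.15.185.in-addr.arpa. passes the same check.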

Mentioned in SAL (#wikimedia-cloud) [2020-05-25T16:35:57Z] <arturo> [codfw1dev] created zone 0-29.57.15.185.in-addr.arpa. (T247972)

Change 598502 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: codfw1dev: designate: update zone name for floating IP subnet

https://gerrit.wikimedia.org/r/598502

Change 598503 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/dns@master] 57.15.185-in-addr.arpa: refresh zone name of the delegation

https://gerrit.wikimedia.org/r/598503

Change 598502 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] openstack: codfw1dev: designate: update zone name for floating IP subnet

https://gerrit.wikimedia.org/r/598502

Change 598503 merged by Arturo Borrero Gonzalez:
[operations/dns@master] 57.15.185-in-addr.arpa: refresh zone name of the delegation

https://gerrit.wikimedia.org/r/598503

Ok, after a bit of back and forth, this works now:

arturo@endurance:~$ dig -x 185.15.57.1 +short
1.0-29.57.15.185.in-addr.arpa.
test.codfw1dev.wmcloud.org.
arturo@endurance:~$ dig -x 185.15.57.2 +short
2.0-29.57.15.185.in-addr.arpa.
test.codfw1dev.wmcloud.org.

When I saw openstack not accepting the / character I rushed to file a bug report: https://bugs.launchpad.net/designate/+bug/1880583
I later discovered that the same can be done with the - character, and that's what I did.

BTW, did a couple of updates to https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/DNS/Designate because Designate seems a bit more robust in openstack rocky :-)