cloudvps: eqiad1: create DNS PTR records for cloud addresses
Closed, Resolved, Public

Description

In T202750 @ayounsi mentioned we should create the PTR records for cloud addresses.

For now I will only consider eqiad1, which is the new deployment, to try to do things right from the beginning.

  • 185.15.56.0/25 - floating IPs
  • 10.64.22.0/24 - labs-instance-transport1-b-eqiad
  • 172.16.0.0/21 - labs-instances2-b-eqiad

Any suggestions for the naming scheme of these static FQDNs?

Some ideas:

  • 185.15.56.1 - cloud-floating-eqiad1-185-15-56-1.wikimedia.org
  • 10.64.22.1 - cloud-instance-transport1-b-10-64-22-1.eqiad.wmnet
  • 172.16.0.1 - cloud-instances2-b-172-16-0-1.eqiad.wmnet
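
The ideas above follow a mechanical pattern: a per-subnet prefix, the IP with dashes, then a domain. A minimal Python sketch of that pattern, assuming only the three subnets and prefixes listed in this description; the `SCHEMES` table and the `proposed_fqdn` helper are hypothetical names used for illustration:

```python
import ipaddress

# Subnet -> (name prefix, domain), taken from the ideas listed above.
SCHEMES = {
    ipaddress.ip_network("185.15.56.0/25"): ("cloud-floating-eqiad1", "wikimedia.org"),
    ipaddress.ip_network("10.64.22.0/24"): ("cloud-instance-transport1-b", "eqiad.wmnet"),
    ipaddress.ip_network("172.16.0.0/21"): ("cloud-instances2-b", "eqiad.wmnet"),
}


def proposed_fqdn(ip: str) -> str:
    """Return the static FQDN the proposed scheme would give to `ip`."""
    addr = ipaddress.ip_address(ip)
    for network, (prefix, domain) in SCHEMES.items():
        if addr in network:
            dashed = str(addr).replace(".", "-")
            return f"{prefix}-{dashed}.{domain}"
    raise ValueError(f"{ip} is not in any of the listed eqiad1 subnets")


print(proposed_fqdn("172.16.0.1"))  # cloud-instances2-b-172-16-0-1.eqiad.wmnet
```

Such a name only restates the address and the subnet; it says nothing about which host or instance actually holds the IP.
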
aborrero triaged this task as Normal priority.

The floating IPs will get managed dynamically on the labs nameservers, the PTR should always include the actual instance name and project, not just a useless copy of the IP.

Ok, @ayounsi does this sound good?

I'm not the authority in terms of naming.

That said, it sounds like a great idea.

Other vlans should have the name of the host as well (even more so when the hosts are static), plus the interface if not the primary one (eg. eth1-2120.labtestneutron2001.codfw.wmnet. or vl2001-eth1.lvs2004.codfw.wmnet.)

Krenair added a comment. Edited Aug 27 2018, 5:31 PM

So we chatted about this on -cloud-admin and here's roughly what @Andrew, @aborrero and I have come up with:

  1. subnet-name-ip-addr.somedomain names aren't particularly helpful
  2. floating address names should be managed dynamically as they are in the current deployment, with https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/445310/ and https://gerrit.wikimedia.org/r/#/c/operations/dns/+/445303/
  3. transport address names (these addresses likely include a single physical router as well as one or more virtual routers managed by neutron) *may* be managed statically in operations/dns.git, as only cloud admins will be able to create/delete stuff here. It could be done dynamically if someone really wanted, or if we needed them to be created/removed automatically by scripts, with just a manual addition for the physical core router. The names used here should indicate which router the address is actually for.
  4. the instances subnet's dynamic naming in the new deployment should currently be working. Let's open a separate task if it isn't (might be a missing delegation??)

So, for transport IPs, we've already got three addresses there assigned to physical routers:

; 10.64.22.0/24 - cloud-instance-transport1-b-eqiad
$ORIGIN 22.64.{{ zonename }}.
1   1H IN PTR   vrrp-gw-1120.eqiad.wmnet.
2   1H IN PTR   ae2-1120.cr1-eqiad.wikimedia.org.
3   1H IN PTR   ae2-1120.cr2-eqiad.wikimedia.org.
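
In the fragment above, the relative owner names `1`, `2` and `3` expand against `$ORIGIN`. A small Python sketch of that expansion, assuming the `{{ zonename }}` template variable renders to `10.in-addr.arpa` (inferred from the 10.64.22.0/24 comment, not confirmed here); `ptr_owner_to_ip` is a hypothetical helper:

```python
import ipaddress

# Assumption: the `{{ zonename }}` template variable renders to this zone.
ZONENAME = "10.in-addr.arpa"
ORIGIN = f"22.64.{ZONENAME}."

# (relative owner, PTR target) pairs copied from the zone fragment above.
RECORDS = [
    ("1", "vrrp-gw-1120.eqiad.wmnet."),
    ("2", "ae2-1120.cr1-eqiad.wikimedia.org."),
    ("3", "ae2-1120.cr2-eqiad.wikimedia.org."),
]


def ptr_owner_to_ip(owner: str, origin: str) -> ipaddress.IPv4Address:
    """Expand a relative PTR owner against $ORIGIN and recover the IPv4 address."""
    fqdn = f"{owner}.{origin}"
    suffix = ".in-addr.arpa."
    if not fqdn.endswith(suffix):
        raise ValueError(f"{fqdn} is not under in-addr.arpa")
    labels = fqdn[: -len(suffix)].split(".")
    # Reverse-zone labels are the address octets in reverse order.
    return ipaddress.IPv4Address(".".join(reversed(labels)))


for owner, target in RECORDS:
    print(ptr_owner_to_ip(owner, ORIGIN), "->", target)
```

Under that assumption the three records cover 10.64.22.1, 10.64.22.2 and 10.64.22.3.
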

I guess someone with access to the admin project (novaobserver doesn't have it) needs to go through, find all the virtual routers attached to it, name them, and stick the results in operations/dns.git

Change 460320 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/dns@master] cloudvps: eqiad1: add cloudinstances2b virtual router FQDNs

https://gerrit.wikimedia.org/r/460320

Change 461024 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] wmcs pdns-recursor: comma-delimit reverse lookup zones

https://gerrit.wikimedia.org/r/461024

Change 461024 merged by Andrew Bogott:
[operations/puppet@production] wmcs pdns-recursor: comma-delimit reverse lookup zones

https://gerrit.wikimedia.org/r/461024

The above patch fixes PTRs for internal (fixed) eqiad1 addresses. Floating IPs still need work, since they only resolve when labs-ns0 is queried directly:

andrew@eqiad1test:~$ dig +short -x 172.16.1.177
eqiad1test.testlabs.eqiad.wmflabs.
andrew@eqiad1test:~$ dig +short @labs-recursor0.wikimedia.org -x 172.16.1.177
eqiad1test.testlabs.eqiad.wmflabs.
andrew@eqiad1test:~$ dig +short @labs-recursor1.wikimedia.org -x 172.16.1.177
eqiad1test.testlabs.eqiad.wmflabs.
andrew@eqiad1test:~$ dig +short @labs-recursor2.wikimedia.org -x 172.16.1.177
eqiad1test.testlabs.eqiad.wmflabs.
andrew@eqiad1test:~$ dig +short -x 185.15.56.28
andrew@eqiad1test:~$ dig +short @labs-ns0.wikimedia.org -x 185.15.56.28
gerrit.git.wmflabs.org.
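
The manual checks above can be scripted. A rough Python sketch, assuming `dig` is installed on the host; the helper names (`dig_command`, `run_dig`, `ptr_answers`, `missing_servers`) are hypothetical, and the `run` parameter exists only so the lookup can be swapped out for a stub:

```python
import subprocess
from typing import Callable, Dict, List, Optional


def dig_command(ip: str, server: Optional[str] = None) -> List[str]:
    """Build the `dig +short -x` invocation used in the transcript above."""
    cmd = ["dig", "+short"]
    if server is not None:
        cmd.append(f"@{server}")
    cmd += ["-x", ip]
    return cmd


def run_dig(cmd: List[str]) -> str:
    """Run dig and return its trimmed stdout (empty string means no answer)."""
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
    return result.stdout.strip()


def ptr_answers(
    ip: str,
    servers: List[Optional[str]],
    run: Callable[[List[str]], str] = run_dig,
) -> Dict[str, str]:
    """Map each server (None = system default resolver) to its PTR answer."""
    return {server or "default": run(dig_command(ip, server)) for server in servers}


def missing_servers(answers: Dict[str, str]) -> List[str]:
    """Return the servers that gave no PTR answer at all."""
    return [server for server, answer in answers.items() if not answer]
```

Calling `ptr_answers("185.15.56.28", [None, "labs-ns0.wikimedia.org"])` and passing the result to `missing_servers` would list the resolvers that still return nothing for the floating IP.
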

Change 461126 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] wmcs pdns: added forwarding for floating IP ptr records

https://gerrit.wikimedia.org/r/461126

Change 461126 merged by Andrew Bogott:
[operations/puppet@production] wmcs pdns: added forwarding for floating IP ptr records

https://gerrit.wikimedia.org/r/461126

Floating IPs should work now too.

andrew@util-abogott:~$ dig +short  -x 185.15.56.28
gerrit.git.wmflabs.org.

Not for me:

alex@alex-laptop:~$ dig -x 185.15.56.28

; <<>> DiG 9.11.3-1ubuntu1.2-Ubuntu <<>> -x 185.15.56.28
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 32036
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;28.56.15.185.in-addr.arpa.	IN	PTR

;; Query time: 341 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Mon Sep 24 17:12:21 BST 2018
;; MSG SIZE  rcvd: 54

Ah, you're right -- it works from within the cloud network but the delegation on ns0 must still be wrong.

So floating and fixed IPs are working. Is the transport IP list complete? If so, can we mark this done?

Krenair added a comment. Edited Sep 24 2018, 11:38 PM

I just noticed that the 10.68.23.253 IP in T202636#4573567 is not named.
I've also found that 172.16.0.1 is not named.

Change 460320 merged by Faidon Liambotis:
[operations/dns@master] cloudvps: eqiad1: add cloudinstances2b virtual router FQDNs

https://gerrit.wikimedia.org/r/460320

aborrero closed this task as Resolved. Tue, Nov 27, 1:55 PM
aborrero added a subscriber: faidon.

Thanks @faidon. Closing task now.

@aborrero @faidon: This still appears to be missing 10.68.23.253 and 172.16.0.1

aborrero reopened this task as Open. Tue, Nov 27, 5:19 PM
aborrero moved this task from Blocked to Important on the cloud-services-team (Kanban) board.

Right, I forgot about them. I'll have to investigate; I don't know how to create records in the .wmflabs TLD. Good time to learn a new thing :-)

I was thinking of creating these records in the PDNS database that lives on cloudservices1003.wikimedia.org:

cloudinstances2b-gw-compat.eqiad.wmflabs        IN A    10.68.23.253
253.23.68.10.in-addr.arpa                       IN PTR  cloudinstances2b-gw-compat.eqiad.wmflabs.

cloudinstances2b-gw.eqiad.wmflabs               IN A    172.16.0.1
1.0.16.172.in-addr.arpa                         IN PTR  cloudinstances2b-gw.eqiad.wmflabs.
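
The reverse names in the records above are just the address octets reversed under `in-addr.arpa`. A short Python sketch that derives each A/PTR pair from the hostname and address using the standard `ipaddress` module; the `GATEWAYS` table and `record_pair` helper are hypothetical names for illustration:

```python
import ipaddress

# Hostname -> address pairs copied from the records proposed above.
GATEWAYS = {
    "cloudinstances2b-gw-compat.eqiad.wmflabs": "10.68.23.253",
    "cloudinstances2b-gw.eqiad.wmflabs": "172.16.0.1",
}


def record_pair(hostname: str, ip: str) -> tuple:
    """Return the (A, PTR) record lines for one hostname/address pair."""
    reverse_name = ipaddress.ip_address(ip).reverse_pointer
    a_record = f"{hostname} IN A {ip}"
    ptr_record = f"{reverse_name} IN PTR {hostname}."
    return a_record, ptr_record


for hostname, ip in GATEWAYS.items():
    for line in record_pair(hostname, ip):
        print(line)
```

Note that `dig -x` expects the forward address (for example `dig -x 10.68.23.253`) and performs this reversal itself.
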

Question is, what will happen if somebody creates a VM with these names?

Let's find out!

I just created the records in the raw database:

cloudinstances2b-gw-compat.eqiad.wmflabs        IN A    10.68.23.253
253.23.68.10.in-addr.arpa                       IN PTR  cloudinstances2b-gw-compat.eqiad.wmflabs.

cloudinstances2b-gw.eqiad.wmflabs               IN A    172.16.0.1
1.0.16.172.in-addr.arpa                         IN PTR  cloudinstances2b-gw.eqiad.wmflabs.

INSERT INTO records (domain_id, name, type, content, ttl, prio, auth) VALUES
        (2334, 'cloudinstances2b-gw-compat.eqiad.wmflabs', 'A', '10.68.23.253', 60, 0, 1);

INSERT INTO records (domain_id, name, type, content, ttl, prio, auth) VALUES
        (2335, '253.23.68.10.in-addr.arpa', 'PTR', 'cloudinstances2b-gw-compat.eqiad.wmflabs', 60, 0, 1);

INSERT INTO records (domain_id, name, type, content, ttl, prio, auth) VALUES
        (2334, 'cloudinstances2b-gw.eqiad.wmflabs', 'A', '172.16.0.1', 60, 0, 1);

INSERT INTO records (domain_id, name, type, content, ttl, prio, auth) VALUES
        (27333, '1.0.16.172.in-addr.arpa', 'PTR', 'cloudinstances2b-gw.eqiad.wmflabs', 60, 0, 1);

SELECT * FROM records WHERE name = 'cloudinstances2b-gw-compat.eqiad.wmflabs';
SELECT * FROM records WHERE name = '253.23.68.10.in-addr.arpa';
SELECT * FROM records WHERE name = 'cloudinstances2b-gw.eqiad.wmflabs';
SELECT * FROM records WHERE name = '1.0.16.172.in-addr.arpa';

If we need to delete the records:

DELETE FROM records WHERE name = 'cloudinstances2b-gw-compat.eqiad.wmflabs';
DELETE FROM records WHERE name = '253.23.68.10.in-addr.arpa';
DELETE FROM records WHERE name = 'cloudinstances2b-gw.eqiad.wmflabs';
DELETE FROM records WHERE name = '1.0.16.172.in-addr.arpa';
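
For experimenting with statements like these away from the live database, here is a minimal Python sketch. It assumes an in-memory SQLite table as a stand-in for the real PDNS database (the column list mirrors the INSERT statements above), and the domain ids are placeholders, not the real ones:

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the real PDNS database;
# the column list mirrors the INSERT statements above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE records (
        id INTEGER PRIMARY KEY,
        domain_id INTEGER, name TEXT, type TEXT,
        content TEXT, ttl INTEGER, prio INTEGER, auth INTEGER
    )
    """
)

# Placeholder domain ids (1 = forward zone, 2 = reverse zone), not the real ones.
rows = [
    (1, "cloudinstances2b-gw.eqiad.wmflabs", "A", "172.16.0.1", 60, 0, 1),
    (2, "1.0.16.172.in-addr.arpa", "PTR", "cloudinstances2b-gw.eqiad.wmflabs", 60, 0, 1),
]

# One transaction: either both halves of the pair land, or neither does.
with conn:
    conn.executemany(
        "INSERT INTO records (domain_id, name, type, content, ttl, prio, auth) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        rows,
    )


def lookup(name: str) -> list:
    """Return (type, content) for every record with this owner name."""
    cur = conn.execute("SELECT type, content FROM records WHERE name = ?", (name,))
    return cur.fetchall()


print(lookup("1.0.16.172.in-addr.arpa"))
```

Parameterized queries avoid quoting mistakes in the names, and wrapping the pair in one transaction keeps the A and PTR records from drifting apart if one insert fails.
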

This didn't have any apparent effect. We may need to refresh/restart something (or introduce the records using another method):

aborrero@tools-bastion-03:~$ dig -x 172.16.0.1 +short
aborrero@tools-bastion-03:~$ dig -x 253.23.68.10 +short
aborrero@tools-bastion-03:~$ dig cloudinstances2b-gw.eqiad.wmflabs +short
aborrero@tools-bastion-03:~$ dig cloudinstances2b-gw-compat.eqiad.wmflabs +short

@Andrew or @Krenair, please advise.

I realize now that the records were deleted from the PDNS database after an AXFR (from designate?). I might need to add the records to the designate DB in m5-master instead.

Hmm, a loose record created in the eqiad.wmflabs domain without an associated instance might get flushed by the dnsleaks.py script.

aborrero closed this task as Resolved. Fri, Nov 30, 2:11 PM

Using the designate CLI I've created the following records:

cloudinstances2b-gw-compat.svc.eqiad.wmflabs    IN A    10.68.23.253
253.23.68.10.in-addr.arpa                       IN PTR  cloudinstances2b-gw-compat.svc.eqiad.wmflabs.

cloudinstances2b-gw.svc.eqiad.wmflabs           IN A    172.16.0.1
1.0.16.172.in-addr.arpa                         IN PTR  cloudinstances2b-gw.svc.eqiad.wmflabs.

And refreshed our DNS docs along the way: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/DNS

Mentioned in SAL (#wikimedia-cloud) [2018-12-03T13:25:09Z] <arturo> T202886 create again PTR records after dnsleak.py fix

Change 477273 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: dnsleaks.py: a PTR entry may have several records

https://gerrit.wikimedia.org/r/477273

Change 477273 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] openstack: dnsleaks.py: a PTR entry may have several records

https://gerrit.wikimedia.org/r/477273