
DNS: per prefix zone-file limitation
Closed, ResolvedPublic


I discovered today that the new DNS automation creates a dedicated file for each IP prefix.
While this is fine for most production prefixes (as they are stable), it's more of a hassle for infrastructure prefixes (small ones such as v4 /31s), as each one requires a matching change to the $INCLUDEs in the DNS repo, on both creation and deletion. Missing that change can cause a deploy to fail.

It's not a blocker but maybe there is a suitable way to address this limitation? For example:

  • Use the Netbox containers as zone boundaries
  • Use static /24 boundaries for v4

Event Timeline

ayounsi created this task.

As discussed on IRC, if there is a clean way to identify from the Netbox data when a prefix can use a /24, and if it's OK to lose some of the flexibility we currently have to migrate things gradually, it's not a problem for me to adapt the generation script.

If we have to do this, though, I'd strongly advise doing it immediately, before we migrate the 2 main DCs, as the transition is more complicated than the change itself.

If instead we keep the current per-prefix approach we can easily add some additional checks in the sre.dns.netbox cookbook such as:

  • for each modified auto-generated file, check that it is $INCLUDEd in the dns repo (this can be done only after we've migrated everything, maybe with a whitelist)
  • if any file is newly added, tell the user that a patch in the dns repo is needed to include the new file(s)
  • if any file is removed, pause and ask the user to make a patch removing the $INCLUDEs from the manual dns repo before proceeding with the push, to prevent the deploy from failing.

and something to the dns repo CI too:

  • check that all $INCLUDE files are unique (no duplicates)
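
The checks listed above could look roughly like the following. This is only a sketch, not the cookbook's or the CI's actual code: the function names, the `netbox/` path prefix used to tell generated includes from manual ones, and the shape of the inputs are all assumptions made for illustration.

```python
import re
from collections import Counter

# Matches lines such as "$INCLUDE netbox/some-file" (the path layout is an assumption).
INCLUDE_RE = re.compile(r"^\s*\$INCLUDE\s+(\S+)", re.MULTILINE)


def parse_includes(zone_text):
    """Return every $INCLUDE target found in a zone file, in order of appearance."""
    return INCLUDE_RE.findall(zone_text)


def check_includes(zone_text, generated_files, generated_prefix="netbox/"):
    """Compare the $INCLUDE targets of a zone file with the auto-generated file names.

    missing:    generated files that are not $INCLUDEd (a dns repo patch is needed)
    stale:      $INCLUDEs of generated files that no longer exist (deploy would fail)
    duplicates: targets that are $INCLUDEd more than once (the CI check)
    """
    includes = parse_includes(zone_text)
    included = set(includes)
    generated = set(generated_files)
    return {
        "missing": sorted(generated - included),
        "stale": sorted(
            name for name in included - generated
            if name.startswith(generated_prefix)  # ignore manually maintained includes
        ),
        "duplicates": sorted(n for n, count in Counter(includes).items() if count > 1),
    }
```

A cookbook could warn on a non-empty `missing` list and pause on a non-empty `stale` list, while the CI job would only need `duplicates`.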

After a chat with @ayounsi I tried this approach: if the prefix length is greater than 24, instead of picking the smallest prefix we look for the first prefix above it with status container and use that to decide the zonefile.

If we go for this option, the actual code will of course include a temporary patch that generates both sets of records, so that the already migrated DCs can be switched from the previous INCLUDEs to the more compact ones; it will be removed after that.

For now I've just tested the change itself, which is easy to see as a diff, and pasted it here: P12937
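
For illustration only, the container lookup could be sketched like this. It is not the code from the paste: `zonefile_prefix` and the plain list of container prefixes are hypothetical stand-ins for whatever the generation script reads from Netbox, and the threshold of 24 is applied as stated above (IPv6 handling is not covered here).

```python
import ipaddress


def zonefile_prefix(prefix, containers):
    """Pick the prefix that decides the zone file for `prefix`.

    `containers` is a hypothetical list of prefixes whose Netbox status is "container".
    """
    net = ipaddress.ip_network(prefix)
    if net.prefixlen <= 24:
        return net  # unchanged: the prefix keeps its own file
    parents = [
        c for c in map(ipaddress.ip_network, containers)
        if c.version == net.version and net.subnet_of(c)
    ]
    # The most specific enclosing container is the first one "above" the prefix.
    # Without any enclosing container, fall back to the prefix itself.
    return max(parents, key=lambda c: c.prefixlen, default=net)
```

Note that nothing here stops the selected container from being larger than a /24.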

As you can see there are some issues where the first container is larger than a /24. Are those ok? What filename should we use for them? The same syntax of smaller prefixes, hence for

For IPv4 you'll have to break up the <24 cases (containers larger than a /24) into /24 containers somehow, because the DNS zones themselves are /24 in that case. Perhaps we have to make some new containers in Netbox, which represent the DNS-level abstraction, in this case?

Yeah, I was chatting on IRC with Arzhel and he suggested forcing them to /24 anyway, even if we don't have the prefix in Netbox.

I made that change and updated the paste P12937 with the result.
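
As a rough sketch of what forcing to a /24 means for IPv4, assuming it applies to any prefix longer than /24 regardless of what exists in Netbox (the helper names are made up for this example and are not taken from the actual patch):

```python
import ipaddress


def force_to_slash24(prefix):
    """Return the enclosing /24 for IPv4 prefixes longer than /24, else the prefix itself."""
    net = ipaddress.ip_network(prefix)
    if net.version == 4 and net.prefixlen > 24:
        return net.supernet(new_prefix=24)
    return net


def reverse_zone(net24):
    """Name of the in-addr.arpa zone that covers an IPv4 /24 network."""
    octets = str(net24.network_address).split(".")[:3]
    return ".".join(reversed(octets)) + ".in-addr.arpa"


parent = force_to_slash24("10.64.0.8/31")
print(parent)                # 10.64.0.0/24
print(reverse_zone(parent))  # 0.64.10.in-addr.arpa
```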

Those are the file changes:

rename from to (8 renames; the file names are missing)

Change 632574 had a related patch set uploaded (by Volans; owner: Volans):
[operations/software/netbox-extras@master] dns: consolidate reverse zone files

Change 632952 had a related patch set uploaded (by Volans; owner: Volans):
[operations/software/netbox-extras@master] dns: consolidate reverse zone files (part 2)

Change 632953 had a related patch set uploaded (by Volans; owner: Volans):
[operations/dns@master] netbox: move $INCLUDEs to the consolidated files

The previous approach was not working well: I found corner cases where it consolidated into the same file different subnets of which one is managed by Netbox and the other is not, and probably never will be (like frack vs frack mgmt).

Moved to a narrower approach where we consolidate only /30 and /31 prefixes into their parent. This reduces the impact a lot, affecting almost only network devices, while still reducing the number of reverse zonefiles by ~48, down to ~168.
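
A minimal sketch of this narrower rule, assuming it applies to IPv4 /30 and /31 prefixes and that "parent" means the most specific enclosing prefix known to Netbox. `group_by_zonefile` and its input list are hypothetical; the merged patches may well differ.

```python
import ipaddress
from collections import defaultdict


def group_by_zonefile(prefixes):
    """Map each zone-file prefix to the prefixes whose records end up in it."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    groups = defaultdict(list)
    for net in nets:
        target = net
        if net.version == 4 and net.prefixlen in (30, 31):
            # Candidate parents: enclosing IPv4 prefixes that are not themselves
            # consolidated (i.e. shorter than /30).
            parents = [
                p for p in nets
                if p.version == 4 and p.prefixlen < 30 and net.subnet_of(p)
            ]
            target = max(parents, key=lambda p: p.prefixlen, default=net)
        groups[str(target)].append(str(net))
    return dict(groups)
```

Every other prefix keeps its own file, which is why the overall impact stays small.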

Mentioned in SAL (#wikimedia-operations) [2020-10-08T20:43:34Z] <volans> deploying Netbox DNS zone consolidation - T264273

Change 632574 merged by Volans:
[operations/software/netbox-extras@master] dns: consolidate reverse zone files (part 1)

Change 632953 merged by Volans:
[operations/dns@master] netbox: move $INCLUDEs to the consolidated files

Change 632952 merged by Volans:
[operations/software/netbox-extras@master] dns: consolidate reverse zone files (part 2)

Volans claimed this task.

The agreed changes have been deployed: /30 and /31 prefixes have been consolidated into their parent prefix zone files.