We are trying to convert the netbox service from an active/active to an active/passive service. I created a change to update the service catalog and a change to update the DNS. Then I followed the instructions for adding a new service (there were none for converting). I first deployed the service catalog change and ran puppet on all the DNS servers, which resulted in the following diff:
--- /etc/gdnsd/discovery-geo-resources 2022-12-01 10:45:25.534169539 +0000
+++ /tmp/puppet-file20230220-11718-uhf72k 2023-02-20 14:03:09.370481312 +0000
@@ -295,7 +295,7 @@
     }
 }
-disc-geo-netbox => {
+disc-netbox => {
     map => discovery-map,
     service_types => discovery-state-netbox,
     dcmap => {
Info: Computing checksum on file /etc/gdnsd/discovery-geo-resources
Info: /Stage[main]/Profile::Dns::Auth::Discovery/File[/etc/gdnsd/discovery-geo-resources]: Filebucketed /etc/gdnsd/discovery-geo-resources to puppet with sum 025570ce5eea62c5a927c5c9e48c39de
Notice: /Stage[main]/Profile::Dns::Auth::Discovery/File[/etc/gdnsd/discovery-geo-resources]/content: content changed '{md5}025570ce5eea62c5a927c5c9e48c39de' to '{md5}6c2c530297335c722c222615df772b10'
Info: /Stage[main]/Profile::Dns::Auth::Discovery/File[/etc/gdnsd/discovery-geo-resources]: Scheduling refresh of Service[gdnsd]
Notice: /Stage[main]/Profile::Dns::Auth::Discovery/File[/etc/gdnsd/discovery-metafo-resources]/content:
--- /etc/gdnsd/discovery-metafo-resources 2022-11-23 14:10:06.910243923 +0000
+++ /tmp/puppet-file20230220-11718-cq1ed8 2023-02-20 14:03:09.430481805 +0000
@@ -61,13 +61,6 @@
         fail => %geoip!disc-failoid,
     },
 }
-disc-netbox => {
-    datacenters => [ geo, fail ],
-    dcmap => {
-        geo => %geoip!disc-geo-netbox,
-        fail => %geoip!disc-failoid,
-    },
-}
 disc-parsoid-php => {
     datacenters => [ geo, fail ],
     dcmap => {
Info: Computing checksum on file /etc/gdnsd/discovery-metafo-resources
Info: /Stage[main]/Profile::Dns::Auth::Discovery/File[/etc/gdnsd/discovery-metafo-resources]: Filebucketed /etc/gdnsd/discovery-metafo-resources to puppet with sum d635d54f0ea96f0f4334d2dcb82cd098
Notice: /Stage[main]/Profile::Dns::Auth::Discovery/File[/etc/gdnsd/discovery-metafo-resources]/content: content changed '{md5}d635d54f0ea96f0f4334d2dcb82cd098' to '{md5}a242567f46b8fd32d2c591f0e8fa06e3'
Info: /Stage[main]/Profile::Dns::Auth::Discovery/File[/etc/gdnsd/discovery-metafo-resources]: Scheduling refresh of Service[gdnsd]
Notice: /Stage[main]/Profile::Dns::Auth::Discovery/Confd::File[/var/lib/gdnsd/discovery-netbox.state]/File[/etc/confd/conf.d/_var_lib_gdnsd_discovery-netbox.state.toml]/content:
--- /etc/confd/conf.d/_var_lib_gdnsd_discovery-netbox.state.toml 2022-05-31 13:59:21.463177299 +0000
+++ /tmp/puppet-file20230220-11718-u9si99 2023-02-20 14:03:14.246521376 +0000
@@ -13,5 +13,5 @@
 ]
 prefix = "/conftool/v1"
-check_cmd = "/usr/local/bin/confd-lint-wrap /usr/local/bin/authdns-check-active-passive {{.src}}"
+
I then deployed the DNS change but received the following error:
sudo authdns-update [12:51:34]
Updating authdns1001.wikimedia.org (self)...
Pulling the current revision from https://gerrit.wikimedia.org/r/operations/dns.git
Reviewing a21746632e4b7fb90cb4745ce5fd6b7d678ef492...
 templates/wmnet | 2 +-
 utils/mock_etc/discovery-geo-resources | 1 +
 utils/mock_etc/discovery-metafo-resources | 1 -
 3 files changed, 2 insertions(+), 2 deletions(-)
diff --git templates/wmnet templates/wmnet
index 03333182..7ad7f7f6 100644
--- templates/wmnet
+++ templates/wmnet
@@ -783,7 +783,7 @@ inference 300/10 IN DYNA geoip!disc-inference
 k8s-ingress-staging 300/10 IN DYNA metafo!disc-k8s-ingress-staging
 k8s-ingress-wikikube-ro 300/10 IN DYNA geoip!disc-k8s-ingress-wikikube-ro
 k8s-ingress-wikikube-rw 300/10 IN DYNA metafo!disc-k8s-ingress-wikikube-rw
-netbox 300/10 IN DYNA metafo!disc-netbox
+netbox 300/10 IN DYNA geoip!disc-netbox
 ; We don't need a separate discovery address for netbox-extra
 ; however a new cname is useful to configure an internal vhost
 netbox-exports 300 IN CNAME netbox
diff --git utils/mock_etc/discovery-geo-resources utils/mock_etc/discovery-geo-resources
index 9c418ae0..f0737643 100644
--- utils/mock_etc/discovery-geo-resources
+++ utils/mock_etc/discovery-geo-resources
@@ -58,6 +58,7 @@ disc-helm-charts => { map => mock, dcmap => { mock => 192.0.2.1 } }
 disc-api-gateway => { map => mock, dcmap => { mock => 192.0.2.1 } }
 disc-similar-users => { map => mock, dcmap => { mock => 192.0.2.1 } }
 disc-linkrecommendation => { map => mock, dcmap => { mock => 192.0.2.1 } }
+disc-netbox => { map => mock, dcmap => { mock => 192.0.2.1 } }
 disc-puppetdb-api => { map => mock, dcmap => { mock => 192.0.2.1 } }
 disc-puppetboard => { map => mock, dcmap => { mock => 192.0.2.1 } }
 disc-shellbox => { map => mock, dcmap => { mock => 192.0.2.1 } }
diff --git utils/mock_etc/discovery-metafo-resources utils/mock_etc/discovery-metafo-resources
index ca41fb4c..ceb840af 100644
--- utils/mock_etc/discovery-metafo-resources
+++ utils/mock_etc/discovery-metafo-resources
@@ -32,4 +32,3 @@ disc-parsoid-php => { datacenters => mock, dcmap => { mock => 192.0.2.1
 disc-toolhub => { datacenters => mock, dcmap => { mock => 192.0.2.1 } }
 disc-k8s-ingress-staging => { datacenters => mock, dcmap => { mock => 192.0.2.1 } }
 disc-k8s-ingress-wikikube-rw => { datacenters => mock, dcmap => { mock => 192.0.2.1 } }
-disc-netbox => { datacenters => mock, dcmap => { mock => 192.0.2.1 } }
Merge these changes? (yes/no)? yes
Updating f7bdb9d5..a2174663
Fast-forward
 templates/wmnet | 2 +-
 utils/mock_etc/discovery-geo-resources | 1 +
 utils/mock_etc/discovery-metafo-resources | 1 -
 3 files changed, 2 insertions(+), 2 deletions(-)
Deploying via utils/deploy-check.py...
Assembling and testing data in /tmp/dns-check.6_bsr0e8
 -- Generating zonefiles from zone templates
 -- Processed 213 zones into directory /tmp/dns-check.6_bsr0e8/zones
OK: No tabs
Summary of violations:
W001|MISSING_IP_FOR_NAME_AND_PTR: 256
W002|MISSING_PTR_FOR_NAME_AND_IP: 30
W105|TOO_MANY_PUBLIC_NAMES: 11
RESULT: 0 Errors, 297 Warnings, 1811 Ignored violations, 43 Ignored lines
 -- Copying automatically generated zone files under target tree
 -- Copying repo-driven real config files and admin_state
 -- Copying puppetized config and GeoIP from /etc/gdnsd
 -- Checking for illegal tabs in zonefiles
 -- Running zone_validator to check WMF rules
 -- Running /usr/sbin/gdnsd checkconf on /tmp/dns-check.6_bsr0e8
Traceback (most recent call last):
  File "utils/deploy-check.py", line 283, in <module>
    main()
  File "utils/deploy-check.py", line 275, in main
    deploy_check(args.deploy, args.skip_reload, args.no_gdnsd, Path(tdir), gdir)
  File "utils/deploy-check.py", line 221, in deploy_check
    safe_cmd([GDNSD_BIN, '-c', str(tdir), 'checkconf'])
  File "utils/deploy-check.py", line 87, in safe_cmd
    p_err.decode('utf-8')))
Exception: Command /usr/sbin/gdnsd -c /tmp/dns-check.6_bsr0e8 checkconf failed with exit code 42, stderr:
info: gdnsd version 3.8.0 @ pid 6244
info: DNS listener threads (8 UDP + 8 TCP) configured for 208.80.154.238:53
info: DNS listener threads (8 UDP + 8 TCP) configured for 208.80.153.231:53
info: DNS listener threads (8 UDP + 8 TCP) configured for 91.198.174.239:53
info: DNS listener threads (8 UDP + 8 TCP) configured for 198.35.27.27:53
info: DNS listener threads (8 TCP PROXY) configured for 127.0.0.1:535
info: DNS listener threads (1 UDP + 1 TCP) configured for 0.0.0.0:5353
info: DNS listener threads (1 UDP + 1 TCP) configured for [::]:5353
info: plugin_geoip: map 'generic-map': Loading GeoIP2 database '/tmp/dns-check.6_bsr0e8/geoip/GeoIP2-City.mmdb': Version: 2.0, Type: GeoIP2-City, IPVersion: 6, Timestamp: 2023-02-17 02:31:14 UTC
info: plugin_geoip: map 'generic-map' runtime db updated. nets: 1214920 dclists: 18
info: plugin_geoip: map 'discovery-map': Loading GeoIP2 database '/tmp/dns-check.6_bsr0e8/geoip/GeoIP2-City.mmdb': Version: 2.0, Type: GeoIP2-City, IPVersion: 6, Timestamp: 2023-02-17 02:31:14 UTC
info: plugin_geoip: map 'discovery-map' runtime db updated. nets: 512 dclists: 2
info: admin_state: checking state file '/tmp/dns-check.6_bsr0e8/state/admin_state'...
error: plugin_geoip: Invalid resource name 'disc-netbox' detected from zonefile lookup
error: Name 'netbox.discovery.wmnet.': resolver plugin 'geoip' rejected resource name 'disc-netbox'
fatal: Initial load of zone data failed
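The error above says that a `DYNA geoip!disc-netbox` record references a resource name the geoip plugin does not accept. As a rough illustration of that kind of mismatch, here is a minimal Python sketch that cross-checks `DYNA plugin!resource` references in zone text against resource stanzas in config text. This is not what `utils/deploy-check.py` or gdnsd actually do. The function names, the regexes and the assumption that top-level stanzas start at column 0 are all hypothetical, and the sample data is a made-up case rather than a confirmed diagnosis of this failure.

```python
import re

# Matches zone lines such as: "netbox 300/10 IN DYNA geoip!disc-netbox"
DYNA_RE = re.compile(r"^(\S+)\s+\S+\s+IN\s+DYNA\s+(\w+)!(\S+)", re.MULTILINE)
# Matches the start of a resource stanza at column 0: "disc-netbox => {"
# (assumption: nested keys are indented, top-level resources are not)
RESOURCE_RE = re.compile(r"^([\w-]+)\s*=>\s*\{", re.MULTILINE)


def defined_resources(config_text):
    """Return the set of resource names that open a stanza at column 0."""
    return set(RESOURCE_RE.findall(config_text))


def missing_resources(zone_text, resources_by_plugin):
    """Return (record, plugin, resource) for DYNA references with no stanza."""
    missing = []
    for record, plugin, resource in DYNA_RE.findall(zone_text):
        if resource not in resources_by_plugin.get(plugin, set()):
            missing.append((record, plugin, resource))
    return missing


# Hypothetical inputs: the zone points at geoip!disc-netbox while the geo
# resources text only defines the old name disc-geo-netbox.
zone = "netbox 300/10 IN DYNA geoip!disc-netbox\n"
geo = "disc-geo-netbox => {\n    map => discovery-map,\n}\n"
metafo = "disc-netbox => {\n    datacenters => [ geo, fail ],\n}\n"

resources = {
    "geoip": defined_resources(geo),
    "metafo": defined_resources(metafo),
}
print(missing_resources(zone, resources))
```

A check like this only compares names. It would not catch a resource that exists but is rejected for another reason (for example an invalid `map` or `dcmap`), so `gdnsd checkconf` remains the authoritative test.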
I have reverted both changes but received a similar error when running puppet (likely due to the order in which the changes were applied). It would be useful for someone to check the state of DNS to ensure nothing is broken; after that, the priority of this task can be lowered.
I have recreated the original changes for [[ DNS | https://gerrit.wikimedia.org/r/c/operations/dns/+/890384 ]] and the service catalog.