Page MenuHomePhabricator

Puppet agent unable to run in Beta Cluster (Evaluation Error: Error while evaluating a Resource Statement)
Closed, ResolvedPublic

Description

Since last Friday, puppet agent no longer runs in Beta Cluster, making it not possible to cherry-pick operations changes and test them there, e.g. https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/559262/.

Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Resource Statement, Evaluation Error: Error while evaluating a Function Call, Class[Profile::Conftool::Client]: parameter 'srv_domain' expects a match for Variant[Stdlib::Fqdn = Pattern[/^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])$/], Stdlib::Compat::Ipv4 = Pattern[/^((([0-9](?!\d)|[1-9][0-9](?!\d)|1[0-9]{2}(?!\d)|2[0-4][0-9](?!\d)|25[0-5](?!\d))[.]){3}([0-9](?!\d)|[1-9][0-9](?!\d)|1[0-9]{2}(?!\d)|2[0-4][0-9](?!\d)|25[0-5](?!\d)))(\/((([0-9](?!\d)|[1-9][0-9](?!\d)|1[0-9]{2}(?!\d)|2[0-4][0-9](?!\d)|25[0-5](?!\d))[.]){3}([0-9](?!\d)|[1-9][0-9](?!\d)|1[0-9]{2}(?!\d)|2[0-4][0-9](?!\d)|25[0-5](?!\d))|[0-9]+))?$/], Stdlib::Compat::Ipv6 = Pattern[/\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?\s*$/]], got '' at /etc/puppet/modules/profile/manifests/lvs/realserver.pp:39:9 on node deployment-mediawiki-07.deployment-prep.eqiad.wmflabs

Event Timeline

the etcd_client_srv_domain is not set in cloud/deployment-prep. seems this needs to be added in ./hieradata/labs/deployment-prep/common.yaml

hrm, is this possibly from 06abb84939331458b1187fd9c3a562b82731b329 by @jbond ? Not exactly sure what lvs/realserver.pp is attempting to pass here, but it's making type checking angry it seems.

"parameter 'srv_domain' expects a match for Variant[Stdlib::Fqdn" that is the new thing. until recently the parameter did not have a data type. so it was accepted when it was simply not set. but now it is more strict and expects an FQDN here.

For production that value is "conftool.%{::site}.wmnet"and in `hieradata/common.yaml.
`
The question is what value to use for cloud (or to change code to allow it to be undefined).

Change 564116 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] hieradata/labs: add etcd srv_domain parameter for deployment-prep

https://gerrit.wikimedia.org/r/564116

Change 564116 merged by Dzahn:
[operations/puppet@production] hieradata/labs: add etcd srv_domain parameter for deployment-prep

https://gerrit.wikimedia.org/r/564116

The above patch is live on the beta cluster puppet master:

deployment-puppetmaster03
git l
* … local cherry picks (HEAD) …
* –
* a83f55682f - (origin/production, origin/HEAD) codesearch: fix parameters of apt::package_from:component (53 minutes ago) <Daniel Zahn>
* 6b9d61fd99 - codesearch: Install docker-ce from thirdparty/kubeadm-k8s component (60 minutes ago) <Kunal Mehta>
* 23f4fab5f4 - microsites: replace hiera() with lookup() (65 minutes ago) <Daniel Zahn>
* f9cbd1a861 - releases: replace hiera() with lookup() (71 minutes ago) <Daniel Zahn>
* b51e074fd3 - hieradata/labs: add etcd srv_domain parameter for deployment-prep (5 hours ago) <Daniel Zahn>
* …

But puppet agent still fails the same way:

deployment-mediawiki-07
deployment-mediawiki-07:/var/log$ sudo puppet agent -tv
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Resource Statement, Evaluation Error: Error while evaluating a Function Call, Class[Profile::Conftool::Client]: parameter 'srv_domain' expects a match for Variant[Stdlib::Fqdn = Pattern[/^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])$/], Stdlib::Compat::Ipv4 = Pattern[/^((([0-9](?!\d)|[1-9][0-9](?!\d)|1[0-9]{2}(?!\d)|2[0-4][0-9](?!\d)|25[0-5](?!\d))[.]){3}([0-9](?!\d)|[1-9][0-9](?!\d)|1[0-9]{2}(?!\d)|2[0-4][0-9](?!\d)|25[0-5](?!\d)))(\/((([0-9](?!\d)|[1-9][0-9](?!\d)|1[0-9]{2}(?!\d)|2[0-4][0-9](?!\d)|25[0-5](?!\d))[.]){3}([0-9](?!\d)|[1-9][0-9](?!\d)|1[0-9]{2}(?!\d)|2[0-4][0-9](?!\d)|25[0-5](?!\d))|[0-9]+))?$/], Stdlib::Compat::Ipv6 = Pattern[/\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?\s*$/]], got '' at /etc/puppet/modules/profile/manifests/lvs/realserver.pp:39:9 on node deployment-mediawiki-07.deployment-prep.eqiad.wmflabs
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run

It didn't work because at the Horizon layer there was a project-level override for this Hiera. This can be edited via https://horizon.wikimedia.org/project/puppet/ and also seen publicly via Gitiles.

The override was added in November 2019, probably because something used the key but not in a significant way yet, so as to overcome a puppet failure at the time. Which expains why the override used the bogus empty string:

etcd_client_srv_domain: ''
Krinkle claimed this task.

Puppet agent now runs cleanly in Beta.

@Krinkle cool, thanks for finding the additional override :)