dns + puppet failures on many cloud hosts
Closed, Resolved (Public)

Description

Lots of cloud hosts can't run puppet right now. On cloudservices1006:

Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Function Call, failed ot resolv 'cloudsw-c8.private.eqiad.wikimedia.cloud' (file: /etc/puppet/modules/profile/manifests/wmcs/cloud_private_subnet.pp, line: 41, column: 55) on node cloudservices1006.eqiad.wmnet

On cloudservices1005:

Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Function Call, failed ot resolv 'cloudsw-d5.private.eqiad.wikimedia.cloud' (file: /etc/puppet/modules/profile/manifests/wmcs/cloud_private_subnet.pp, line: 41, column: 55) on node cloudservices1005.eqiad.wmnet
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run

Seems to be broken as of about two and a half hours ago.

Event Timeline

Looks like those records were accidentally changed in rONED621240f76294700c501b02ac38d773892ae06d44, probably as a part of the cloudvirt mass address assignment. I'll fix those in Netbox.
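
Once the records are restored, a quick check along these lines could confirm that the affected names resolve again. This is only a minimal sketch: the helper name `unresolved` is hypothetical, the two hostnames are the ones from the Puppet errors above, and it uses whatever resolver the host is configured with, which may not be the one the Puppet server uses.

```python
import socket

# Hostnames copied from the Puppet errors in the description.
NAMES = [
    "cloudsw-c8.private.eqiad.wikimedia.cloud",
    "cloudsw-d5.private.eqiad.wikimedia.cloud",
]


def unresolved(names, lookup=socket.getaddrinfo):
    """Return the names that the given lookup function fails to resolve.

    `lookup` defaults to the system resolver and is injectable so the
    helper can be exercised without network access.
    """
    failed = []
    for name in names:
        try:
            lookup(name, None)
        except socket.gaierror:
            failed.append(name)
    return failed


if __name__ == "__main__":
    for name in unresolved(NAMES):
        print(f"cannot resolve {name}")
```

If the script prints nothing, every name in the list resolved from that host; a Puppet run is still the real confirmation.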