Puppet failing on tools-sgebastion-10
Closed, Resolved (Public)

Description

As I write this: The last Puppet run was at Tue May 9 21:44:58 UTC 2023 (174 minutes ago).

May  9 22:14:57 tools-sgebastion-10 puppet-agent[1732]: Unable to fetch my node definition, but the agent run will continue:
May  9 22:14:57 tools-sgebastion-10 puppet-agent[1732]: Error 500 on SERVER: Server Error: Failed to find tools-sgebastion-10.tools.eqiad1.wikimedia.cloud via exec: Execution of '/usr/local/bin/puppet-enc tools-sgebastion-10.tools.eqiad1.wikimedia.cloud' returned 1: 
May  9 22:14:57 tools-sgebastion-10 puppet-agent[1732]: Retrieving pluginfacts
May  9 22:14:57 tools-sgebastion-10 puppet-agent[1732]: Retrieving plugin
May  9 22:14:57 tools-sgebastion-10 puppet-agent[1732]: Retrieving locales
May  9 22:14:58 tools-sgebastion-10 puppet-agent[1732]: Loading facts
May  9 22:14:59 tools-sgebastion-10 puppet-agent[1732]: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Failed when searching for node tools-sgebastion-10.tools.eqiad1.wikimedia.cloud: Failed to find tools-sgebastion-10.tools.eqiad1.wikimedia.cloud via exec: Execution of '/usr/local/bin/puppet-enc tools-sgebastion-10.tools.eqiad1.wikimedia.cloud' returned 1: 
May  9 22:14:59 tools-sgebastion-10 puppet-agent[1732]: Not using cache on failed catalog
May  9 22:14:59 tools-sgebastion-10 puppet-agent[1732]: Could not retrieve catalog; skipping run
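The failure signature in the log above is the ENC exec returning a non-zero status. As a rough sketch, it could be counted in log text like this. The helper name count_enc_failures is hypothetical, and the pattern is only inferred from the two error lines quoted here.

```shell
# Sketch: count log lines where the puppet-enc exec returned non-zero.
# count_enc_failures is a hypothetical helper; it reads log text on stdin.
count_enc_failures() {
    grep -c "Execution of '/usr/local/bin/puppet-enc .*' returned [1-9]"
}

# Possible usage, assuming the agent logs reach the systemd journal
# under the syslog identifier seen above:
#   journalctl -t puppet-agent --since today | count_enc_failures
```

A non-zero count would point at the ENC rather than at the agent itself.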

Event Timeline

Andrew claimed this task.
Andrew subscribed.

Thank you for reporting this! It was caused by the ENC API server crashing. I fixed it by running

# systemctl restart puppet-enc

on enc-1.cloudinfra.eqiad1.wikimedia.cloud
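For anyone repeating this, a minimal sketch of a guarded restart follows. It assumes the crashed service shows up as not active in systemd, which may not hold if the process is hung rather than dead. The helper name restart_if_inactive is hypothetical; only the unit name puppet-enc comes from this task.

```shell
# Sketch only: restart a systemd unit if systemd does not report it active.
# restart_if_inactive is a hypothetical helper, not an existing tool.
restart_if_inactive() {
    unit="$1"
    if systemctl is-active --quiet "$unit"; then
        echo "$unit is active; nothing to do"
    else
        echo "$unit is not active; restarting"
        systemctl restart "$unit"
    fi
}

# Usage (as root on the ENC host):
#   restart_if_inactive puppet-enc
```

If the unit reports active but agents still get Error 500, a plain restart as above is still the fix that worked here.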

I had to do this again today, an hour or so ago, on the same VM.