Receiving puppet run failure alert for instance where manual puppet runs complete fine
Closed, Resolved (Public)

Description

I regularly receive an email with the subject "Alert: puppet failed on maps-wma1.maps.eqiad.wmflabs". There was an issue, caused by my pinning certain apt sources, that I have since resolved. Manual puppet runs now yield:

dschwen@maps-wma1:~$ sudo su -
root@maps-wma1:~# puppet agent -tv
info: Retrieving plugin
info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
info: Caching catalog for maps-wma1.maps.eqiad.wmflabs
info: Applying configuration version '1457552575'
notice: Finished catalog run in 20.26 seconds
root@maps-wma1:~# echo $?
0

Which looks fine to me.

Event Timeline

Can you tell me when you fixed the puppet issue, and when you received your most recent email nag?

Last nag: ~13h ago. Fixed three days ago.

dschwen added a project: Cloud-Services.
dschwen added a subscriber: Andrew.
yuvipanda subscribed.

Just got another one

Received: from root by maps-wma1.maps.eqiad.wmflabs with local (Exim 4.76)
	(envelope-from <root@wmflabs.org>)
	id 1af1B1-0002vD-1l
	for xxxxxxxx@xxxxxx.xxx; Sun, 13 Mar 2016 08:15:03 +0000
Date: Sun, 13 Mar 2016 08:15:03 +0000
To: xxxxxxxx@xxxxxx.xxx
Subject: Alert:  puppet failed on maps-wma1.maps.eqiad.wmflabs

The emails are sent when puppet has not run for 24 hours. Specifically, the code checks the 'last_run' parameter in /var/lib/puppet/state/last_run_summary.yaml. On maps-wma1, that file gives the last run as 1457552917 (9 March 2016, 19:48 UTC). That seems to coincide with your manual run, which suggests that the crontab is not working.
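For anyone who wants to reproduce that check by hand, here is a minimal sketch. It is not the actual alerting code, which is not shown in this task. The function names `last_run_epoch` and `is_stale` are made up for illustration, and the sketch assumes the summary file holds a line of the form `last_run: <epoch seconds>`.

```shell
#!/bin/bash
# Sketch only: hypothetical helpers, not the real alerting script.

# Print the epoch timestamp stored on the "last_run:" line of a summary file.
# Leading indentation in the YAML is ignored by awk's default field splitting.
last_run_epoch() {
    awk '$1 == "last_run:" { print $2; exit }' "$1"
}

# Succeed (exit 0) when last_run lies more than max_age seconds before now.
# max_age defaults to 86400 seconds, the 24 hour threshold described above.
is_stale() {
    local last_run=$1 now=$2 max_age=${3:-86400}
    [ $(( now - last_run )) -gt "$max_age" ]
}

# Example usage on the instance (commented out so the sketch has no side effects):
# summary=/var/lib/puppet/state/last_run_summary.yaml
# if is_stale "$(last_run_epoch "$summary")" "$(date +%s)"; then
#     echo "puppet has not run for more than 24 hours"
# fi
```

With GNU date, `date -u -d @1457552917` turns the raw timestamp into a readable UTC date.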

root@maps-wma1:~# bash -x /usr/local/sbin/puppet-run
+ set -e
+ touch /var/log/puppet.log
+ chmod 600 /var/log/puppet.log
++ puppet agent --configprint agent_catalog_run_lockfile
+ PUPPETLOCK='invalid parameter: agent_catalog_run_lockfile'

The agent_catalog_run_lockfile setting is rejected as an invalid parameter, which might have something to do with the puppet version:

root@maps-wma1:~# puppet agent --version
2.7.11

whereas

valhallasw@tools-bastion-05:~$ puppet agent --version
3.4.3

2.7.11 is the version packaged by Ubuntu, while 3.4.3 is the one that should be installed (from apt.wm.o). I suggest upgrading; that will probably make the crontab work as expected again.
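As an aside on the trace above: the error message ended up in PUPPETLOCK where a file path was expected. A wrapper script could guard against that along the following lines. This is only a sketch under assumptions. `lookup_lockfile` is a hypothetical helper and not part of /usr/local/sbin/puppet-run, and it assumes the setting should always resolve to an absolute path.

```shell
#!/bin/bash
# Hypothetical helper, not part of the real puppet-run script.
# Runs the given command and accepts its output only if the command
# succeeded and printed something that looks like an absolute path.
lookup_lockfile() {
    local value
    # Assign separately from "local" so the command's exit status is not masked.
    if ! value=$("$@" 2>/dev/null); then
        echo "lockfile lookup failed: $*" >&2
        return 1
    fi
    case $value in
        /*) printf '%s\n' "$value" ;;
        *)  echo "unexpected lockfile value: $value" >&2; return 1 ;;
    esac
}

# Example usage (commented out; requires puppet on the host):
# PUPPETLOCK=$(lookup_lockfile puppet agent --configprint agent_catalog_run_lockfile) || exit 1
```

A check like this makes the failure visible in the log instead of leaving the cron job to stop quietly, which is what seems to have happened here.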

Thanks, I upgraded puppet. Let's see if that makes me compliant again :-)

dschwen claimed this task.

No further emails received. Closing. Thanks!