Work around https://tickets.puppetlabs.com/browse/PUP-1070 (fixed in Puppet 3.7.0; our precise/trusty hosts are on 3.4.3, so only jessie is okay). Per <ori>, this could be done by provisioning, say, a script that removes /var/lib/puppet/state/agent_catalog_run.lock on boot, or a cron job that removes it when no puppet processes are running.
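A minimal sketch of the cron-based variant, assuming the lock path from the task; the script name, the function, and the injectable process check are hypothetical:

```shell
#!/bin/sh
# Hypothetical cron cleanup script (structure and names are assumptions).
# Removes the agent lock file only when no puppet process is running,
# so an active run is never disturbed.

# clean_stale_lock LOCKFILE [CHECK_CMD]
# CHECK_CMD (default: pgrep -x puppet) must succeed iff puppet is running.
clean_stale_lock() {
    lock="$1"
    check="${2:-pgrep -x puppet}"
    if [ -e "$lock" ] && ! $check >/dev/null 2>&1; then
        rm -f "$lock"
    fi
}

clean_stale_lock /var/lib/puppet/state/agent_catalog_run.lock
```

A cron entry running it every few minutes would then pick up stale locks regardless of how the instance went down (the install path is hypothetical): `*/10 * * * * root /usr/local/sbin/clean-puppet-lock`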
Event Timeline
Automatically cleaning up such files makes me feel uneasy. In Labs, I've seen this scenario (as I understand it) only when instances froze or were rebooted during a Puppet run. Shinken and puppetalert.py will inform the project administrators about Puppet staleness, and the freeze/reboot is usually fresh in memory, so the reason is obvious when looking at the lock file. I wouldn't mind if the clean-up remained a manual task.
Could we hook into molly-guard? It already runs when a user tries to reboot the machine and makes you type the hostname to confirm you are sure. Maybe it could stop Puppet before the reboot, or at least tell the user to do so.
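molly-guard runs the executables in /etc/molly-guard/run.d/ before allowing an interactive shutdown, so a hook there could stop the agent first. A rough sketch, where the hook file name and the `service puppet stop` command are assumptions (adjust to the init system in use):

```shell
#!/bin/sh
# Hypothetical molly-guard hook, e.g. /etc/molly-guard/run.d/20-stop-puppet
# (file name is an assumption). Stops a running Puppet agent before the
# shutdown/reboot proceeds, so no catalog run is interrupted.
if pgrep -x puppet >/dev/null 2>&1; then
    echo "Stopping puppet agent before shutdown ..."
    service puppet stop
fi
```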
molly-guard only works interactively, and I'm not sure whether the problems are connected to a "regular" shutdown (i.e., calling /usr/sbin/shutdown) or whether OpenStack just pulls the VM's power plug.
Anyhow, removing the file after (re)boot could be done via an Upstart job. But I'm still not convinced that doing it automatically is a good idea.
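For reference, such an Upstart job could be as small as this; the job name is an assumption, and the lock path is the one from the task description:

```
# /etc/init/remove-stale-puppet-lock.conf  (hypothetical job name)
description "remove Puppet agent lock left over by an interrupted run"
start on startup
task
exec rm -f /var/lib/puppet/state/agent_catalog_run.lock
```

Since the job runs as a one-shot task at startup, before the agent itself comes up, it should not race against a legitimate in-progress run.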
I don't remember experiencing this very often in production. Anyway, removing the file on boot if it exists seems easy enough.