Puppet runs on tools-webgrid-06 took longer than usual, probably due to some issue with virt1000; this caused puppet-run's timeout to kick in and kill the Puppet run:
[…] Notice: Caught TERM; calling stop
Subsequent runs of puppet-run found the stale lock file and stopped short of starting Puppet:
Skipping this run, puppet agent already running at pid 4685
However, no process was running as pid 4685, so the lock was stale, and manual puppet agent -tv invocations started happily (with their own locking working properly, i.e. "Notice: Run of Puppet configuration client already in progress; skipping (/var/lib/puppet/state/agent_catalog_run.lock exists)").
The locking check was added in https://gerrit.wikimedia.org/r/#/c/196162/; I don't understand what it is meant to protect against or how it would do that, but it's not working at the moment.
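
For illustration, one way a wrapper could tell a stale lock from a live one is to check whether the pid recorded in the lock file still exists. The following is a minimal sketch in Python, not what puppet-run actually does: the function names are hypothetical, and it assumes the lock file contains nothing but the pid as text.

```python
import os


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this pid currently exists."""
    if pid <= 0:
        return False
    try:
        # Signal 0 performs only the existence/permission check;
        # no signal is actually delivered.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def lock_is_stale(lock_path: str) -> bool:
    """Return True if the lock file names a pid that is no longer running.

    Assumption: the lock file holds only the pid as decimal text.
    """
    try:
        with open(lock_path) as lock_file:
            pid = int(lock_file.read().strip())
    except FileNotFoundError:
        return False  # no lock file, so nothing to consider stale
    except ValueError:
        return True  # unparsable content is treated as stale (assumption)
    return not pid_is_running(pid)
```

With a check like this, a run that finds a lock pointing at a dead pid (such as 4685 above) could remove the lock and proceed instead of skipping. Note that pids can be reused, so this check alone cannot prove that the running process is really a Puppet agent.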