
Puppet fails only once when restarting ferm is not successful
Closed, Resolved (Public)

Description

When service ferm restart, as invoked by Puppet, fails, only the current Puppet run is marked as failed. Subsequent Puppet runs succeed because the ferm restart is not invoked again, since the rules haven't changed. This is normally OK, but there are corner cases, such as @resolve failing for some reason (e.g. AAAA records missing in https://gerrit.wikimedia.org/r/#/c/337384/), where a failed ferm restart can go undetected.

For systems >= jessie we catch this failure via the generic systemd 'one or more units are failed' Icinga check, but in other cases, such as on eventlog1001 (see https://phabricator.wikimedia.org/T157022#2997068), a failing ferm restart can go undetected.
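
To make the failure mode described above concrete, here is a toy model in Python. It is only a sketch of the reasoning, not how Puppet or ferm are actually implemented: Host, puppet_run, and restart_succeeds are hypothetical names, and the assumption is simply that the restart is triggered only when the rules change.

```python
from dataclasses import dataclass


@dataclass
class Host:
    applied_rules: str          # rules the last Puppet run wrote out (hypothetical field)
    ferm_healthy: bool = True   # whether ferm is currently running with valid rules


def puppet_run(host: Host, desired_rules: str, restart_succeeds: bool) -> bool:
    """Toy model of one Puppet run; returns True if the run reports success."""
    if desired_rules == host.applied_rules:
        # Rules unchanged: the restart is never triggered, so nothing can fail,
        # even if ferm is still broken from an earlier run.
        return True
    # Rules changed: write them out and trigger the restart.
    host.applied_rules = desired_rules
    host.ferm_healthy = restart_succeeds
    return restart_succeeds


host = Host(applied_rules="v1")
first = puppet_run(host, "v2", restart_succeeds=False)   # rules change, restart fails
second = puppet_run(host, "v2", restart_succeeds=False)  # no change, restart skipped
print(first, second, host.ferm_healthy)  # prints "False True False"
```

In this model the second run reports success while ferm_healthy is still False, which is the gap this task is about: something other than the Puppet run status has to check the state of ferm.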

Event Timeline

Ottomata triaged this task as Medium priority. Mar 6 2017, 7:38 PM
Ottomata subscribed.

Can't Puppet just ensure => 'running' on the ferm service? Or is ferm a special case, and not a Puppet service resource type?

I believe this has been fixed by the addition of the ferm-status script, but please re-open if I'm missing something.

https://gerrit.wikimedia.org/r/c/operations/puppet/+/576101

jbond claimed this task.
jbond added a project: Puppet.