It has come up repeatedly that certain users, deployers with shell access but without global root, are missing the ability to temporarily disable puppet for testing purposes.
They are blocked because puppet reverts the local changes they make to test something.
For SRE it is considered ok to temporarily disable puppet for such a purpose as long as:
- it doesn't stay disabled for too long (Icinga alerts on this after a while, and it becomes a real issue after a week, when the host drops out of PuppetDB; the disabled state needs to be visible in monitoring)
- it happens on a host made for testing; certain hosts are and others are not. It's generally expected to use a "debug", "canary", "dev" or "test" host instead of the real deal where possible
The purpose of this task is to discuss this a bit but also treat it like a regular access request ticket.
So I am suggesting we pick a host or small group of hosts like mwdebug* where we allow this (not globally, and specifically not on bastion*; bastions should not be used for this) for a certain group like deployers, or a new group of 'puppet-disablers' if that seems more reasonable.
Then we'd add a line to their existing sudo privileges to let them run "disable-puppet <reason>" and "enable-puppet <reason>", either directly or through a wrapper around them (the one used by cumin?).
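To make the wrapper idea concrete, here is a rough sketch of what a thin wrapper could do: restrict the action to disable/enable, insist on a non-empty reason, and tag the reason with the invoking user. This is not the existing cumin wrapper; the function name `build_command` and the "reason - user" format are assumptions for illustration, and only the script names come from this task.

```python
import getpass
import os

# Script names as mentioned in this task; where they live is not assumed here.
ALLOWED_ACTIONS = {"disable": "disable-puppet", "enable": "enable-puppet"}


def build_command(action, reason, user=None):
    """Return the argv for the underlying script (hypothetical helper).

    The reason is tagged with the invoking user so that others can see
    who disabled puppet and why.
    """
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    reason = reason.strip()
    if not reason:
        raise ValueError("a non-empty reason is required")
    # Under sudo, SUDO_USER holds the original user; otherwise fall back
    # to the current login name.
    user = user or os.environ.get("SUDO_USER") or getpass.getuser()
    return [ALLOWED_ACTIONS[action], f"{reason} - {user}"]
```

The sudo rule would then only need to allow the wrapper itself, and the wrapper would pass the returned list to something like `subprocess.run`.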
Finally, it would be nice to then have some more visibility in monitoring for "puppet has been disabled for too long" on just these hosts.
And / or something in big red letters in the MOTD and on IRC so people don't leave it disabled for too long and are aware when others currently have it disabled.
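As a starting point for discussion, a "disabled for too long" check could be sketched roughly as below. It assumes the puppet agent writes a JSON lock file with a `disabled_message` key when disabled; the path, the thresholds, and the function name `check_disabled` are all assumptions, not what our current Icinga check does.

```python
import json
import os
import time

# Assumed location; the real path depends on the Puppet version and config.
DEFAULT_LOCKFILE = "/var/lib/puppet/state/agent_disabled.lock"


def check_disabled(lockfile=DEFAULT_LOCKFILE, warn_after=3600,
                   crit_after=86400, now=None):
    """Return (exit_code, message) in Icinga plugin style.

    0 = OK, 1 = WARNING, 2 = CRITICAL. Thresholds are in seconds and
    are placeholder values.
    """
    now = time.time() if now is None else now
    try:
        mtime = os.stat(lockfile).st_mtime
    except FileNotFoundError:
        return 0, "OK: puppet is enabled"

    # Try to recover the reason given when puppet was disabled.
    try:
        with open(lockfile) as fh:
            reason = json.load(fh).get("disabled_message", "no reason given")
    except (OSError, ValueError, AttributeError):
        reason = "reason unreadable"

    age = int(now - mtime)
    detail = f"puppet disabled for {age}s ({reason})"
    if age >= crit_after:
        return 2, f"CRITICAL: {detail}"
    if age >= warn_after:
        return 1, f"WARNING: {detail}"
    return 0, f"OK: {detail}"
```

The same message string could be reused for the MOTD or an IRC notification, so people see both how long puppet has been disabled and the reason given.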