I just now broke the labspuppet backend API such that it failed to match any host. When that happens I'd like puppet runs to fail, but what happens instead is that puppet still runs and silently assumes a default config. That can get pretty messy.
Description
Details
| Subject | Repo | Branch | Lines +/- |
|---|---|---|---|
| cloud-vps: Change NFS mounts to default to false | operations/puppet | production | +2 -2 |
| Status | Assigned | Task |
|---|---|---|
| Resolved | None | T262350 bad failure cases for wmcs custom puppet enc |
| Resolved | taavi | T267082 Rebuild Toolforge servers that should not have NFS mounted (and with affinity) |
| Resolved | dcaro | T267140 [toolsbeta] Rebuild servers to learn how to take down the services without downtime (and use affinities) |
| Resolved | taavi | T252239 Rebuild tools-k8s-haproxy-* as an anti-affinity server group |
| Resolved | taavi | T278541 Toolforge: migrate redis servers to Debian Buster or later |
| Resolved | taavi | T153810 Make switching Redis server simpler |
| Resolved | dcaro | T309014 sentinel and puppet overwriting toolforge redis config |
| Resolved | dcaro | T279723 Remove 2 nodes from the tools-k8s-etcd cluster |
Event Timeline
As far as I understand the bug that triggered this, from the client side it would have looked exactly the same as an instance with no roles applied. I'm not sure what kind of client validation check could have halted the runs really.
Just found a frustrating side effect of this. Hosts that are not supposed to have traffic shaping and NFS now have both, because once that's applied there is no going back without lots of surgery on the VM or simply scrapping it and rebuilding.
On ingress and proxy nodes that makes them potentially problematic.
Can we not flip the default so that if the API fails, it defaults to not having NFS (which shouldn't remove anything I'd think) instead of adding NFS?
We totally could. However, that is a change in behavior for other Cloud tenants, so maybe we'd want to include commits to /labs/private for existing tenants as part of it so their behavior doesn't change unless they want it to. I do think "opt-in" NFS mounts make much more sense.
Hmm, we don't even need to do that because things don't unmount. It only affects new VMs. Maybe we could flip the default with an announcement instead.
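To make the "flip the default" idea concrete, here is a minimal Python sketch of the lookup logic, not the actual puppet change. It assumes the hiera data arrives as a plain dict and that the setting lives under a key called `mount_nfs`; both the key name and the function name are hypothetical and only for illustration.

```python
def nfs_mount_enabled(hieradata, default=False):
    """Decide whether NFS should be mounted on a host.

    `hieradata` is assumed to be a dict of hiera keys for the host.
    `mount_nfs` is a hypothetical key name used for illustration.
    With default=False, an empty or failed ENC answer means "no NFS".
    """
    value = hieradata.get("mount_nfs", default)
    # Only an explicit boolean True opts the host in to NFS.
    return value is True


# An empty answer (e.g. the API matched no host) no longer adds NFS.
print(nfs_mount_enabled({}))
# A tenant that wants NFS opts in explicitly.
print(nfs_mount_enabled({"mount_nfs": True}))
```

The point of the design is that the failure case and the "nothing configured" case both land on the state that is easy to change later, since mounts don't get removed once applied.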
Change 639297 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] cloud-vps: Change NFS mounts to default to false
Change 639297 merged by Bstorm:
[operations/puppet@production] cloud-vps: Change NFS mounts to default to false
legoktm> could you have an enabled: true hiera key or something that the enc sets and which puppet will refuse to run if not present? to distinguish a barf with no hieradata from an instance that has no extra hieradata?
+1
I'd name it 'safeguard/safecheck/encgenerated/generatedbyenc: "hostname.of.the.enc"/generatedat: "YYYY-mm-dd HH:MM:SS" ...' or something a bit more meaningful, but I think it's a good idea :)
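A rough sketch of that safeguard check, written in Python rather than puppet so it can be run standalone. It assumes the ENC output is available as a dict; the key name `generated_by_enc`, the exception class and the function name are all hypothetical, picked from the naming suggestions above and not taken from the real ENC.

```python
class EncDataError(Exception):
    """Raised when the ENC answer does not carry the sentinel key."""


# Hypothetical sentinel key that a healthy ENC would always set.
SENTINEL_KEY = "generated_by_enc"


def validate_enc_data(hieradata):
    """Refuse to continue unless the sentinel key is present and non-empty.

    A host with no extra hieradata still gets the sentinel from the ENC,
    so a completely empty answer can only mean the ENC lookup failed.
    Returns the hieradata without the sentinel key.
    """
    if not isinstance(hieradata, dict) or not hieradata.get(SENTINEL_KEY):
        raise EncDataError(
            "ENC answer is missing %r; refusing to apply defaults" % SENTINEL_KEY
        )
    return {k: v for k, v in hieradata.items() if k != SENTINEL_KEY}


# Healthy ENC, host with no roles: passes, leaving no extra data.
print(validate_enc_data({"generated_by_enc": "hostname.of.the.enc"}))

# Broken ENC that matched no host: the run is halted instead of defaulting.
try:
    validate_enc_data({})
except EncDataError as err:
    print("run aborted:", err)
```

Using the ENC hostname or a timestamp as the value, as suggested above, would also make it easier to see which backend produced a given answer when debugging.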