We have had several cases of helm releases ending up in an unclean state, leaving deployers confronted with error messages like:
command "/usr/bin/helm3" exited with non-zero status STDERR: Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
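This can happen on ^C during helmfile apply, on terminated connections, etc. It looks a bit spooky at first because a plain "helm list" returns no releases: by default it only shows deployed and failed releases, so a pending one only shows up with --all or --pending.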
root@deploy1002:~# kube_env admin staging
root@deploy1002:~# helm -n eventstreams list
NAME  NAMESPACE  REVISION  UPDATED  STATUS  CHART  APP VERSION
root@deploy1002:~# helm -n eventstreams list --all
NAME        NAMESPACE     REVISION  UPDATED                                  STATUS           CHART               APP VERSION
production  eventstreams  4         2022-03-17 21:26:50.082844301 +0000 UTC  pending-upgrade  eventstreams-0.4.1
root@deploy1002:~# helm -n eventstreams history production
REVISION  UPDATED                   STATUS           CHART               APP VERSION  DESCRIPTION
1         Thu Nov  4 13:14:20 2021  superseded       eventstreams-0.3.3               Install complete
2         Thu Jan 27 16:27:35 2022  superseded       eventstreams-0.4.0               Upgrade complete
3         Wed Mar  2 18:11:54 2022  deployed         eventstreams-0.4.1               Upgrade complete
4         Thu Mar 17 21:26:50 2022  pending-upgrade  eventstreams-0.4.1               Preparing upgrade
root@deploy1002:~# kubectl -n eventstreams get secret --field-selector 'type=helm.sh/release.v1'
NAME                               TYPE                DATA  AGE
sh.helm.release.v1.production.v1   helm.sh/release.v1  1     223d
sh.helm.release.v1.production.v2   helm.sh/release.v1  1     138d
sh.helm.release.v1.production.v3   helm.sh/release.v1  1     104d
sh.helm.release.v1.production.v4   helm.sh/release.v1  1     89d
It would be nice to alert on such cases, which I think could be identified by periodic runs of something like:
root@deploy1002:~# helm list -A --failed --pending
NAME        NAMESPACE              REVISION  UPDATED                                  STATUS           CHART               APP VERSION
main        eventstreams-internal  4         2022-03-17 21:36:11.869687569 +0000 UTC  pending-upgrade  eventstreams-0.4.1
production  eventstreams           4         2022-03-17 21:26:50.082844301 +0000 UTC  pending-upgrade  eventstreams-0.4.1
I'm currently not 100% certain that helm does the right thing here, as "helm list -A --failed --pending --superseded" also lists only the pending releases. That might be a bug (or PEBCAK).
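The periodic check suggested above could be sketched roughly as follows. This is only a minimal sketch, not an agreed design: it assumes the same "helm list -A --failed --pending" query is run with "-o json" so that no column parsing is needed, and the names stuck_releases, main and ALERT_STATUSES are made up for illustration. How the result would be fed into alerting (cron mail, a monitoring check, etc.) is left open.

```python
import json
import subprocess

# Statuses to alert on (hypothetical selection): the pending-* states Helm 3
# reports for interrupted operations, plus "failed" since the query asks for it.
ALERT_STATUSES = {"failed", "pending-install", "pending-upgrade", "pending-rollback"}


def stuck_releases(helm_json):
    """Return sorted (namespace, name, status) tuples for releases needing attention."""
    releases = json.loads(helm_json) or []
    return sorted(
        (r["namespace"], r["name"], r["status"])
        for r in releases
        if r.get("status") in ALERT_STATUSES
    )


def main():
    """Run the helm query and return 1 if any release is stuck, 0 otherwise."""
    # Same query as in the task description, but with JSON output.
    out = subprocess.run(
        ["helm", "list", "-A", "--failed", "--pending", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    stuck = stuck_releases(out)
    for namespace, name, status in stuck:
        print(f"{namespace}/{name}: {status}")
    # A non-zero return value lets a wrapper or monitoring check pick it up.
    return 1 if stuck else 0
```

A wrapper script would call sys.exit(main()) with the desired kube environment already selected. Filtering on the status again in stuck_releases is deliberate: given the uncertainty about how the list flags combine, it avoids relying on helm's own filtering alone.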
If you're coming here because you found yourself in this situation, the way out of it is to roll back to the last "deployed" revision (revision 3 in the example above); see: https://wikitech.wikimedia.org/wiki/Kubernetes/Deployments#Rolling_back_in_an_emergency
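To illustrate how the revision to roll back to can be picked from the history, here is a small sketch. It assumes the output of "helm -n <namespace> history <release> -o json" is passed in as a string; the function names are hypothetical, and it only builds a plain "helm rollback" command without running it. The linked wiki page remains the authoritative procedure.

```python
import json


def last_deployed_revision(history_json):
    """Return the highest revision whose status is "deployed", or None."""
    deployed = [
        int(entry["revision"])
        for entry in json.loads(history_json)
        if entry.get("status") == "deployed"
    ]
    return max(deployed) if deployed else None


def rollback_command(namespace, release, history_json):
    """Build (but do not run) the rollback command for the last deployed revision."""
    revision = last_deployed_revision(history_json)
    if revision is None:
        raise ValueError(f"no deployed revision found for {namespace}/{release}")
    return ["helm", "-n", namespace, "rollback", release, str(revision)]
```

For the history shown above (revisions 1 and 2 superseded, 3 deployed, 4 pending-upgrade), last_deployed_revision picks revision 3, matching the manual choice.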