We've seen maintain-kubeusers fail at least twice since we moved etcd to Ceph (see T267966). In each case an etcd timeout caused its k8s requests to fail, and it could not continue without manual intervention because it had already completed creation of the CSR.
Fix this weakness by having it either clean up the CSR it left behind or recognize the existing valid CSR and use it instead, whichever is easier or better.
That way, a restart triggered by a failure will restore functionality. While in there, also look for other areas where restart on failure won't work.
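
A rough sketch of the "reuse or clean up" idea, to make the intended behavior concrete. This is not the maintain-kubeusers code: `CSR`, `FakeCSRStore`, `ensure_csr`, and the status strings are all hypothetical names, and the in-memory store stands in for the real Kubernetes CSR API. It also assumes a leftover CSR is only reusable when its request body matches what we would submit now (in practice that depends on the matching private key having survived the failure).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CSR:
    """Hypothetical, simplified view of a certificate signing request."""
    name: str
    request: str              # PEM-encoded request body
    status: str = "Pending"   # assumed values: "Pending", "Approved", "Denied"


class AlreadyExists(Exception):
    """Raised by the fake store when a CSR with that name is already present."""


@dataclass
class FakeCSRStore:
    """In-memory stand-in for the cluster's CSR API (hypothetical)."""
    items: Dict[str, CSR] = field(default_factory=dict)

    def get(self, name: str) -> Optional[CSR]:
        return self.items.get(name)

    def create(self, csr: CSR) -> CSR:
        if csr.name in self.items:
            raise AlreadyExists(csr.name)
        self.items[csr.name] = csr
        return csr

    def delete(self, name: str) -> None:
        # Deleting a missing CSR is a no-op so cleanup is safe to repeat.
        self.items.pop(name, None)


def ensure_csr(store: FakeCSRStore, name: str, request: str) -> CSR:
    """Return a usable CSR for `name`, whether or not a previous run left one."""
    existing = store.get(name)
    if existing is not None:
        # A previous run got this far before failing: reuse its CSR if valid.
        if existing.request == request and existing.status != "Denied":
            return existing
        # Otherwise it is stale or rejected, so clean it up and start over.
        store.delete(name)
    return store.create(CSR(name=name, request=request))
```

With something shaped like this, calling `ensure_csr` a second time after a crash returns the CSR from the first run instead of failing on an "already exists" error, so a plain restart is enough to recover.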