Missing Kubernetes credentials for the ores-inspect tool were reported on IRC. Investigation showed that the maintain-kubeusers pod was in CrashLoopBackOff. The error it logged before each crash was:
$ kubectl logs -f maintain-kubeusers-7f7b44754c-sffzd -n maintain-kubeusers
starting a run
CSR for tool-ores-inspect already exists, deleting
Traceback (most recent call last):
  File "/app/maintain_kubeusers/k8s_api.py", line 152, in generate_csr
    self.create_new_csr(private_key, user, org_name)
  File "/app/maintain_kubeusers/k8s_api.py", line 145, in create_new_csr
    self.certs.create_certificate_signing_request(body=csr_body)
  File "/app/venv/lib/python3.7/site-packages/kubernetes/client/api/certificates_v1beta1_api.py", line 57, in create_certificate_signing_request
    (data) = self.create_certificate_signing_request_with_http_info(body, **kwargs) # noqa: E501
  File "/app/venv/lib/python3.7/site-packages/kubernetes/client/api/certificates_v1beta1_api.py", line 141, in create_certificate_signing_request_with_http_info
    collection_formats=collection_formats)
  File "/app/venv/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 345, in call_api
    _preload_content, _request_timeout)
  File "/app/venv/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 176, in __call_api
    _request_timeout=_request_timeout)
  File "/app/venv/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 388, in request
    body=body)
  File "/app/venv/lib/python3.7/site-packages/kubernetes/client/rest.py", line 278, in POST
    body=body)
  File "/app/venv/lib/python3.7/site-packages/kubernetes/client/rest.py", line 231, in request
    raise ApiException(http_resp=r)
kubernetes.client.rest.ApiException: (409)
Reason: Conflict
HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'Date': 'Fri, 26 Feb 2021 19:52:10 GMT', 'Content-Length': '306'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"certificatesigningrequests.certificates.k8s.io \"tool-ores-inspect\" already exists","reason":"AlreadyExists","details":{"name":"tool-ores-inspect","group":"certificates.k8s.io","kind":"certificatesigningrequests"},"code":409}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/maintain_kubeusers.py", line 7, in <module>
    runpy.run_module("maintain_kubeusers", run_name="__main__")
  File "/usr/lib/python3.7/runpy.py", line 208, in run_module
    return _run_code(code, {}, init_globals, run_name, mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/app/maintain_kubeusers/__main__.py", line 7, in <module>
    main()
  File "/app/maintain_kubeusers/cli.py", line 163, in main
    tools, cur_users["tools"], k8s_api, args.gentle_mode
  File "/app/maintain_kubeusers/utils.py", line 42, in process_new_users
    k8s_api.generate_csr(user_list[user_name].pk, user_name)
  File "/app/maintain_kubeusers/k8s_api.py", line 162, in generate_csr
    user, body=client.V1DeleteOptions()
  File "/app/venv/lib/python3.7/site-packages/kubernetes/client/api/certificates_v1beta1_api.py", line 168, in delete_certificate_signing_request
    (data) = self.delete_certificate_signing_request_with_http_info(name, **kwargs) # noqa: E501
  File "/app/venv/lib/python3.7/site-packages/kubernetes/client/api/certificates_v1beta1_api.py", line 261, in delete_certificate_signing_request_with_http_info
    collection_formats=collection_formats)
  File "/app/venv/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 345, in call_api
    _preload_content, _request_timeout)
  File "/app/venv/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 176, in __call_api
    _request_timeout=_request_timeout)
  File "/app/venv/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 411, in request
    body=body)
  File "/app/venv/lib/python3.7/site-packages/kubernetes/client/rest.py", line 268, in DELETE
    body=body)
  File "/app/venv/lib/python3.7/site-packages/kubernetes/client/rest.py", line 231, in request
    raise ApiException(http_resp=r)
kubernetes.client.rest.ApiException: (404)
Reason: Not Found
HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'Date': 'Fri, 26 Feb 2021 19:52:10 GMT', 'Content-Length': '286'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"certificatesigningrequests.certificates.k8s.io \"ores-inspect\" not found","reason":"NotFound","details":{"name":"ores-inspect","group":"certificates.k8s.io","kind":"certificatesigningrequests"},"code":404}
Manual inspection of CSRs showed:
$ kubectl get csr
NAME                AGE     REQUESTOR                                                  CONDITION
tool-ores-inspect   3h33m   system:serviceaccount:maintain-kubeusers:user-maintainer   Approved,Issued
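There is a naming mismatch in the code: the cleanup that deletes an already existing CSR does not add the "tool-" prefix to the CSR name. The create call conflicts on "tool-ores-inspect" (409), but the delete call asks for "ores-inspect" and gets a 404, so the stale CSR is never removed and the pod crashes on every run. Manually running kubectl delete csr/tool-ores-inspect unblocked things and got the service running as expected again.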
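A possible shape for the fix is to derive the CSR name in one place and use it for both the create and the delete. The following is only a minimal sketch of that idea, not the actual maintain-kubeusers code: csr_name, FakeCsrApi, Conflict and NotFound are hypothetical names, and the in-memory class merely stands in for the Kubernetes certificates API so the example can run on its own.

```python
# Hypothetical sketch: none of these names come from maintain-kubeusers itself.
CSR_PREFIX = "tool-"


def csr_name(user: str) -> str:
    """Return the CSR object name for a tool account, e.g. 'tool-ores-inspect'."""
    return f"{CSR_PREFIX}{user}"


class Conflict(Exception):
    """Stands in for the API server's 409 AlreadyExists response."""


class NotFound(Exception):
    """Stands in for the API server's 404 NotFound response."""


class FakeCsrApi:
    """In-memory stand-in for the certificates API, keyed by CSR name."""

    def __init__(self):
        self.csrs = {}

    def create(self, name, body):
        if name in self.csrs:
            raise Conflict(name)
        self.csrs[name] = body

    def delete(self, name):
        if name not in self.csrs:
            raise NotFound(name)
        del self.csrs[name]


def generate_csr(api, user, body):
    """Create the CSR, replacing a stale one left over from an earlier run."""
    name = csr_name(user)  # one name, reused for both create and delete
    try:
        api.create(name, body)
    except Conflict:
        api.delete(name)  # same prefixed name, so the stale CSR is found
        api.create(name, body)
    return name


api = FakeCsrApi()
api.csrs["tool-ores-inspect"] = "stale request"  # leftover from an earlier run
generate_csr(api, "ores-inspect", "fresh request")
print(api.csrs)  # {'tool-ores-inspect': 'fresh request'}
```

Computing the name once inside generate_csr means the create and delete paths cannot drift apart the way they did here.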