Seen today during the train presync for 1.42.0-wmf.21 (executed manually during the morning UTC window).
Backscroll: P58470
Unlike T359114, the deployment did not get as far as parsoid this time; the timeouts already occurred for the K8s testservers:
STDERR:
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /etc/kubernetes/mw-debug-deploy-eqiad.config
Error: UPGRADE FAILED: release pinkunicorn failed, and has been rolled back due to atomic being set: timed out waiting for the condition
COMBINED OUTPUT:
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /etc/kubernetes/mw-debug-deploy-eqiad.config
Error: UPGRADE FAILED: release pinkunicorn failed, and has been rolled back due to atomic being set: timed out waiting for the condition
11:30:12 Finished Running helmfile -e eqiad --selector name=pinkunicorn apply in /srv/deployment-charts/helmfile.d/services/mw-debug (duration: 10m 12s)
11:30:12 K8s deployment to stage testservers failed: K8s deployment had the following errors:
codfw:
Deployment of mw-misc-main failed: Command '['helmfile', '-e', 'codfw', '--selector', 'name=main', 'apply']' returned non-zero exit status 1.
Deployment of mw-debug-pinkunicorn failed: Command '['helmfile', '-e', 'codfw', '--selector', 'name=pinkunicorn', 'apply']' returned non-zero exit status 1.
eqiad:
Deployment of mw-misc-main failed: Command '['helmfile', '-e', 'eqiad', '--selector', 'name=main', 'apply']' returned non-zero exit status 1.
Deployment of mw-debug-pinkunicorn failed: Command '['helmfile', '-e', 'eqiad', '--selector', 'name=pinkunicorn', 'apply']' returned non-zero exit status 1.
11:30:12 Rolling back to prior state...
Also, the Kubernetes resources dashboard shows no spike in resource requests during that period (a spike would have been surprising for the testservers anyway): https://grafana-rw.wikimedia.org/d/pz5A-vASz/kubernetes-resources?orgId=1&var-ds=thanos&var-site=codfw&var-prometheus=k8s&from=now-24h&to=now

