Following the rolling restart of the #cloud-vps infrastructure on 2018-06-06, a large number of Kubernetes-powered webservices in #Toolforge remained unavailable. Spot checking revealed that many pods (the unit of work for running a webservice Docker container on Kubernetes) were in the `CrashLoopBackOff` state. This means that the pod had started, died, and been restarted several times. See initial list at: {P7220}
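For anyone repeating the spot check, here is a rough sketch of how the looping pods could be listed from `kubectl get pods --all-namespaces -o json`. The helper names (`crashlooping_pods`, `fetch_pod_list`) are made up for illustration, and this is not necessarily how the list in the paste was produced. It relies on the standard pod status layout, where a waiting container reports its reason under `status.containerStatuses[].state.waiting.reason`.

```python
import json
import subprocess


def crashlooping_pods(pod_list):
    """Return (namespace, name, restart_count) for pods with a container in CrashLoopBackOff.

    `pod_list` is the parsed JSON from `kubectl get pods -o json`.
    """
    found = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        for status in pod.get("status", {}).get("containerStatuses", []):
            # `waiting` is only present while the container is not running.
            waiting = status.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                found.append(
                    (meta.get("namespace"), meta.get("name"), status.get("restartCount", 0))
                )
                break  # one looping container is enough to flag the pod
    return found


def fetch_pod_list():
    """Hypothetical helper: ask kubectl for every pod in every namespace."""
    out = subprocess.check_output(
        ["kubectl", "get", "pods", "--all-namespaces", "-o", "json"]
    )
    return json.loads(out)


if __name__ == "__main__":
    for namespace, name, restarts in crashlooping_pods(fetch_pod_list()):
        print(f"{namespace}/{name}: {restarts} restarts")
```

Running it requires `kubectl` on the PATH with credentials that can list pods across namespaces.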
Further spot checking found that a large number of these looping pods were failing because the `/etc/wmcs-project` file was not mounted into the Docker container. The `webservice-runner` command checks this file to determine which project it is running in (tools vs tools-beta). When the file is not found, the `webservice-runner` script dies, which in turn kills the Docker container.
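To illustrate the failure mode, here is a minimal sketch of a startup check of this kind. It is not the actual `webservice-runner` code: the function name `read_project` and the error message are assumptions, and only the `/etc/wmcs-project` path comes from the description above.

```python
import sys

# Path named in this task; everything else in this sketch is hypothetical.
PROJECT_FILE = "/etc/wmcs-project"


def read_project(path=PROJECT_FILE):
    """Return the project name stored in `path`, or None if the file is missing."""
    try:
        with open(path) as f:
            return f.read().strip()
    except FileNotFoundError:
        return None


def main():
    project = read_project()
    if project is None:
        # A non-zero exit here ends the container's main process, so Kubernetes
        # restarts the pod and it eventually lands in CrashLoopBackOff.
        sys.exit(f"{PROJECT_FILE} not found; is the mount missing from the pod?")
    print(f"running in project: {project}")


if __name__ == "__main__":
    main()
```

Because the check runs at startup, every restart of an affected pod hits the same missing file, which is consistent with the repeated restarts seen above.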