Show more useful information when mwscript-k8s fails to launch
Closed, Resolved (Public)

Description

I was confused by the lack of a wiki argument in T368966#9947478 (it turns out there’s a hard-coded list of wiki-less scripts, FWIW), so I ran:

lucaswerkmeister-wmde@deploy1002 ~ $ mwscript-k8s --attach shell.php
⏳ Starting shell.php on Kubernetes...
skipping missing values file matching "/etc/helmfile-defaults/private/main_services/mw-script/eqiad.yaml"
skipping missing values file matching "values-eqiad.yaml"
skipping missing values file matching "/etc/helmfile-defaults/private/main_services/mw-script/eqiad.yaml"
skipping missing values file matching "values-eqiad.yaml"
Release "r72z2aop" does not exist. Installing it now.
NAME: r72z2aop
LAST DEPLOYED: Wed Jul  3 08:40:17 2024
NAMESPACE: mw-script
STATUS: deployed
REVISION: 1
NOTES:


r72z2aop	mw-script	1       	2024-07-03 08:40:17.331229287 +0000 UTC	deployed	mediawiki-0.6.35	           

⏳ Waiting for the container to start...
🚀 Job is running.
📜 Attaching to stdin/stdout:
error: unable to upgrade connection: container mediawiki-r72z2aop-app not found in pod mw-script.eqiad.r72z2aop-s4p8r_mw-script
☠️ Command failed with status 1: ['/usr/bin/kubectl', 'attach', '--quiet', 'job/mw-script.eqiad.r72z2aop', '--container', 'mediawiki-r72z2aop-app', '-it']

The error message was not very helpful; I had to figure out for myself how to find out what happened to the script and see its output:

lucaswerkmeister-wmde@deploy1002 ~ $ kube_env mw-script eqiad
lucaswerkmeister-wmde@deploy1002 ~ $ kubectl get pods
NAME                             READY   STATUS   RESTARTS   AGE
mw-script.eqiad.r72z2aop-s4p8r   0/4     Error    0          42s
lucaswerkmeister-wmde@deploy1002 ~ $ kubectl logs mw-script.eqiad.r72z2aop-s4p8r
error: a container name must be specified for pod mw-script.eqiad.r72z2aop-s4p8r, choose one of: [mediawiki-r72z2aop-app mediawiki-r72z2aop-mcrouter mediawiki-r72z2aop-tls-proxy mediawiki-r72z2aop-rsyslog]
lucaswerkmeister-wmde@deploy1002 ~ $ kubectl logs mw-script.eqiad.r72z2aop-s4p8r mediawiki-r72z2aop-app
Usage: mwscript scriptName.php --wiki=dbname
$ # ^ the actual error I was interested in

It would be nice if it printed a suitable kubectl logs command directly, similar to what it does for non-attached runs (IIRC).
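
For illustration, the manual steps above could be reduced to something like the following. This is only a sketch, not how mwscript-k8s does it: `app_container` and `logs_command` are hypothetical helpers, and the rule that the script runs in the container whose name ends in `-app` is an assumption read off the kubectl output above.

```python
import shlex


def app_container(names):
    """Pick the MediaWiki application container from a pod's container names."""
    # Assumption: the script's container is the one whose name ends in "-app",
    # as in the transcript above; the sidecars (mcrouter, tls-proxy, rsyslog)
    # carry other suffixes.
    matches = [name for name in names if name.endswith("-app")]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one -app container, got {matches}")
    return matches[0]


def logs_command(pod, names):
    """Return a copy-pasteable `kubectl logs` command for the app container."""
    return shlex.join(["kubectl", "logs", pod, app_container(names)])
```

Given the pod and container names from the error message above, `logs_command` yields the same `kubectl logs` invocation that I ended up typing by hand.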

Event Timeline

Change #1076893 had a related patch set uploaded (by RLazarus; author: RLazarus):

[operations/puppet@production] deployment_server: Print logs command when mwscript-k8s --attach fails

https://gerrit.wikimedia.org/r/1076893

Change #1076893 merged by RLazarus:

[operations/puppet@production] deployment_server: Print logs command when mwscript-k8s --attach fails

https://gerrit.wikimedia.org/r/1076893

rzl@deploy1003:~$ mwscript-k8s --attach -- shell.php
⏳ Starting shell.php on Kubernetes as job mw-script.codfw.9m47rjcq ...
⏳ Waiting for the container to start...
🚀 Job is running.
ℹ️ Expecting a prompt but don't see it? Due to a race condition, the beginning of the output might be missing. Try pressing enter.
📜 Attached to stdin/stdout:
error: unable to upgrade connection: container mediawiki-9m47rjcq-app not found in pod mw-script.codfw.9m47rjcq-fw2t7_mw-script
☠️ Command failed with status 1: /usr/bin/kubectl attach --quiet job/mw-script.codfw.9m47rjcq --container mediawiki-9m47rjcq-app -it
For logs (may not work) run:
K8S_CLUSTER=codfw KUBECONFIG=/etc/kubernetes/mw-script-deploy-codfw.config kubectl logs -f job/mw-script.codfw.9m47rjcq mediawiki-9m47rjcq-app

rzl@deploy1003:~$ K8S_CLUSTER=codfw KUBECONFIG=/etc/kubernetes/mw-script-deploy-codfw.config kubectl logs -f job/mw-script.codfw.9m47rjcq mediawiki-9m47rjcq-app
Usage: mwscript scriptName.php --wiki=dbname

Good suggestion, thanks!
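
For readers who want to reproduce the two new lines of output, here is a minimal sketch of how they could be built. It is not the code from the merged change: `failure_message` and `logs_hint` are hypothetical names, and the kubeconfig path pattern is an assumption inferred from the single codfw example above.

```python
import shlex


def failure_message(cmd, status):
    """Format a failed command as a copy-pasteable shell line, not a Python list."""
    return f"Command failed with status {status}: {shlex.join(cmd)}"


def logs_hint(cluster, job, container):
    """Build a `kubectl logs` command line for a mw-script job's container."""
    # Assumption: the kubeconfig lives at this path for every cluster; only
    # the codfw instance of the pattern appears in the output above.
    kubeconfig = f"/etc/kubernetes/mw-script-deploy-{cluster}.config"
    env = f"K8S_CLUSTER={shlex.quote(cluster)} KUBECONFIG={shlex.quote(kubeconfig)}"
    cmd = ["kubectl", "logs", "-f", f"job/{job}", container]
    return f"{env} {shlex.join(cmd)}"
```

Using `shlex.join` keeps the printed command safe to paste into a shell even if an argument ever contains spaces or quotes.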