When trying to restart StewardBot's k8s pod, I received the following internal server error:
tools.stewardbots@tools-sgebastion-10:~/stewardbots/StewardBot$ ./manage.sh restart
Restarting StewardBot pod...
ERROR: An internal error occured while executing this command.
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/toolforge_weld/api_client.py", line 71, in _make_request
    response.raise_for_status()
  File "/usr/lib/python3/dist-packages/requests/models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: INTERNAL SERVER ERROR for url: https://api.svc.tools.eqiad1.wikimedia.cloud:30003/jobs/api/v1/restart/stewardbot

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/tjf_cli/cli.py", line 712, in main
    run_subcommand(args=args, api=api)
  File "/usr/lib/python3/dist-packages/tjf_cli/cli.py", line 659, in run_subcommand
    op_restart(api, args.name)
  File "/usr/lib/python3/dist-packages/tjf_cli/cli.py", line 586, in op_restart
    api.post(f"/restart/{name}")
  File "/usr/lib/python3/dist-packages/toolforge_weld/api_client.py", line 95, in post
    return self._make_request("POST", url, **kwargs).json()
  File "/usr/lib/python3/dist-packages/toolforge_weld/api_client.py", line 75, in _make_request
    raise self.exception_handler(e)
tjf_cli.api.TjfCliHttpError: Internal Server Error
ERROR: Please report this issue to the Toolforge admins: https://w.wiki/6Zuu
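The two-part traceback is just Python's implicit exception chaining: the weld client catches the requests HTTPError and raises its own TjfCliHttpError inside the except block, so Python prints "During handling of the above exception, another exception occurred". A minimal sketch of that pattern (stand-in classes, not the actual toolforge_weld code):

```python
# Sketch of the exception chaining seen in the traceback above.
# HTTPError and TjfCliHttpError here are stand-ins, not the real
# requests / tjf_cli classes.
class HTTPError(Exception):
    """Stand-in for requests.exceptions.HTTPError."""

class TjfCliHttpError(Exception):
    """Stand-in for the CLI's wrapper exception."""

def make_request():
    try:
        raise HTTPError("500 Server Error: INTERNAL SERVER ERROR")
    except HTTPError:
        # Raised inside the except block without "from", so Python
        # reports implicit chaining ("During handling of ...").
        raise TjfCliHttpError("Internal Server Error")

caught = None
try:
    make_request()
except TjfCliHttpError as err:
    caught = err

# The original HTTP error is still attached as __context__.
print(type(caught.__context__).__name__)  # prints "HTTPError"
```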
tools.stewardbots@tools-sgebastion-10:~/stewardbots/StewardBot$ cat manage.sh
#!/usr/bin/env bash
# Management script for StewardBot kubernetes processes
# Based on https://github.com/wikimedia/stashbot/blob/master/bin/stashbot.sh
set -e
TOOL_DIR=/data/project/stewardbots/stewardbots/StewardBot
JOB_NAME=stewardbot
JOB_FILE="${TOOL_DIR}/jobs.yaml"
LOG_FILE="/data/project/stewardbots/logs/stewardbot.log"
VENV=/data/project/stewardbots/venv-k8s-py39
case "$1" in
    start)
        echo "Starting StewardBot k8s deployment..."
        toolforge-jobs load "${JOB_FILE}" --job "${JOB_NAME}"
        ;;
    run)
        date +%Y-%m-%dT%H:%M:%S
        echo "Starting StewardBot..."
        source "${VENV}/bin/activate"
        cd "${TOOL_DIR}"
        exec python StewardBot.py
        ;;
    stop)
        echo "Stopping StewardBot k8s deployment..."
        toolforge-jobs delete "${JOB_NAME}"
        # FIXME: wait for the pods to stop
        ;;
    restart)
        echo "Restarting StewardBot pod..."
        toolforge-jobs restart "${JOB_NAME}"
        ;;
    status)
        toolforge-jobs show "${JOB_NAME}"
        ;;
    tail)
        exec tail -f "${LOG_FILE}"
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|status|tail}"
        exit 1
        ;;
esac
exit 0
# vim:ft=sh:sw=4:ts=4:sts=4:et:
tools.stewardbots@tools-sgebastion-10:~/stewardbots/StewardBot$

Reporting this to the Toolforge admins, as the command told me to. FWIW, ./manage.sh stop && ./manage.sh start seems to have worked properly.
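Regarding the FIXME in the stop branch, a polling loop along these lines might do: this is a hypothetical sketch that assumes `toolforge-jobs show` exits non-zero once the job has been deleted, which I have not verified.

```shell
# Hypothetical wait_for_stop helper for manage.sh. Assumes that
# `toolforge-jobs show` returns a non-zero exit status once the job
# no longer exists -- verify that before relying on it.
wait_for_stop() {
    local tries=0
    while toolforge-jobs show "${JOB_NAME}" >/dev/null 2>&1; do
        tries=$((tries + 1))
        if [ "${tries}" -ge 30 ]; then
            echo "Timed out waiting for ${JOB_NAME} to stop" >&2
            return 1
        fi
        sleep 2
    done
    return 0
}
```

The stop branch could then call wait_for_stop right after toolforge-jobs delete, so that a subsequent start does not race against pods that are still terminating.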