Many grid engine jobs are executions of pywikibot scripts. An unknown but presumably large number of these are using some variation of the [[https://wikitech.wikimedia.org/wiki/Help:Toolforge/Pywikibot#Using_the_shared_Pywikibot_files_(recommended_setup)|recommended]] pywikibot process.
The WMCS team would like to find ways to reduce folks' dependence on the grid engine. Making a simple way for folks to run pywikibot on the Kubernetes cluster seems like a good place to start on this larger goal.
In response to a random musing on IRC, @JJMC89 reported that pywikibot makes a stable release ~4 times per year. This seems like an easy pace to keep up with even without a CI/CD system for updating our Docker images.
I'm thinking a first attempt at this could be a Docker image built on whatever our latest py3 base container is, with pywikibot, its direct dependencies, and most or all of the currently globally installed python3 packages included. This would be similar to the [[https://github.com/wikimedia/pywikibot/blob/master/Dockerfile|upstream Dockerfile]] with a few other bells and whistles.
This image should also favor convention over configuration wherever possible, with the ultimate goal being that running a pywikibot workload on the Kubernetes cluster looks something like this (fake commands ahead!):
```lines=10
$ become my-cool-pwb-tool
$ pwb-k8s init
Checking for local scripts directory...
Local scripts directory not found. Creating $HOME/pwb
Checking for user files...
User files not found. Running generate_user_files.py
[whatever generate_user_files.py does happens here]
$ pwb-k8s run version.py
Pywikibot: [https] r-pywikibot-core.git (df69134, g1, 2020/03/30, 11:17:54, OUTDATED)
Release version: 3.1.dev0
requests version: 2.12.4
cacerts: /etc/ssl/certs/ca-certificates.crt
certificate test: ok
Python: 3.5.3 (default, Sep 27 2018, 17:25:39)
[GCC 6.3.0 20170516]
$ vim pwb/my_cool_script.py
$ pwb-k8s cron --hour 3 --minute 17 my_cool_script.py
CronJob created
$ kubectl get cronjob
NAME                SCHEDULE     SUSPEND   ACTIVE   LAST SCHEDULE   AGE
my_cool_script.py   17 3 * * *   False     0        0m              0m
$ kubectl describe cronjob my_cool_script.py
Name:                          my_cool_script.py
Namespace:                     tool-my-cool-pwb-tool
Labels:                        name=my-cool-pwb-tool.my_cool_script.py
                               toolforge=tool
Annotations:                   <none>
Schedule:                      17 3 * * *
Concurrency Policy:            Allow
Suspend:                       False
Successful Job History Limit:  3
Failed Job History Limit:      1
Starting Deadline Seconds:     <unset>
Selector:                      <unset>
Parallelism:                   <unset>
Completions:                   <unset>
Pod Template:
  Labels:  toolforge=tool
  Containers:
   bot:
    Image:      docker-registry.tools.wmflabs.org/toolforge-python37-sssd-pwb:latest
    Port:       <none>
    Host Port:  <none>
    Args:
      python3
      /data/project/my-cool-pwb-tool/pwb/my_cool_script.py
    Environment:
      PYWIKIBOT_DIR:  /data/project/my-cool-pwb-tool/pwb
      HOME:           /data/project/my-cool-pwb-tool
    Mounts:           <none>
  Volumes:            <none>
Last Schedule Time:   <none>
Active Jobs:          <none>
Events:               <none>
```
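
To make the `cron` step of the mock-up a bit more concrete, here is a rough sketch of how a wrapper might turn `--hour 3 --minute 17 my_cool_script.py` into a CronJob manifest. Everything in it is an assumption for illustration: `pwb-k8s` does not exist, `cron_job_manifest` is a hypothetical helper, the `batch/v1` API version may not match what the cluster serves, and the image name, paths, labels, and environment variables are copied from the fake `kubectl describe` output above. One deliberate difference from the mock-up: Kubernetes object names cannot contain underscores, so the sketch normalizes the script name before using it as the CronJob name.

```python
import json
import re


def cron_job_manifest(tool, script, hour, minute):
    """Build a CronJob manifest (as a plain dict) for one pywikibot script.

    Hypothetical helper: names, paths, and image are taken from the
    mock-up in this task, not from any existing tool.
    """
    home = "/data/project/{}".format(tool)
    pwb_dir = home + "/pwb"
    # Assumption: normalize the script name, because Kubernetes object
    # names may only contain lowercase alphanumerics, '-' and '.'.
    name = re.sub(r"[^a-z0-9.-]+", "-", script.lower()).strip("-.")
    container = {
        "name": "bot",
        "image": "docker-registry.tools.wmflabs.org/toolforge-python37-sssd-pwb:latest",
        "args": ["python3", pwb_dir + "/" + script],
        "env": [
            {"name": "PYWIKIBOT_DIR", "value": pwb_dir},
            {"name": "HOME", "value": home},
        ],
    }
    return {
        "apiVersion": "batch/v1",  # assumption; depends on cluster version
        "kind": "CronJob",
        "metadata": {
            "name": name,
            "namespace": "tool-" + tool,
            "labels": {
                "name": "{}.{}".format(tool, script),
                "toolforge": "tool",
            },
        },
        "spec": {
            # Cron field order is: minute hour day-of-month month day-of-week
            "schedule": "{} {} * * *".format(minute, hour),
            "concurrencyPolicy": "Allow",
            "successfulJobsHistoryLimit": 3,
            "failedJobsHistoryLimit": 1,
            "jobTemplate": {
                "spec": {
                    "template": {
                        "metadata": {"labels": {"toolforge": "tool"}},
                        "spec": {
                            "restartPolicy": "Never",
                            "containers": [container],
                        },
                    }
                }
            },
        },
    }


if __name__ == "__main__":
    manifest = cron_job_manifest("my-cool-pwb-tool", "my_cool_script.py", 3, 17)
    print(json.dumps(manifest, indent=2))
```

The JSON this prints could in principle be piped to `kubectl apply -f -`, which also accepts JSON. A real wrapper would still need to decide how to handle scripts whose normalized names collide.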