Migrate wmcz from Toolforge GridEngine to Toolforge Kubernetes
Closed, ResolvedPublic

Description

Kindly migrate your tool (https://grid-deprecation.toolforge.org/t/wmcz) from Toolforge GridEngine to Toolforge Kubernetes.

Toolforge GridEngine is getting deprecated.
See: https://techblog.wikimedia.org/2022/03/14/toolforge-and-grid-engine/

Please note that a volunteer may perform this migration if this has not been done after some time.
If you have already migrated this tool, kindly mark this as resolved.

If you would rather shut down this tool, kindly do so and mark this as resolved.

Useful Resources:
Migrating Jobs from GridEngine to Kubernetes
https://wikitech.wikimedia.org/wiki/Help:Toolforge/Jobs_framework#Grid_Engine_migration
Migrating Web Services from GridEngine to Kubernetes
https://wikitech.wikimedia.org/wiki/News/Toolforge_Stretch_deprecation#Move_a_grid_engine_webservice
Python
https://wikitech.wikimedia.org/wiki/News/Toolforge_Stretch_deprecation#Rebuild_virtualenv_for_python_users

Event Timeline

Restricted Application added a subscriber: Aklapper.

My apologies if this ticket comes as a surprise to you. To ensure WMCS can provide a stable, secure and supported platform, it’s important that we migrate away from GridEngine. While WMCS intends to shut down GridEngine, as outlined in the blog post https://techblog.wikimedia.org/2022/03/14/toolforge-and-grid-engine/, a shutdown date has not yet been set. The goal is to migrate as many tools as possible onto Kubernetes and to make the transition as smooth as possible for everyone. Once the majority of tools have migrated, a discussion about a shutdown date will be more appropriate. See T314664: [infra] Decommission the Grid Engine infrastructure.

As noted in https://techblog.wikimedia.org/2022/03/16/toolforge-gridengine-debian-10-buster-migration/, some use cases are already supported by Kubernetes and should be migrated. If your tool can migrate, please plan a migration. Reach out if you need help or find you are blocked by missing features. Most of all, WMCS is here to support you.

However, it’s possible your tool needs a mixed runtime environment or some other features that aren't yet present in https://techblog.wikimedia.org/2022/03/18/toolforge-jobs-framework/. We’d love to hear of this or any other blocking issues so we can work with you once a migration path is ready. Thanks for your hard work as volunteers and help in this migration!

Hello @komla, @nskaggs and the rest of WMCS team,

I spent a while playing with the new jobs framework and tools.wmcz and I ran into an issue during the migration. One of the cron jobs included in this tool is a shell script (~tools.wmcz/analytics/orchestrator/daily.sh), which (among other things) runs the following commands:

wget -O /tmp/$$/$dataset.sql.gz https://files.wikimedia.cz/datasets/$dataset.sql.gz
zcat /tmp/$$/$dataset.sql.gz | mysql s53887__wmcz_${dataset}_p

Since the script also runs a Python script, I first tried tf-python39, where there is no wget. I was unable to find wget installed in the other containers either. By shelling into the jobs container, I found that mysql is missing as well.

How can I migrate similar simple shell scripts (in tools.wmcz and elsewhere), please? Thanks!

I checked briefly; the script code is as follows:

#!/bin/bash

scriptdir="`dirname \"$0\"`"
cd $scriptdir

bash benes-datasets/daily.sh   # the one that requires mysql
bash dashboard-data/daily.sh   # the one that requires python

Since each script has different dependencies, perhaps you could run them as separate jobs with different images, with something like:

daily at 12:00
tools.wmcz@tools-sgebastion-11:~$ toolforge-jobs run generate-dataset-daily --command analytics/orchestrator/benes-datasets/daily.sh --image bullseye --schedule "00 12 * * *"
daily at 12:30
tools.wmcz@tools-sgebastion-11:~$ toolforge-jobs run generate-dashboard-daily --command analytics/orchestrator/dashboard-data/daily.sh --image python3.9 --schedule "30 12 * * *"

Of course, there may be additional considerations, for example if you don't know how long the jobs will take, or if they depend on each other or are prone to race conditions (e.g., what happens if the first job takes very long and the second starts before the first finishes).
A more robust solution would be to craft a container image with all the dependencies your tool needs. The good news is that we're already working on this, see T267374: [tbs.beta] Create a toolforge build service beta release (it is just not available yet).
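If the two jobs do turn out to depend on each other, one possible way to guard against the race described above is a dated marker file. This is only a sketch: the function names and the MARKER_DIR location are hypothetical, and it assumes both jobs see the same tool home directory.

```shell
#!/bin/bash
# Sketch only: let a later job confirm that an earlier job finished today.
# MARKER_DIR, mark_done and finished_today are hypothetical names.
set -eu

MARKER_DIR="${MARKER_DIR:-${HOME:-.}/analytics/markers}"

# Call at the very end of the first job, after all its work succeeded.
mark_done() {
    mkdir -p "$MARKER_DIR"
    date +%Y-%m-%d > "$MARKER_DIR/$1.done"
}

# Call at the start of the second job. Returns non-zero unless the
# named job recorded a completion with today's date.
finished_today() {
    [ -f "$MARKER_DIR/$1.done" ] &&
        [ "$(cat "$MARKER_DIR/$1.done")" = "$(date +%Y-%m-%d)" ]
}

# Example use in the second job (commented out in this sketch):
# finished_today benes-datasets || { echo "datasets not ready" >&2; exit 1; }
```

The second job then fails fast instead of reading half-imported data; whether it should retry or simply wait for the next scheduled run is a separate decision.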

Hi @aborrero,

thanks for the answer! It's possible I'm missing something here, but I don't understand how the bullseye image would help in this case. As far as I can see, the bullseye image has neither mysql nor wget available (both of those utilities are needed by the script):

tools.wmcz@tools-sgebastion-10 ~
$ toolforge-jobs run t320178 --command 'sleep 3600' --image bullseye
tools.wmcz@tools-sgebastion-10 ~
$ kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE
t320178-tzscx           1/1     Running   0          22s
wmcz-74b55bff54-9pp4c   1/1     Running   1          127d
tools.wmcz@tools-sgebastion-10 ~
$ kubectl exec -it t320178-tzscx -- bash
tools.wmcz@t320178-tzscx:~$ wget
bash: wget: command not found
tools.wmcz@t320178-tzscx:~$ mysql
bash: mysql: command not found
tools.wmcz@t320178-tzscx:~$

Running the job using the first command you listed (without the --schedule parameter, to trigger a one-off job) also results in "wget: command not found" errors. Is there something I am missing that would let me run those jobs at least separately?

In any case, T267374: [tbs.beta] Create a toolforge build service beta release sounds promising, and I'm looking forward to its release!

Change 889769 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/docker-images/toollabs-images@master] bullseye-sssd/: add mysql client command line utility

https://gerrit.wikimedia.org/r/889769

thanks for the answer! It's possible I'm missing something here, but I don't understand how the bullseye image would help in this case. As far as I can see, the bullseye image has neither mysql nor wget available (both of those utilities are needed by the script):

You are right.

For wget, you can use curl instead, which is included in the container image.

For mysql, you can either wait and see if https://gerrit.wikimedia.org/r/889769 gets merged, or download some mysql binary by hand to your tool directory, something like curl https://cdn.mysql.com//Downloads/MySQL-Shell/mysql-shell-8.0.32-linux-glibc2.12-x86-64bit.tar.gz
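As a rough illustration of the wget-to-curl swap for the download step quoted earlier in this task, something like the following might work. The fetch_dataset helper is a hypothetical name introduced here; the URL pattern is the one from the original script, and -f, -sS, -L and -o are standard curl options.

```shell
#!/bin/bash
# Sketch only: download one dataset with curl instead of wget.
# fetch_dataset is a hypothetical helper, not part of the original script.
set -eu

fetch_dataset() {
    local dataset="$1" workdir="$2"
    mkdir -p "$workdir"
    # "wget -O FILE URL" roughly corresponds to "curl -o FILE URL".
    # -f: fail on HTTP errors, -sS: quiet but still print errors,
    # -L: follow redirects.
    curl -fsSL -o "$workdir/$dataset.sql.gz" \
        "https://files.wikimedia.cz/datasets/$dataset.sql.gz"
}

# Example call (commented out so the sketch has no side effects):
# fetch_dataset "$dataset" "/tmp/$$"
```

Unlike wget, curl does not follow redirects or treat HTTP error responses as failures by default, which is why -L and -f are included here.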

Change 889769 abandoned by Arturo Borrero Gonzalez:

[operations/docker-images/toollabs-images@master] bullseye-sssd/: add mysql client command line utility

Reason:

merging https://gerrit.wikimedia.org/r/c/operations/docker-images/toollabs-images/+/842993 instead

https://gerrit.wikimedia.org/r/889769

Mentioned in SAL (#wikimedia-cloud) [2023-02-17T10:25:01Z] <arturo> build and push mariadb-sssd/base docker image for Toolforge (T320178, T254636)

We now have a new image available which contains both curl and some mysql client tools, see:

tools.arturo-test-tool@tools-sgebastion-11:~$ toolforge-jobs images
Short name    Container image URL
------------  ----------------------------------------------------------------------
bullseye      docker-registry.tools.wmflabs.org/toolforge-bullseye-sssd:latest
golang1.11    docker-registry.tools.wmflabs.org/toolforge-golang111-sssd-base:latest
jdk17         docker-registry.tools.wmflabs.org/toolforge-jdk17-sssd-base:latest
mariadb       docker-registry.tools.wmflabs.org/toolforge-mariadb-sssd-base:latest
mono6.8       docker-registry.tools.wmflabs.org/toolforge-mono68-sssd-base:latest
node16        docker-registry.tools.wmflabs.org/toolforge-node16-sssd-base:latest
perl5.32      docker-registry.tools.wmflabs.org/toolforge-perl532-sssd-base:latest
php7.4        docker-registry.tools.wmflabs.org/toolforge-php74-sssd-base:latest
python3.9     docker-registry.tools.wmflabs.org/toolforge-python39-sssd-base:latest
ruby2.1       docker-registry.tools.wmflabs.org/toolforge-ruby21-sssd-base:latest
ruby2.7       docker-registry.tools.wmflabs.org/toolforge-ruby27-sssd-base:latest
tcl8.6        docker-registry.tools.wmflabs.org/toolforge-tcl86-sssd-base:latest
tools.arturo-test-tool@tools-sgebastion-11:~$ toolforge-jobs run mariadb --command 'sleep 3600' --image mariadb
tools.arturo-test-tool@tools-sgebastion-11:~$ kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE
mariadb-49nnz           1/1     Running   0          8s
test-6d76568d94-mbzrh   1/1     Running   1          23d
test2-bcb6c74d9-6ncd7   1/1     Running   0          3d16h
tools.arturo-test-tool@tools-sgebastion-11:~$ kubectl exec -it mariadb-49nnz -- bash
tools.arturo-test-tool@mariadb-49nnz:~$ curl -h
Usage: curl [options...] <url>
[..]
tools.arturo-test-tool@mariadb-49nnz:~$ sql -h
usage: sql [-h] [-v] [-N] [--cluster {analytics,web}] DATABASE ...
[..]
tools.arturo-test-tool@mariadb-49nnz:~$ mysql --help
mysql  Ver 15.1 Distrib 10.5.18-MariaDB, for debian-linux-gnu (x86_64) using  EditLine wrapper
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Usage: mysql [OPTIONS] [database]
[..]

Hope this helps.
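The manual checks done in this task (running wget, mysql and curl by hand inside a pod) could also be scripted, so that a job fails early with a readable message when an image lacks a tool. A minimal sketch, where check_tools is a hypothetical helper name and the tool list is only an example:

```shell
#!/bin/bash
# Sketch only: report which required command-line tools are missing
# from the current container image. check_tools is a hypothetical name.
set -eu

check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        # "command -v" succeeds only if the shell can resolve the name.
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call at the top of a job script (commented out in this sketch):
# check_tools curl zcat mysql || exit 1
```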

Mentioned in SAL (#wikimedia-cloud) [2023-08-26T18:44:08Z] <urbanecm> remove user crontab, preparation for moving to k8s (T320178)

Should be done:

tools.wmcz@tools-sgebastion-10 ~ 
$ toolforge-jobs list
Job name:                       Job type:              Status:
------------------------------  ---------------------  --------------------------
datasets-dashboard-data         schedule: 0 9 * * *    Waiting for scheduled time
datasets-import-benes-datasets  schedule: 0 9 * * *    Waiting for scheduled time
generate-news-last-month        schedule: 0 10 15 * *  Waiting for scheduled time
generate-news-mainpage          schedule: 0 11 15 * *  Waiting for scheduled time
tools.wmcz@tools-sgebastion-10 ~ 
$