Migrate srwiki from Toolforge GridEngine to Toolforge Kubernetes
Closed, Resolved
Public

Description

Kindly migrate your tool (https://grid-deprecation.toolforge.org/t/srwiki) from Toolforge GridEngine to Toolforge Kubernetes.

Toolforge GridEngine is getting deprecated.
See: https://techblog.wikimedia.org/2022/03/14/toolforge-and-grid-engine/

Please note that a volunteer may perform this migration if this has not been done after some time.
If you have already migrated this tool, kindly mark this as resolved.

If you would rather shut down this tool, kindly do so and mark this as resolved.

Useful Resources:
Migrating Jobs from GridEngine to Kubernetes
https://wikitech.wikimedia.org/wiki/Help:Toolforge/Jobs_framework#Grid_Engine_migration
Migrating Web Services from GridEngine to Kubernetes
https://wikitech.wikimedia.org/wiki/News/Toolforge_Stretch_deprecation#Move_a_grid_engine_webservice
Python
https://wikitech.wikimedia.org/wiki/News/Toolforge_Stretch_deprecation#Rebuild_virtualenv_for_python_users

Event Timeline

My apologies if this ticket comes as a surprise to you. In order to ensure WMCS can provide a stable, secure and supported platform, it’s important we migrate away from GridEngine. I want to assure you that while it is WMCS’s intention to shut down GridEngine as outlined in the blog post https://techblog.wikimedia.org/2022/03/14/toolforge-and-grid-engine/, a shutdown date for GridEngine has not yet been set. The goal of the migration is to migrate as many tools as possible onto Kubernetes and ensure as smooth a transition as possible for everyone. Once the majority of tools have migrated, a discussion of a shutdown date will be more appropriate. See T314664: [infra] Decommission the Grid Engine infrastructure.

As noted in https://techblog.wikimedia.org/2022/03/16/toolforge-gridengine-debian-10-buster-migration/ some use cases are already supported by Kubernetes and should be migrated. If your tool can migrate, please do plan a migration. Reach out if you need help or find you are blocked by missing features. Most of all, WMCS is here to support you.

However, it’s possible your tool needs a mixed runtime environment or some other features that aren't yet present in https://techblog.wikimedia.org/2022/03/18/toolforge-jobs-framework/. We’d love to hear of this or any other blocking issues so we can work with you once a migration path is ready. Thanks for your hard work as volunteers and help in this migration!

I think I tried to do this on at least two occasions. I stopped the webservice and then started it again with k8s, but it seems like that didn't do the trick. Help would be appreciated.

Your webservice is running on Kubernetes, and has been for at least the last 90 days:

$ webservice status
DEPRECATED: 'python2' type is deprecated.
  See https://wikitech.wikimedia.org/wiki/Help:Toolforge/Kubernetes
  for currently supported types.
Your webservice of type python2 is running on backend kubernetes
$ kubectl get pods
NAME                      READY   STATUS    RESTARTS   AGE
srwiki-7fdf7c55f5-f2n64   1/1     Running   0          90d

The grid usage report for your tool shows multiple instances of a job named "sh" running in the last week. You have two different crontab entries that both end up creating jobs named "sh":

$ crontab -l
## For example, you can run a backup of all your user accounts
## at 5 a.m every week with:
## 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
##
## m h  dom mon dow   command
#PATH=/usr/local/bin:/usr/bin:/bin
1 0 * * * jsub sh /data/project/srwiki/brcl.sh &> /dev/null
8 6 * * 0,4 jsub sh /data/project/srwiki/reports.sh

https://wikitech.wikimedia.org/wiki/Help:Toolforge/Jobs_framework is the Kubernetes-powered replacement for cron in Toolforge. Your brcl.sh job uses curl and sed. This will not work with the 'tf-bullseye-std' image (I am going to file a ticket about that), but it should work with a more full-featured bullseye image like 'tf-python39'. Your reports.sh job runs a number of pywikibot scripts. These could also be run using the 'tf-python39' image along with a venv to manage the small number of external library dependencies for pywikibot. Or you could wait to see what we come up with for T249787: Create Docker image for Toolforge that is purpose built to run pywikibot scripts.

Alright, that means that I could move these cron jobs to Kubernetes once those issues are fixed?

Correct. You can actually decide for yourself if you want to wait on T249787 for your pywikibot usage or not. Because you are already maintaining your own clone of Pywikibot in $HOME/pywikibot-core I think you could move on without waiting for that task which is really more intended for folks who want a replacement for the current /shared/pywikibot/ git clone used by some grid engine tools.

You should also be able to move the brcl.sh job right now without too much difficulty. You only need to use an image such as tf-python39 which includes both curl and sed. I was able to run that job manually via toolforge-jobs run brcl --command ./brcl.sh --image tf-python39 to confirm it works with that image.

https://wikitech.wikimedia.org/wiki/Help:Toolforge/Jobs_framework#Loading_jobs_from_a_YAML_file may be helpful for your work towards moving these jobs.
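
As a rough illustration of the move described above, the two crontab entries could be translated into the list-of-jobs shape that a jobs YAML file uses. This is only a sketch: the key names (name, command, image, schedule) reflect my understanding of the Jobs framework documentation linked above and should be checked against it, the helper names cron_line_to_job and to_yaml are made up for this example, and tf-python39 is simply the image suggested earlier in this thread.

```python
# Crontab lines copied from the `crontab -l` output earlier in the thread.
CRONTAB = """\
1 0 * * * jsub sh /data/project/srwiki/brcl.sh &> /dev/null
8 6 * * 0,4 jsub sh /data/project/srwiki/reports.sh
"""

TOOL_HOME = "/data/project/srwiki"
IMAGE = "tf-python39"  # assumption: the image suggested in this thread


def cron_line_to_job(line):
    """Turn one grid-engine crontab line into a job description (hypothetical helper)."""
    fields = line.split()
    schedule = " ".join(fields[:5])  # the five cron time fields, unchanged
    # The script is the first argument ending in ".sh"; jsub, sh and the
    # output redirection are grid-specific and are dropped.
    script = next(f for f in fields[5:] if f.endswith(".sh"))
    name = script.rsplit("/", 1)[-1][:-3]  # "brcl.sh" -> "brcl"
    if script.startswith(TOOL_HOME + "/"):
        # Express the script relative to the tool's home directory.
        command = "./" + script[len(TOOL_HOME) + 1:]
    else:
        command = script
    return {"name": name, "command": command, "image": IMAGE, "schedule": schedule}


def to_yaml(jobs):
    """Render the jobs as YAML text by hand (avoids a PyYAML dependency)."""
    lines = []
    for job in jobs:
        lines.append("- name: {}".format(job["name"]))
        lines.append("  command: {}".format(job["command"]))
        lines.append("  image: {}".format(job["image"]))
        lines.append('  schedule: "{}"'.format(job["schedule"]))
    return "\n".join(lines) + "\n"


JOBS = [cron_line_to_job(line) for line in CRONTAB.splitlines() if line.strip()]

if __name__ == "__main__":
    print(to_yaml(JOBS), end="")
```

The generated text would still need a manual review before being loaded, since reports.sh also depends on a working Python environment for pywikibot.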

I did the switchover -- commented out the two crontab jobs and created new ones with the toolforge-jobs command. Hopefully this should do it.

Tonight I'll be able to check whether one of the jobs got executed or not. If so, I'll be free to close this. On Thursday morning the other job should run as well.

It didn't get executed for some reason. Or it did, but it encountered an error...

I get a "/bin/sh: 1: ./data/project/srwiki/brcl.sh: not found" error. Some help please

taavi subscribed.

The grid engine has been shut down, so I'm closing any remaining migration tasks as Declined. If you're still planning to migrate this tool, please re-open this task and add one or more active project tags to it. (If you need a project tag for your tool, those can be created via the Toolforge admin console.)

How are you running that script?

Never mind, I figured out that I have to run it with a relative path, like "./brcl.sh"
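
For anyone hitting the same "not found" error, here is a small sketch of why the leading dot matters. It assumes the job starts in the tool's home directory (/data/project/srwiki), which is what the working "./brcl.sh" form suggests; the resolve helper is only for illustration.

```python
import posixpath

TOOL_HOME = "/data/project/srwiki"  # assumed working directory of the job


def resolve(command_path, cwd=TOOL_HOME):
    """Show which file a command path points to when run from cwd."""
    # posixpath.join discards cwd when command_path is absolute,
    # which mirrors how the shell treats absolute paths.
    return posixpath.normpath(posixpath.join(cwd, command_path))


# "./" in front of an absolute path makes it relative to the working directory,
# so the shell looks for a "data/project/srwiki" tree inside the tool home.
broken = resolve("./data/project/srwiki/brcl.sh")
# Either a plain relative path or a plain absolute path points at the real script.
relative = resolve("./brcl.sh")
absolute = resolve("/data/project/srwiki/brcl.sh")

if __name__ == "__main__":
    print(broken)
    print(relative)
    print(absolute)
```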

I'm figuring out now how to deal with python, pip and similar.

I'm constantly getting

ImportError: 
Pywikibot is missing a MediaWiki markup parser which is necessary.
Please update the required module with either

    pip install "mwparserfromhell>=0.5.0"

or

    pip install "wikitextparser>=0.47.5"

even though I installed those two with venv's pip.

Hmm, did that, but it didn't work (even after restarting the whole bash session). Still get the same error.

I installed the pywikibot framework from scratch using pip in venv, and after setting it up properly (I can log in and run pwb scripts in CLI), running a job yields a ModuleNotFoundError: No module named 'pywikibot'

Which exact commands are you using, both for creating the venv and for running a script?

What are you doing differently when you use the cli than what your job does? One thing that can cause confusion is that files like $HOME/.profile and $HOME/.bashrc will not be executed in the jobs service. If that was how you chose to activate your venv you will need to think of another way. I personally like using the full or relative path to the venv's python interpreter rather than $PATH magic in the shell.
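
To illustrate the "use the venv's interpreter by path" approach, here is a minimal sketch. It creates a throwaway venv in a temporary directory (not the real pyvenv directory), runs its bin/python directly without any activate step, and checks that the interpreter reports the venv as its prefix. The POSIX bin/python layout and the helper names run_in_venv and demo are assumptions of this example.

```python
import os
import subprocess
import tempfile
import venv


def run_in_venv(venv_dir, code):
    """Run a snippet with the venv's own interpreter; no 'activate' needed."""
    python = os.path.join(venv_dir, "bin", "python")  # POSIX venv layout
    result = subprocess.run(
        [python, "-c", code], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def demo():
    """Return True if calling the venv's python by path really uses the venv."""
    with tempfile.TemporaryDirectory() as tmp:
        venv_dir = os.path.join(tmp, "pyvenv")
        # with_pip=False keeps the demo fast and avoids needing ensurepip.
        venv.create(venv_dir, with_pip=False, symlinks=True)
        prefix = run_in_venv(venv_dir, "import sys; print(sys.prefix)")
        return os.path.realpath(prefix) == os.path.realpath(venv_dir)


if __name__ == "__main__":
    print("venv interpreter used:", demo())
```

In a job, the same idea would mean starting scripts with a path such as pyvenv/bin/python rather than relying on $PATH being set up by a login shell.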

At least the virtual environment in /data/project/srwiki/pyvenv has been created on the bastion for Python 3.7. Python venvs are tied to the underlying system Python version, so in order to use a venv in a Python 3.11 container (the newest available) you would need to create the venv inside that container (either via a one-off toolforge job or via toolforge webservice shell).

I don't use .bashrc or .profile. I followed the instructions at https://wikitech.wikimedia.org/wiki/Help:Toolforge/Python#For_Kubernetes_backend but in the end installed manually with pip some dependencies (requests, mysqlclient, wikitextparser, pywikibot). It is a bit strange that pwb is located in /pyvenv/lib/python3.7/site-packages/pywikibot instead of in the python3.11 folder.

The link you gave last appeared in https://wikitech.wikimedia.org/w/index.php?title=Help:Toolforge/Python&direction=prev&oldid=2099972 and was removed by edits on 2023-08-15T22:18:21. The fundamentals of using a venv tied to the particular runtime Python container version have remained the same, but especially for pywikibot there are new possibilities and updated tutorials that can be followed.

It is a bit strange that pwb is located in /pyvenv/lib/python3.7/site-packages/pywikibot instead of in the python3.11 folder.

This is what @taavi was pointing out:

At least the virtual environment in /data/project/srwiki/pyvenv has been created on the bastion for Python 3.7. Python venvs are tied to the underlying system Python version, so in order to use a venv in a Python 3.11 container (the newest available) you would need to create the venv inside that container (either via a one-off toolforge job or via toolforge webservice shell).

Python 3.7 being used is almost certainly the result of you initially running python -m venv pyvenv from a Toolforge bastion directly. You will need to instead first enter a python3.11 command shell via webservice python3.11 shell if you want to make a venv that will work with a python3.11 runtime.
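
One way to spot this kind of mismatch before a job fails is to compare the Python version recorded in the venv with the interpreter that will run the job. The sketch below assumes the venv was made with the standard venv module, which writes a "version = X.Y.Z" line into pyvenv.cfg; the function names are hypothetical.

```python
import sys
from pathlib import Path


def venv_python_version(venv_dir):
    """Return (major, minor) recorded in the venv's pyvenv.cfg, or None if absent."""
    cfg = Path(venv_dir) / "pyvenv.cfg"
    for line in cfg.read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "version":
            parts = value.strip().split(".")
            return int(parts[0]), int(parts[1])
    return None


def venv_matches_runtime(venv_dir, runtime=None):
    """True if the venv was created for the given (major, minor) runtime.

    Defaults to the interpreter running this check.
    """
    if runtime is None:
        runtime = sys.version_info[:2]
    return venv_python_version(venv_dir) == tuple(runtime)


if __name__ == "__main__":
    # Run this with the container's Python, e.g. from inside the webservice shell.
    target = sys.argv[1] if len(sys.argv) > 1 else "pyvenv"
    print(venv_python_version(target), venv_matches_runtime(target))
```

A venv created on the bastion under Python 3.7 would report (3, 7) here and fail the check when run inside a Python 3.11 container, which matches the site-packages path seen above.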

Ok, thanks, didn't know about the webservice shell. I set up the venv within that shell and it seems to be working now. Thanks so much. This is now fixed, so I can run my bots on a scheduled timeframe.