Migrate grantmetrics from Toolforge GridEngine to Toolforge Kubernetes
Closed, Resolved (Public)

Description

Kindly migrate your tool (https://grid-deprecation.toolforge.org/t/grantmetrics) from Toolforge GridEngine to Toolforge Kubernetes.

Toolforge GridEngine is getting deprecated.
See: https://techblog.wikimedia.org/2022/03/14/toolforge-and-grid-engine/

Please note that a volunteer may perform this migration if this has not been done after some time.
If you have already migrated this tool, kindly mark this as resolved.

If you would rather shut down this tool, kindly do so and mark this as resolved.

Useful Resources:
Migrating Jobs from GridEngine to Kubernetes
https://wikitech.wikimedia.org/wiki/Help:Toolforge/Jobs_framework#Grid_Engine_migration
Migrating Web Services from GridEngine to Kubernetes
https://wikitech.wikimedia.org/wiki/News/Toolforge_Stretch_deprecation#Move_a_grid_engine_webservice
Python
https://wikitech.wikimedia.org/wiki/News/Toolforge_Stretch_deprecation#Rebuild_virtualenv_for_python_users

Event Timeline

My apologies if this ticket comes as a surprise to you. In order to ensure WMCS can provide a stable, secure and supported platform, it’s important we migrate away from GridEngine. I want to assure you that while it is WMCS’s intention to shut down GridEngine as outlined in the blog post https://techblog.wikimedia.org/2022/03/14/toolforge-and-grid-engine/, a shutdown date for GridEngine has not yet been set. The goal of the migration is to move as many tools as possible onto Kubernetes and ensure as smooth a transition as possible for everyone. Once the majority of tools have migrated, discussion of a shutdown date will be more appropriate. See T314664: [infra] Decommission the Grid Engine infrastructure.

As noted in https://techblog.wikimedia.org/2022/03/16/toolforge-gridengine-debian-10-buster-migration/, some use cases are already supported by Kubernetes and should be migrated. If your tool can migrate, please do plan a migration. Reach out if you need help or find you are blocked by missing features. Most of all, WMCS is here to support you.

However, it’s possible your tool needs a mixed runtime environment or some other features that aren't yet present in https://techblog.wikimedia.org/2022/03/18/toolforge-jobs-framework/. We’d love to hear of this or any other blocking issues so we can work with you once a migration path is ready. Thanks for your hard work as volunteers and help in this migration!

@MusikAnimal we are testing the Build Service. Developers can now use buildpack images.
There's a quickstart guide here.

We're also looking for feedback from the community on this service.
If your schedule allows it, kindly take a look and let us know.

MusikAnimal changed the task status from Stalled to Open. Oct 24 2023, 11:23 PM

No longer stalled. I think for this task, a normal Toolforge job will work, since T254636 added support for mysqldump.

I am definitely going to give the Build Service a try, though! :)

As per above I was hoping to use a simple Toolforge job, but it looks like there's something amiss with the mariadb image:

tools.grantmetrics@tools-sgebastion-10:~$ toolforge-jobs run backup --command ./var/backups/backup.sh --image mariadb
tools.grantmetrics@tools-sgebastion-10:~$ cat backup.err 
mysqldump: Got error: 2002: "Can't connect to local server through socket '/run/mysqld/mysqld.sock' (2)" when trying to connect
ERROR 2002 (HY000): Can't connect to local server through socket '/run/mysqld/mysqld.sock' (2)

The relevant part of the script:

GMDIR=/data/project/grantmetrics
DATE=$(date +%Y-%m-%d)

# Dump and compress whole database.
mysqldump --defaults-extra-file=$GMDIR/replica.my.cnf \
  --single-transaction s53550__grantmetrics \
  | gzip -c > $GMDIR/var/backups/grantmetrics_$DATE.sql.gz

I can try using the Build Service instead, but as this is such a tiny backup script, I thought the vanilla Jobs framework would be better.
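
One possible reading of the error above: `mysqldump` falls back to the local socket when no host is given, and there is no database server inside the job container. The sketch below shows the same dump with an explicit `--host`. It is not confirmed as the fix used for this task. The `DB_HOST` value (the ToolsDB hostname) and the `backup_path` and `run_backup` helper names are assumptions introduced for illustration.

```shell
#!/bin/bash
# Sketch only: DB_HOST and both helper functions are assumptions,
# not taken from the original backup.sh.
GMDIR="${GMDIR:-/data/project/grantmetrics}"
DB_HOST="${DB_HOST:-tools.db.svc.wikimedia.cloud}"  # assumed ToolsDB hostname
DB_NAME="s53550__grantmetrics"

# Print the backup file path for a given date (YYYY-MM-DD).
backup_path() {
  printf '%s/var/backups/grantmetrics_%s.sql.gz\n' "$GMDIR" "$1"
}

# Dump and compress the whole database, connecting over TCP to DB_HOST
# instead of the default local socket.
run_backup() {
  local out
  out="$(backup_path "$(date +%Y-%m-%d)")"
  mysqldump --defaults-extra-file="$GMDIR/replica.my.cnf" \
    --host="$DB_HOST" \
    --single-transaction "$DB_NAME" \
    | gzip -c > "$out"
}
```

Calling `run_backup` at the end of the script would perform the dump. `--defaults-extra-file` stays first on the command line because the client only accepts it as the first option.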

This is a reminder that the tool for which this ticket is created is still running on the Grid.
The grid is deprecated and all remaining tools need to migrate to Toolforge Kubernetes.

We've sent several emails to maintainers as we continue to make the move away from the Grid.
Many of the issues that have held users back from moving away from the Grid have been addressed in
the latest updates to Build Service. See: https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Changelog

You might find the following resources helpful in migrating your tool:

  1. https://wikitech.wikimedia.org/wiki/Help:Toolforge/Build_Service#Migrating_an_existing_tool
  2. https://wikitech.wikimedia.org/wiki/Help:Toolforge/Build_Service#Tutorials_for_popular_languages

Don't hesitate to reach out to us using this ticket or via any of our support channels.

If you have already migrated this tool, kindly mark this ticket as 'resolved'.
To do this, click on the 'Add Action' dropdown above the comment text box, select 'Change Status', then 'Resolved'.
Click 'Submit'.

Thank you!

Declining in favor of T353217: Migrate Event Metrics db to Trove. I have commented out the grid engine cron job for now.

I'll look into T353217 soon. Until then, we may miss a few scheduled backups. I'm willing to risk it because in the ~5 years Event Metrics has been around, we haven't once needed to restore from a backup.

MusikAnimal changed the task status from Declined to Resolved. Dec 12 2023, 3:36 AM

Actually, I managed to figure it out :) T353217 is still valid, but not urgent.