Migrate chie-bot from Toolforge GridEngine to Toolforge Kubernetes
Closed, Declined (Public)

Description

Kindly migrate your tool (https://grid-deprecation.toolforge.org/t/chie-bot) from Toolforge GridEngine to Toolforge Kubernetes.

Toolforge GridEngine is getting deprecated.
See: https://techblog.wikimedia.org/2022/03/14/toolforge-and-grid-engine/

Please note that a volunteer may perform this migration if this has not been done after some time.
If you have already migrated this tool, kindly mark this as resolved.

If you would rather shut down this tool, kindly do so and mark this as resolved.

Useful Resources:
Migrating Jobs from GridEngine to Kubernetes
https://wikitech.wikimedia.org/wiki/Help:Toolforge/Jobs_framework#Grid_Engine_migration
Migrating Web Services from GridEngine to Kubernetes
https://wikitech.wikimedia.org/wiki/News/Toolforge_Stretch_deprecation#Move_a_grid_engine_webservice
Python
https://wikitech.wikimedia.org/wiki/News/Toolforge_Stretch_deprecation#Rebuild_virtualenv_for_python_users

Event Timeline

My apologies if this ticket comes as a surprise to you. In order to ensure WMCS can provide a stable, secure and supported platform, it’s important we migrate away from GridEngine. I want to assure you that while it is WMCS’s intention to shut down GridEngine as outlined in the blog post https://techblog.wikimedia.org/2022/03/14/toolforge-and-grid-engine/, a shutdown date for GridEngine has not yet been set. The goal of the migration is to migrate as many tools as possible onto Kubernetes and ensure as smooth a transition as possible for everyone. Once the majority of tools have migrated, discussion on a shutdown date is more appropriate. See T314664: [infra] Decommission the Grid Engine infrastructure.

As noted in https://techblog.wikimedia.org/2022/03/16/toolforge-gridengine-debian-10-buster-migration/, some use cases are already supported by Kubernetes and should be migrated. If your tool can migrate, please do plan a migration. Reach out if you need help or find you are blocked by missing features. Most of all, WMCS is here to support you.

However, it’s possible your tool needs a mixed runtime environment or some other features that aren't yet present in https://techblog.wikimedia.org/2022/03/18/toolforge-jobs-framework/. We’d love to hear of this or any other blocking issues so we can work with you once a migration path is ready. Thanks for your hard work as volunteers and help in this migration!

Hi! My tool is written in dotnet and requires dotnet 6, which you don't have an image for, so it is not possible to migrate at the moment.

This is being addressed in T311466: Create a kubernetes container with mono and dotnet

That ticket is about mono, which is not ideal. Ideally we want a modern version of dotnet (.NET 6). The image is already available from the MS repo: https://hub.docker.com/_/microsoft-dotnet-runtime/

Is there any way to use the MS-supplied image on labs k8s?

At this time we do not allow "bring your own container" nor do we import containers created externally for use by Toolforge tools.

It looks to me like ~chie-bot/run-bot uses the mono runtime today on the grid engine. Is there a particular reason that you would need a different runtime to move these jobs from the grid engine to a Kubernetes container?

Mono is essentially a dying legacy technology (https://github.com/mono/mono/issues/20931#issuecomment-805049183) and, apart from embedded applications requiring AOT (i.e. Xamarin), is only there to maintain support for running old .NET4 apps on Linux. The way forward is to port apps to .NET6.

My tool was using mono because there was no choice years ago. Now an alternative has arrived, and everybody should be migrating to .NET6. I know it's all confusing; here's a good but slightly outdated historical write-up if you are interested: https://stackoverflow.com/questions/62905814/net-5-and-mono

This is all useful information, but it is not obvious how a completely new runtime system is a required feature addition for Toolforge in order for existing tools to migrate from one distributed computing technology to another. You are asking for net new functionality rather than a correction of a regression of features that were available on an older platform. I understand that you would like to have a newer and different runtime, but statements being made in this ticket about your preference being a requirement seem disingenuous.

@Leloiandudu I added a comment about .net6 support to T311466: Create a kubernetes container with mono and dotnet. I'm curious about the same question I asked there for this tool. Does that mono container not work for this tool? What issues do you find in trying to run? If so, what's different about it versus the grid?

@bd808 hi Bryan. I understand that you're worried about scope creep here. Sorry about that. My initial thought was, since we're moving to the new cloud platform, it would make sense to take that opportunity and get the modern runtime.

Mono has been a source of headaches for many years now, and I've only been using it because there were no alternatives available at the time. It's very unreliable and keeps crashing (core dumping) my tools or putting them in an unresponsive state from time to time (I even have a cron job to check if my web-based fountain tool is alive and restart it if it's 500ing). Not to mention the complete mess that SSL support on mono is.

The new .NET6 runtime is reliable, small, much more performant and has a significantly smaller memory footprint. In most cases the existing tools are console-based, so they can get all that nearly for free by following a simple migration process. So my assumption was that the mono container was requested by tools that cannot or don't wish to upgrade right now. If that's not the case, I'm fine to temporarily keep using mono, but please keep in mind that we will have to get a modern version of .NET sooner rather than later, as this is what new tools would want to use (kind of similar to the Python 2/3 problem or new Java versions).

@nskaggs see above

We recently added .NET support for the build service (https://wikitech.wikimedia.org/wiki/Help:Toolforge/Build_Service#.NET). This brings .NET 8 as a default; would that work for you?
If you give it a try, let me know how it goes, or if you have any issues/bugs I can try to help.
Cheers!

Hi @Leloiandudu, just a gentle reminder that there's still a webservice running in the grid. Let me know if you are having more issues or need some help.

This tool has been disabled from running on the Grid.

If you are the maintainer and you want this re-enabled so that you can work on migrating it off the grid, please reach out to the cloud admins on the mailing list (cloud-admin@lists.wikimedia.org).

However, please note that according to the timeline shared, on the 14th of March 2024, the grid infrastructure will be shut down.

taavi subscribed.

The grid engine has been shut down, so I'm closing any remaining migration tasks as Declined. If you're still planning to migrate this tool, please re-open this task and add one or more active project tags to it. (If you need a project tag for your tool, those can be created via the Toolforge admin console.)