As part of the switchover, we stop and then restart maintenance scripts directly on the maintenance hosts. With maintenance scripts moving to Kubernetes, we'll need to update the cookbooks to match.
Before the March 2024 switch:
- Update 01-stop-maintenance.py to delete all Jobs running in the "mw-script" namespace in from_dc.
That's all we need for now, because no periodic maintenance scripts run on Kubernetes yet, only the ones started manually. We should make sure none of those are still running when the read-only phase starts, but we don't need to make any changes to periodic jobs (which are all still on mwmaint), or restart anything in to_dc.
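A minimal sketch of the delete step, not the actual cookbook code. It assumes `batch_api` is a `kubernetes.client.BatchV1Api` already configured for the from_dc cluster; the function name `delete_all_jobs` is hypothetical, and the real cookbook may go through its own Kubernetes helpers instead.

```python
def delete_all_jobs(batch_api, namespace="mw-script"):
    """Ask Kubernetes to delete every Job in the given namespace.

    batch_api is assumed to be a kubernetes.client.BatchV1Api for from_dc.
    """
    # Set a propagation policy explicitly so the Jobs' pods are cleaned up
    # along with the Job objects instead of being left behind.
    batch_api.delete_collection_namespaced_job(
        namespace=namespace,
        propagation_policy="Foreground",
    )
```

The delete call only requests deletion; it returns before the pods have actually stopped, which is why waiting for termination is listed as a follow-up.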
Before the September 2024 switch:
- Update 01-stop-maintenance.py to wait for Jobs to terminate after the delete API call.
- Update 01-stop-maintenance.py to disable cronjobs in from_dc. (Pre-k8s, we just kill the actively-running processes and then hustle to start the next step before the timers restart anything. That works fine, but as long as we're redesigning this, we can freeze the crons too.)
- Update 08-start-maintenance.py to enable cronjobs in to_dc.
- After we add a way for maintenance scripts to be marked as idempotent, update 08-start-maintenance.py to restart in to_dc any idempotent script that was killed in from_dc.
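The wait and the cronjob toggling could look roughly like the sketch below. As assumptions: `batch_api` is a `kubernetes.client.BatchV1Api` for the relevant datacenter, the cronjobs live in the same "mw-script" namespace as the Jobs, and the function names, timeout, and polling interval are all hypothetical. Restarting idempotent scripts is left out, since the marking mechanism doesn't exist yet.

```python
import time


def wait_for_jobs_gone(batch_api, namespace="mw-script", timeout=300.0,
                       interval=2.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until no Jobs remain in the namespace, or raise TimeoutError."""
    deadline = clock() + timeout
    while True:
        remaining = batch_api.list_namespaced_job(namespace).items
        if not remaining:
            return
        if clock() >= deadline:
            names = [job.metadata.name for job in remaining]
            raise TimeoutError(f"Jobs still present in {namespace}: {names}")
        sleep(interval)


def set_cronjobs_suspended(batch_api, suspended, namespace="mw-script"):
    """Set spec.suspend on every CronJob in the namespace.

    Returns the names of the CronJobs that were patched.
    """
    patched = []
    for cronjob in batch_api.list_namespaced_cron_job(namespace).items:
        name = cronjob.metadata.name
        # A suspended CronJob keeps its schedule but creates no new Jobs.
        batch_api.patch_namespaced_cron_job(
            name=name,
            namespace=namespace,
            body={"spec": {"suspend": suspended}},
        )
        patched.append(name)
    return patched
```

Under these assumptions, 01-stop-maintenance.py would call `set_cronjobs_suspended(from_dc_api, True)` before deleting Jobs and then `wait_for_jobs_gone(from_dc_api)`, while 08-start-maintenance.py would call `set_cronjobs_suspended(to_dc_api, False)`. Suspending first means nothing new can be scheduled while the existing Jobs are being torn down.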