Move MediaWiki jobs to mw-on-k8s
Closed, Resolved, Public

Description

We want to eventually move all job running to MediaWiki on Kubernetes (mw-on-k8s).

We need to do the following:

  • Configure the mw-jobrunner deployment to use a PHP configuration similar to that of the production jobrunners.
  • Move jobs, one at a time at first, by pointing them at the new deployment in the changeprop configuration.
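
The job-by-job approach in the list above can be pictured as a per-job routing table. The following is a minimal sketch only: the endpoint URLs and the `endpoint_for` helper are hypothetical, and this is not changeprop's actual configuration format. The job names are taken from the patches listed in this task.

```python
from typing import AbstractSet

# Hypothetical endpoints; the real hostnames are not given in this task.
LEGACY_ENDPOINT = "https://jobrunner.legacy.invalid/run-job"
K8S_ENDPOINT = "https://mw-jobrunner.k8s.invalid/run-job"


def endpoint_for(job_type: str, migrated: AbstractSet[str]) -> str:
    """Return the endpoint a job type is sent to.

    Jobs listed in `migrated` go to the Kubernetes deployment;
    every other job stays on the legacy jobrunners.
    """
    return K8S_ENDPOINT if job_type in migrated else LEGACY_ENDPOINT


# Migrate one job, observe, then add the next one.
migrated = {"cdnPurge"}
print(endpoint_for("cdnPurge", migrated))      # Kubernetes endpoint
print(endpoint_for("refreshLinks", migrated))  # legacy endpoint

# Rolling a job back is the reverse operation: drop it from the set.
migrated.discard("cdnPurge")
print(endpoint_for("cdnPurge", migrated))      # legacy endpoint again
```

Keeping the decision per job type is what makes each step small and reversible, which matches the pattern of "migrate" and "restore" patches in the timeline.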

We will probably need to repurpose some servers as k8s nodes once close to 100% of jobs have moved to mw-on-k8s.

Use https://grafana.wikimedia.org/goto/TSO0UxHIz?orgId=1 to estimate the impact of moving a particular job.
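
One rough way to turn dashboard numbers into a capacity estimate is Little's law: the average number of busy workers equals the job arrival rate multiplied by the mean job duration. The sketch below assumes the rate and mean duration of a job have been read off the dashboard. The function name, the `workers_per_replica` parameter, the headroom factor, and the sample numbers are all illustrative assumptions, not values from this task.

```python
import math


def replicas_needed(jobs_per_second: float,
                    mean_duration_seconds: float,
                    workers_per_replica: int,
                    headroom: float = 1.5) -> int:
    """Estimate how many replicas a job type needs.

    Little's law gives the average number of busy workers as
    rate * duration; `headroom` pads that figure to absorb bursts.
    """
    if workers_per_replica <= 0:
        raise ValueError("workers_per_replica must be positive")
    busy_workers = jobs_per_second * mean_duration_seconds
    return math.ceil(busy_workers * headroom / workers_per_replica)


# Illustrative numbers only: 40 jobs/s at 0.5 s each keeps 20 workers busy;
# with 50% headroom and 8 workers per replica that rounds up to 4 replicas.
print(replicas_needed(40, 0.5, 8))  # 4
```

An estimate like this is only a starting point; the replica bumps in the timeline were adjusted from observed behaviour after each migration.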

Details

Repo                           Branch      Lines +/-
operations/mediawiki-config    master      +1 -1
operations/puppet              production  +2 -0
operations/deployment-charts   master      +1 -1
operations/deployment-charts   master      +5 -40
operations/deployment-charts   master      +1 -1
operations/deployment-charts   master      +1 -1
operations/deployment-charts   master      +2 -2
operations/deployment-charts   master      +2 -2
operations/deployment-charts   master      +1 -1
operations/deployment-charts   master      +2 -2
operations/deployment-charts   master      +1 -1
operations/deployment-charts   master      +1 -1
operations/deployment-charts   master      +1 -1
operations/deployment-charts   master      +1 -1
operations/deployment-charts   master      +1 -1
operations/deployment-charts   master      +5 -0
operations/puppet              production  +0 -0
operations/deployment-charts   master      +1 -1
operations/deployment-charts   master      +1 -0
operations/deployment-charts   master      +10 -0
operations/deployment-charts   master      +0 -1
operations/deployment-charts   master      +4 -0
operations/deployment-charts   master      +2 -0
operations/deployment-charts   master      +1 -0
operations/deployment-charts   master      +1 -1
operations/deployment-charts   master      +1 -0
operations/deployment-charts   master      +1 -1
operations/deployment-charts   master      +1 -0
operations/deployment-charts   master      +1 -0
operations/deployment-charts   master      +1 -0
operations/deployment-charts   master      +4 -0
operations/deployment-charts   master      +5 -0
operations/deployment-charts   master      +2 -0
operations/deployment-charts   master      +1 -5
operations/deployment-charts   master      +1 -1
operations/deployment-charts   master      +5 -0
operations/deployment-charts   master      +15 -4
operations/puppet              production  +1 -7
operations/deployment-charts   master      +1 -1
operations/dns                 master      +2 -0
operations/puppet              production  +2 -0
operations/deployment-charts   master      +2 -3
operations/puppet              production  +2 -2
operations/puppet              production  +1 -1
operations/dns                 master      +1 -1
operations/puppet              production  +36 -0
operations/dns                 master      +4 -0
operations/deployment-charts   master      +30 -0

Event Timeline

There are a very large number of changes, so older changes are hidden.

Change 977210 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] jobqueue: migrate thumbnailrender to k8s

https://gerrit.wikimedia.org/r/977210

Change 977209 merged by jenkins-bot:

[operations/deployment-charts@master] mw-jobrunner: increase replicas

https://gerrit.wikimedia.org/r/977209

Change 977210 merged by jenkins-bot:

[operations/deployment-charts@master] jobqueue: migrate thumbnailrender to k8s

https://gerrit.wikimedia.org/r/977210

Change 978032 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] changejob-jobqueue: move two more jobs to k8s

https://gerrit.wikimedia.org/r/978032

Change 978032 merged by jenkins-bot:

[operations/deployment-charts@master] changejob-jobqueue: move two more jobs to k8s

https://gerrit.wikimedia.org/r/978032

Change 978500 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] mw-jobrunner: increase replicas

https://gerrit.wikimedia.org/r/978500

Change 978630 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] jobqueue: move move jobs to k8s jobrunner

https://gerrit.wikimedia.org/r/978630

Change 978630 merged by jenkins-bot:

[operations/deployment-charts@master] jobqueue: move move jobs to k8s jobrunner

https://gerrit.wikimedia.org/r/978630

Change 979051 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] changeprop-jobqueue: migrate more lower-impact jobs

https://gerrit.wikimedia.org/r/979051

Change 979051 merged by jenkins-bot:

[operations/deployment-charts@master] changeprop-jobqueue: migrate more lower-impact jobs

https://gerrit.wikimedia.org/r/979051

Change 979122 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] changeprop-jobqueue: toggle another job

https://gerrit.wikimedia.org/r/979122

Change 979122 merged by jenkins-bot:

[operations/deployment-charts@master] changeprop-jobqueue: toggle another job

https://gerrit.wikimedia.org/r/979122

Change 979365 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] jobqueue: migrate another medium-weight job

https://gerrit.wikimedia.org/r/979365

Change 979365 merged by jenkins-bot:

[operations/deployment-charts@master] jobqueue: migrate another medium-weight job

https://gerrit.wikimedia.org/r/979365

Change 979395 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] jobqueue: switch a medium weight job

https://gerrit.wikimedia.org/r/979395

Change 979396 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] jobqueue: migrate a moderately weighty job

https://gerrit.wikimedia.org/r/979396

Change 979397 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] jobqueue: migrate a heavyweight job to jobqueue

https://gerrit.wikimedia.org/r/979397

Change 979395 merged by jenkins-bot:

[operations/deployment-charts@master] jobqueue: switch a medium weight job

https://gerrit.wikimedia.org/r/979395

Change 980011 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] mw-jobrunner: increase replicas

https://gerrit.wikimedia.org/r/980011

Change 980011 merged by jenkins-bot:

[operations/deployment-charts@master] mw-jobrunner: increase replicas

https://gerrit.wikimedia.org/r/980011

Change 979396 merged by jenkins-bot:

[operations/deployment-charts@master] jobqueue: migrate a moderately weighty job

https://gerrit.wikimedia.org/r/979396

Change 980369 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] mw-jobrunner: bump replicas further

https://gerrit.wikimedia.org/r/980369

Change 980369 merged by jenkins-bot:

[operations/deployment-charts@master] mw-jobrunner: bump replicas further

https://gerrit.wikimedia.org/r/980369

Change 979397 merged by jenkins-bot:

[operations/deployment-charts@master] jobqueue: migrate a heavyweight job to jobqueue

https://gerrit.wikimedia.org/r/979397

Change 980433 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] jobqueue: migrate a larger job (and one smaller one)

https://gerrit.wikimedia.org/r/980433

Change 980433 merged by jenkins-bot:

[operations/deployment-charts@master] jobqueue: migrate a larger job (and one smaller one)

https://gerrit.wikimedia.org/r/980433

Change 980849 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] changeprop-jobqueue: migrate one large and a few small jobs

https://gerrit.wikimedia.org/r/980849

Change 980849 merged by jenkins-bot:

[operations/deployment-charts@master] changeprop-jobqueue: migrate one large and a few small jobs

https://gerrit.wikimedia.org/r/980849

Change 980853 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] changeprop-jobqueue: restore cirrussearchlinksupdate to metal

https://gerrit.wikimedia.org/r/980853

Change 980853 merged by jenkins-bot:

[operations/deployment-charts@master] changeprop-jobqueue: restore cirrussearchlinksupdate to metal

https://gerrit.wikimedia.org/r/980853

Change 980855 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] changeprop-jobqueue: migrate all remaining small jobs, also cdnPurge

https://gerrit.wikimedia.org/r/980855

Change 980856 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] changeprop-jobqueue: migrate all low-traffic jobs

https://gerrit.wikimedia.org/r/980856

Change 980855 merged by jenkins-bot:

[operations/deployment-charts@master] changeprop-jobqueue: migrate all remaining small jobs, also cdnPurge

https://gerrit.wikimedia.org/r/980855

Change 980856 merged by jenkins-bot:

[operations/deployment-charts@master] changeprop-jobqueue: migrate all low-traffic jobs

https://gerrit.wikimedia.org/r/980856

hnowlan changed the task status from Open to In Progress. Dec 6 2023, 4:27 PM
hnowlan claimed this task.
hnowlan subscribed.

Change 978500 abandoned by Hnowlan:

[operations/deployment-charts@master] mw-jobrunner: increase replicas

Reason:

Scaled up elsewhere, waiting for next jobs and more hw

https://gerrit.wikimedia.org/r/978500

Change 973824 abandoned by Hnowlan:

[operations/puppet@production] kubernetes::worker: add mw-jobrunner to pools

Reason:

Done already

https://gerrit.wikimedia.org/r/973824

Change 983216 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] changeprop-jobqueue: move PublishStashedFile back to non-k8s jobrunner

https://gerrit.wikimedia.org/r/983216

Change 983216 abandoned by Hnowlan:

[operations/deployment-charts@master] changeprop-jobqueue: move PublishStashedFile back to non-k8s jobrunner

Reason:

Already implemented

https://gerrit.wikimedia.org/r/983216

Change 989131 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] mw-jobrunner: increase replicas

https://gerrit.wikimedia.org/r/989131

Change 989133 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] changeprop-jobqueue: move cirrusSearchElasticaWrite

https://gerrit.wikimedia.org/r/989133

Change 989131 merged by jenkins-bot:

[operations/deployment-charts@master] mw-jobrunner: increase replicas

https://gerrit.wikimedia.org/r/989131

Change 989133 merged by jenkins-bot:

[operations/deployment-charts@master] changeprop-jobqueue: move cirrusSearchElasticaWrite

https://gerrit.wikimedia.org/r/989133

Change 989488 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] mw-jobrunner: increase replicas for parsoidCachePrewarm

https://gerrit.wikimedia.org/r/989488

Change 989489 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] changeprop-jobqueue: migrate parsoidCachePrewarm to k8s

https://gerrit.wikimedia.org/r/989489

Change 989488 merged by jenkins-bot:

[operations/deployment-charts@master] mw-jobrunner: increase replicas for parsoidCachePrewarm

https://gerrit.wikimedia.org/r/989488

Change 989489 merged by jenkins-bot:

[operations/deployment-charts@master] changeprop-jobqueue: migrate parsoidCachePrewarm to k8s

https://gerrit.wikimedia.org/r/989489

Change 991377 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] changeprop-jobqueue: disable ThumbnailRender on k8s

https://gerrit.wikimedia.org/r/991377

Change 991377 merged by jenkins-bot:

[operations/deployment-charts@master] changeprop-jobqueue: disable ThumbnailRender on k8s

https://gerrit.wikimedia.org/r/991377

Change 1003499 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] mw-jobrunner: bump replicas for cirrusSearchLinksUpdate

https://gerrit.wikimedia.org/r/1003499

Change 1003499 merged by jenkins-bot:

[operations/deployment-charts@master] mw-jobrunner: bump replicas for cirrusSearchLinksUpdate

https://gerrit.wikimedia.org/r/1003499

Mentioned in SAL (#wikimedia-operations) [2024-02-15T11:57:13Z] <cgoubert@deploy2002> Finished scap: Deploying mw-on-k8s 1003499 1003393 - T349796 T357507 (duration: 00m 50s)

Change 1003617 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] jobqueue: migrate cirrusSearchLinksUpdate to k8s

https://gerrit.wikimedia.org/r/1003617

Change 1003617 merged by jenkins-bot:

[operations/deployment-charts@master] jobqueue: migrate cirrusSearchLinksUpdate to k8s

https://gerrit.wikimedia.org/r/1003617

Change 1003772 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] mw-jobrunner: bump replicas

https://gerrit.wikimedia.org/r/1003772

Change 1003772 merged by jenkins-bot:

[operations/deployment-charts@master] mw-jobrunner: bump replicas

https://gerrit.wikimedia.org/r/1003772

Change 1004062 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] mw-jobrunner: bump replicas in order to migrate refreshLinks

https://gerrit.wikimedia.org/r/1004062

Change 1004063 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] changeprop-jobqueue: migrate refreshLinks to k8s

https://gerrit.wikimedia.org/r/1004063

Change 1004066 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] changeprop: clean up k8s jobrunner references

https://gerrit.wikimedia.org/r/1004066

Change 1004062 merged by jenkins-bot:

[operations/deployment-charts@master] mw-jobrunner: bump replicas in order to migrate refreshLinks

https://gerrit.wikimedia.org/r/1004062

Change 1004063 merged by jenkins-bot:

[operations/deployment-charts@master] changeprop-jobqueue: migrate refreshLinks to k8s

https://gerrit.wikimedia.org/r/1004063

Change 1004662 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] mw-jobrunner: begin to scale down replicas

https://gerrit.wikimedia.org/r/1004662

Change 1004662 merged by jenkins-bot:

[operations/deployment-charts@master] mw-jobrunner: begin to scale down replicas

https://gerrit.wikimedia.org/r/1004662

hnowlan updated the task description.

Change 1005121 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] mw-jobrunner: reduce replicas further

https://gerrit.wikimedia.org/r/1005121

Change 1004066 merged by jenkins-bot:

[operations/deployment-charts@master] changeprop: clean up k8s jobrunner references

https://gerrit.wikimedia.org/r/1004066

All non-videoscaler jobs have been migrated to Kubernetes jobrunners. Videoscaler work is tracked in T355292.

Change #1005121 abandoned by Hnowlan:

[operations/deployment-charts@master] mw-jobrunner: reduce replicas further

Reason:

Decided against

https://gerrit.wikimedia.org/r/1005121

Change #1018420 had a related patch set uploaded (by Cwhite; author: Cwhite):

[operations/puppet@production] service catalog: disable paging on jobrunner and videoscaler services

https://gerrit.wikimedia.org/r/1018420

Change #1018420 merged by Cwhite:

[operations/puppet@production] service catalog: disable paging on jobrunner and videoscaler services

https://gerrit.wikimedia.org/r/1018420

Change #1021886 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/mediawiki-config@master] CommonSettings.php: Fix jobrunner hostname

https://gerrit.wikimedia.org/r/1021886

Change #1021886 abandoned by Hashar:

[operations/mediawiki-config@master] CommonSettings.php: Fix jobrunner hostname

Reason:

T129982 is from 8 years ago and T349796 is the task to migrate to Kubernetes. The xff log spam is partially fixed by https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1020277 and has a pending follow up patch https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1025391

https://gerrit.wikimedia.org/r/1021886