
Migrate all jobs to labs slaves
Closed, Resolved (Public)

Description

The architecture for the disposable VMs (T47499) will have Zuul mergers in the labs infrastructure. None of the production slaves (gallium, lanthanum) will be able to reach them because of security constraints.

Thus we need to migrate all jobs tied to the productionSlaves label to labs instances (contintLabSlaves).

Tied to gallium

To keep on gallium:

  • integration-docroot-deploy: keep, needs gallium for the web server
  • publish-doc: keep, needs gallium for the web server

To be moved:

  • jsduck - T86175
    • mediawiki-core-jsduck
    • mediawiki-core-jsduck-publish
    • oojs-core-jsduck
    • oojs-core-jsduck-publish
    • oojs-ui-jsduck
    • oojs-ui-jsduck-publish
    • unicodejs-jsduck
    • unicodejs-jsduck-publish - https://gerrit.wikimedia.org/r/187798
    • VisualEditor-jsduck
    • mwext-GuidedTour-jsduck
    • mwext-GuidedTour-jsduck-publish
    • mwext-MultimediaViewer-jsduck
    • mwext-MultimediaViewer-jsduck-publish
    • mwext-VisualEditor-jsduck
    • mwext-VisualEditor-jsduck-publish

Tied to production slaves (gallium or lanthanum):

  • jobs which have no node: stanza in JJB configuration file. Important
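A rough way to spot such jobs is to walk the parsed JJB definitions and list every job entry lacking a node key. The sketch below is illustrative only: it assumes the YAML has already been loaded into Python dicts, the job names are made up, and it ignores node values inherited from defaults or job templates, so a real audit would need to expand those first.

```python
from typing import Any, Dict, Iterable, List


def jobs_without_node(definitions: Iterable[Dict[str, Any]]) -> List[str]:
    """Return the names of 'job' entries that carry no 'node' key."""
    missing = []
    for entry in definitions:
        job = entry.get("job")
        if job is None:
            # Not a plain job entry ('project', 'defaults', 'job-template', ...).
            continue
        if not job.get("node"):
            missing.append(job["name"])
    return sorted(missing)


# Hypothetical, already-parsed JJB entries (not the real Wikimedia config).
sample = [
    {"job": {"name": "example-roaming-job"}},
    {"job": {"name": "example-pinned-job", "node": "contintLabSlaves"}},
    {"project": {"name": "example-project", "jobs": ["example-roaming-job"]}},
]

print(jobs_without_node(sample))
```

Only example-roaming-job is reported here, since it is the only job entry with no node restriction.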

Jobs roaming on any slave

ssh gallium.wikimedia.org "grep -l '<canRoam>true</canRoam>'  /var/lib/jenkins/jobs/*/config.xml | cut -d\/ -f6"

DONE as of July 3rd 2015.
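The grep one-liner above can also be sketched in Python, which parses each config.xml instead of matching raw text. This is a minimal sketch, not the tooling actually used for this task: the function name roaming_jobs is made up, and it assumes the standard Jenkins layout of one directory per job containing a config.xml that Python's XML parser can read.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import List


def roaming_jobs(jobs_dir: Path) -> List[str]:
    """Return names of jobs whose config.xml has <canRoam>true</canRoam>."""
    names = []
    for config in sorted(jobs_dir.glob("*/config.xml")):
        try:
            root = ET.parse(config).getroot()
        except ET.ParseError:
            # Unparseable config; a real audit would report it instead.
            continue
        # Look at every <canRoam> element, like grep scanning the whole file.
        if any((el.text or "").strip() == "true" for el in root.iter("canRoam")):
            names.append(config.parent.name)
    return names
```

On the Jenkins master this would be called as roaming_jobs(Path("/var/lib/jenkins/jobs")), the same directory the grep command inspects.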

Search jobs: https://gerrit.wikimedia.org/r/217507

DONE

  • search-extra
  • search-extra-javadoc
  • search-extra-javadoc-publish
  • search-highlighter
  • search-repository-swift

Related Objects

Status    Assigned
Resolved  hashar
Resolved  hashar
Invalid   Ryasmeen
Declined  None
Declined  None
Resolved  hashar
Resolved  hashar
Resolved  hashar
Resolved  hashar
Resolved  Krinkle
Resolved  hashar
Resolved  Krinkle
Resolved  Krinkle
Resolved  Krinkle
Resolved  hashar
Resolved  Krinkle
Resolved  hashar
Resolved  hashar
Resolved  hashar
Resolved  hashar

Event Timeline

hashar triaged this task as Medium priority. Feb 10 2015, 8:57 PM
Krinkle renamed this task from Migrate all jobs depending on Zuul git repos out of production slaves to Migrate all jobs to labs slaves. Mar 3 2015, 11:04 PM
Krinkle updated the task description.
Krinkle updated the task description.

Change 198655 had a related patch set uploaded (by Krinkle):
Migrate mobile-qunit jobs to new CI slaves in labs

https://gerrit.wikimedia.org/r/198655

Change 198655 merged by jenkins-bot:
Migrate mobile-qunit jobs to new CI slaves in labs

https://gerrit.wikimedia.org/r/198655

Change 204706 had a related patch set uploaded (by Legoktm):
Use generic phpunit job for operations/mediawiki-config

https://gerrit.wikimedia.org/r/204706

Change 204707 had a related patch set uploaded (by Legoktm):
Pin generic 'phpunit' job to labs slaves

https://gerrit.wikimedia.org/r/204707

Change 204706 merged by jenkins-bot:
Use generic phpunit job for operations/mediawiki-config

https://gerrit.wikimedia.org/r/204706

Change 204707 merged by jenkins-bot:
Pin generic 'phpunit' job to labs slaves

https://gerrit.wikimedia.org/r/204707

Change 204980 had a related patch set uploaded (by Legoktm):
Convert 'mediawiki-vagrant-puppet-doc' job to run on a labs slave

https://gerrit.wikimedia.org/r/204980

Change 204982 had a related patch set uploaded (by Legoktm):
Convert 'operations-puppet-doc' job to run on a labs slave

https://gerrit.wikimedia.org/r/204982

Change 204980 merged by jenkins-bot:
Convert 'mediawiki-vagrant-puppet-doc' job to run on a labs slave

https://gerrit.wikimedia.org/r/204980

I have added some jobs that do not have any node: stanza and hence roam on any of our slaves (the Jenkins XML has <canRoam>true</canRoam>).

Change 217507 had a related patch set uploaded (by Hashar):
Tie search* jobs to labs and Trusty

https://gerrit.wikimedia.org/r/217507

Change 217507 merged by jenkins-bot:
Tie search* jobs to labs and Trusty

https://gerrit.wikimedia.org/r/217507

Change 217508 had a related patch set uploaded (by Hashar):
Phase out wikimedia/bugzilla/wikibugs

https://gerrit.wikimedia.org/r/217508

Change 217508 merged by jenkins-bot:
Phase out wikimedia/bugzilla/wikibugs

https://gerrit.wikimedia.org/r/217508

Change 217509 had a related patch set uploaded (by Hashar):
Migrate *gembuild to Precise labs

https://gerrit.wikimedia.org/r/217509

Change 217509 merged by jenkins-bot:
Migrate *gembuild to Precise labs

https://gerrit.wikimedia.org/r/217509

I believe nothing will run on lanthanum.eqiad.wmnet anymore. What is left to migrate are the jobs currently running on gallium. Some might be migrated to labs, some not.

hashar lowered the priority of this task from Medium to Low. Sep 14 2015, 8:51 AM
hashar removed a project: Patch-For-Review.

Almost completed. There are still a few jobs on gallium.wikimedia.org though.

Change 204982 merged by jenkins-bot:
Convert 'operations-puppet-doc' job to run on a labs slave

https://gerrit.wikimedia.org/r/204982

Status as of October 28th 2015

$ ssh gallium.wikimedia.org ls -1 /srv/ssd/jenkins-slave/workspace
integration-docroot-deploy
mediawiki-vagrant-puppet-doc
mwext-VisualEditor-sync-gerrit
publish-on-gallium

I suppose we could turn integration-docroot-deploy into an rsync job like the doc ones? Is it worth it?

I'm guessing mediawiki-vagrant-puppet-doc is straightforward, similar to the ops puppet one.

Is it even possible to move publish-on-gallium onto labs?

Stashbot subscribed.

Mentioned in SAL [2016-09-08T10:02:05Z] <hashar> Delete mwext-VisualEditor-sync-gerrit job, already got removed by ostriches in 139d17c8f1c4bcf2bb761e13a6501e4d85684066 . The issue in Gerrit (T51846) has been fixed. Poke T86659 , one less job on slaves.

Mentioned in SAL [2016-09-08T10:03:29Z] <hashar> Delete Jenkins job https://integration.wikimedia.org/ci/job/mwext-VisualEditor-sync-gerrit/ that has been left behind. It is no more needed. T51846 T86659

This is almost finished; we just need to do

jobs which have no node: stanza in JJB configuration file. Important

now.

Mentioned in SAL (#wikimedia-releng) [2016-11-24T20:49:59Z] <hashar> make contint1001 Jenkins slave to only builds jobs with a label matching the node https://integration.wikimedia.org/ci/computer/contint1001/configure T86659

hashar claimed this task.

Paladox pointed out:

jobs which have no node: stanza in JJB configuration file. Important

I have changed the contint1001 slave to:

Only build jobs with label restrictions matching this node

In this mode, Jenkins will only build a project on this node when that project is restricted to certain nodes using a label expression, and that expression matches this node's name and/or labels. This allows a slave to be reserved for certain kinds of jobs. For example, to run performance tests continuously from Jenkins, you can use this setting with # of executors as 1, so that only one performance test runs at any given time, and that one executor won't be blocked by other builds that can be done on other nodes.

This way it only runs jobs that have node: contint1001 :]
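The effect of that setting can be modelled roughly as follows. This is a simplified sketch rather than Jenkins' real scheduler: the Node class and can_build function are invented for illustration, the labs node name is hypothetical, and a job's restriction is treated as a single label, whereas Jenkins label expressions also allow boolean operators.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Node:
    name: str
    labels: FrozenSet[str] = frozenset()
    # True models "Only build jobs with label restrictions matching this node".
    exclusive: bool = False


def can_build(node: Node, job_label: Optional[str]) -> bool:
    """Decide whether a job may run on a node (single-label simplification)."""
    if job_label is None:
        # Unrestricted job: it roams, but never onto an exclusive node.
        return not node.exclusive
    # Restricted job: the label must match the node's name or one of its labels.
    return job_label == node.name or job_label in node.labels


contint1001 = Node("contint1001", exclusive=True)
# Hypothetical labs slave carrying the label from the task description.
labs_slave = Node("example-labs-slave", frozenset({"contintLabSlaves"}))

print(can_build(contint1001, None))           # job without a node: stanza
print(can_build(contint1001, "contint1001"))  # job with node: contint1001
print(can_build(labs_slave, None))            # roaming job on a labs slave
```

In this model a job with no node: stanza is refused by contint1001 but is still accepted by the labs slave, which is the outcome described above.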

The rest of the jobs running on the production machine contint1001 are related to publishing artifacts for doc.wikimedia.org. They will eventually be moved to another host via another task (can't find it).