
contint2002 service implementation tracking
Closed, ResolvedPublic

Description

This is to track the service implementation of the serviceops host contint2002, which is the primary CI server running Jenkins/Zuul etc. It is done independently from contint1002, which is simpler (T313832).

topic branch: https://gerrit.wikimedia.org/r/q/topic:contint2002
changes linked to this task: https://gerrit.wikimedia.org/r/q/bug:T324659

migration checklist

ahead of maintenance

Jenkins:

  • Allow contint2002 to ssh/port 22 and rsync/port 873 to integration instances via Horizon security group
  • Allow contint2002 to ssh/port 22 to puppet-diffs
  • Allow contint2002 to ssh/port 22 to deployment-prep
  • Add contint2002 as a Jenkins agent and put it offline

Move zuul-merger from old to new host

  • Merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/936266
  • Run puppet on old host (contint2001) to disable zuul-merger
  • Run puppet on new host (contint2002) to enable zuul-merger
  • Verify service
  • zuul-merger stopped on contint2001, started on contint2002. It is reflected on the Zuul server at Gearman level (zuul-gearman.py workers|grep merger). CI builds are already using the new instance :)
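A minimal verification sketch for that last step (assuming the service unit is named zuul-merger; zuul-gearman.py is the helper already mentioned above):

# on contint2001: the merger should be stopped
sudo systemctl status zuul-merger
# on contint2002: the merger should be running
sudo systemctl status zuul-merger
# on the Zuul/Gearman server: the new merger should be registered
zuul-gearman.py workers | grep merger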

synchronize build artifacts

  • sudo rsync -ap --whole-file --delete-delay --info=progress2 /srv/jenkins/ rsync://contint2002.wikimedia.org/ci--srv-jenkins-
  • sudo rsync -ap --whole-file --delete-delay --info=progress2 /var/lib/jenkins/ rsync://contint2002.wikimedia.org/ci--var-lib-jenkins-
  • sudo rsync -ap --whole-file --delete-delay --info=progress2 /var/lib/zuul/ rsync://contint2002.wikimedia.org/ci--var-lib-zuul-

Stop all services

  • downtime both hosts contint2001 contint2002
    • sudo cookbook sre.hosts.downtime -r "Switch contint hosts for hardware replacement" -t T324659 -M 60 contint2001.wikimedia.org
    • sudo cookbook sre.hosts.downtime -r "Switch contint hosts for hardware replacement" -t T324659 -M 60 contint2002.wikimedia.org
  • stop puppet on both hosts: sudo disable-puppet "Switch contint hosts for hardware replacement - T324659"
  • sudo systemctl stop jenkins and sudo systemctl stop zuul on contint2001
  • Verify Jenkins and Zuul are stopped/masked on contint2002
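A possible way to check the last two items (a sketch; assumes the standard jenkins and zuul systemd unit names used elsewhere in this checklist):

# on contint2001: both should report inactive after the stop
sudo systemctl is-active jenkins zuul
# on contint2002: both should still be masked/disabled until Puppet flips the primary
sudo systemctl is-enabled jenkins zuul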

rsync data and states

Now that services are stopped, resynchronize all artifacts and states:

  • sudo rsync -ap --whole-file --delete-delay --info=progress2 /srv/jenkins/ rsync://contint2002.wikimedia.org/ci--srv-jenkins-
  • sudo rsync -ap --whole-file --delete-delay --info=progress2 /var/lib/jenkins/ rsync://contint2002.wikimedia.org/ci--var-lib-jenkins-
  • sudo rsync -ap --whole-file --delete-delay --info=progress2 /var/lib/zuul/ rsync://contint2002.wikimedia.org/ci--var-lib-zuul-

change DNS

change primary in Puppet / Hiera

Start services

  • (step we missed) Update the Zuul configuration in /etc/zuul/wikimedia: ./fab deploy_zuul
  • enable Puppet on new host: sudo enable-puppet "Switch contint hosts for hardware replacement - T324659"
  • run Puppet agent on new host (which should apply config changes and bring up both Jenkins and Zuul)
  • verify Zuul
  • verify Jenkins
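A rough verification sketch for the last two items (the URLs are the public endpoints mentioned elsewhere in this task; adjust if the layout differs):

# services are up on the new primary
sudo systemctl status zuul zuul-merger jenkins
# Zuul status page and Jenkins UI answer over HTTP
curl -sI https://integration.wikimedia.org/zuul/ | head -n 1
curl -sI https://integration.wikimedia.org/ci/ | head -n 1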

Reflect stopped services on old host

  • enable puppet again on old host: sudo enable-puppet "Switch contint hosts for hardware replacement - T324659"
  • run Puppet agent on old host (stop/disable/mask Jenkins, Zuul)

After maintenance

  • remove the Horizon security rules allowing contint2001 (208.80.153.15) to reach the integration, puppet-diffs and deployment-prep projects (see the SAL entry below)

Details

Repo | Branch | Lines +/-
integration/config | master | +7 -7
integration/config | master | +3 -3
operations/puppet | production | +1 -2
operations/puppet | production | +1 -1
operations/puppet | production | +1 -1
operations/dns | master | +1 -1
operations/puppet | production | +2 -1
operations/puppet | production | +2 -0
operations/puppet | production | +2 -0
operations/puppet | production | +34 -0
operations/puppet | production | +1 -1
operations/puppet | production | +53 -67
operations/puppet | production | +14 -14
operations/puppet | production | +181 -172
operations/puppet | production | +3 -0
labs/private | master | +16 -0
operations/puppet | production | +17 -2
operations/puppet | production | +16 -10
operations/puppet | production | +9 -1
operations/puppet | production | +0 -4
operations/puppet | production | +32 -33
operations/puppet | production | +1 -1
operations/puppet | production | +3 -1
operations/puppet | production | +2 -2
operations/puppet | production | +1 -0
operations/puppet | production | +8 -0
operations/puppet | production | +1 -0

Event Timeline


Change 907898 had a related patch set uploaded (by Hashar; author: Hashar):

[labs/private@master] ci: add secrets for ci::manager and ci::worker roles

https://gerrit.wikimedia.org/r/907898

Change 907898 merged by Hashar:

[labs/private@master] ci: add secrets for ci::manager and ci::worker roles

https://gerrit.wikimedia.org/r/907898

Change 867670 abandoned by Hashar:

[operations/puppet@production] scap: add contint2002 to ci-docroot, jenkins, zuul deploy

Reason:

I have made the list of scap targets be populated from the Puppet DB with https://gerrit.wikimedia.org/r/c/operations/puppet/+/893483/ .

This change prompted me to get the pending change deployed and verified (I ran a dummy deployment for the new hosts).

https://gerrit.wikimedia.org/r/867670

Change 908232 had a related patch set uploaded (by Jbond; author: Hashar):

[operations/puppet@production] ci: indicate which server is the control server via a hiera param

https://gerrit.wikimedia.org/r/908232

Change 907886 abandoned by Hashar:

[operations/puppet@production] ci: split contint hosts to different roles

Reason:

Abandoning in favor of a single role for all hosts and using a hiera setting to define which one is the primary: https://gerrit.wikimedia.org/r/c/operations/puppet/+/908232/

https://gerrit.wikimedia.org/r/907886

Change 907885 merged by Jbond:

[operations/puppet@production] ci: rename ci::master role to ci

https://gerrit.wikimedia.org/r/907885

Change 908232 merged by Jbond:

[operations/puppet@production] ci: indicate which server is the control server via a hiera param

https://gerrit.wikimedia.org/r/908232

The migration process involves an rsync daemon running in a chroot, which is thus unable to do user/group id mapping between the hosts. Short of removing the use chroot clause, all user and group ids need to be reserved in Puppet.

The transferred directories are:

hieradata/role/common/ci.yaml
profile::ci::migration::rsync_data_dirs:
  - "/var/lib/jenkins/"
  - "/var/lib/zuul/"
  - "/srv/"

On the current primary I went with a brute force approach to find all used user/groups:

sudo find /var/lib/jenkins /var/lib/zuul /srv -printf '| %u | %U | %g | %G\n'|uniq > uid_gid.txt

Which gives me:

User | UID
65533 | 65533
900 | 900
_apt | 100
brennen | 20958
dancy | 25006
deploy-ci-docroot | 492
deploy-jenkins | 491
deploy-service | 494
deploy-zuul | 493
hashar | 1010
jenkins | 499
jenkins-slave | 498
jforrester | 2417
root | 0
slyngshede | 39083
thcipriani | 11634
zuul | 497

Group | GID
65533 | 65533
900 | 900
adm | 4
bacula | 118
deploy-ci-docroot | 494
deploy-jenkins | 493
deploy-service | 496
deploy-zuul | 495
jenkins | 499
jenkins-slave | 1001
mail | 8
nogroup | 65534
root | 0
shadow | 42
staff | 50
tty | 5
ulog | 119
utmp | 43
wikidev | 500
zuul | 498

Some are human users that have their uid reserved via modules/admin/data/data.yaml. The deploy-* users are created by Puppet for the scap::target. I guess it is sufficient to NOT rsync /srv/deployment.
There are unknowns such as _apt (100), 65533 and 900, which apparently come from /srv/docker and which we do not need to rsync (the images will be downloaded from the Docker registry when they are missing).

Further inspecting what is in /srv, we only need to rsync /srv/jenkins, which has:

sudo find /srv/jenkins -printf '| %u | %U | %g | %G\n'|uniq|sort|uniq

User | UID | Group | GID
jenkins | 499 | jenkins | 499
root | 0 | root | 0

And for /var/lib/jenkins and /var/lib/zuul:

sudo find /var/lib/jenkins /var/lib/zuul -printf '| %u | %U | %g | %G\n'|uniq|sort|uniq

User | UID | Group | GID
jenkins | 499 | adm | 4
jenkins | 499 | jenkins | 499
jenkins | 499 | nogroup | 65534
root | 0 | bacula | 118
root | 0 | jenkins | 499
root | 0 | root | 0
zuul | 497 | zuul | 498

The adm group comes from /var/lib/jenkins/logs and is set by Puppet. Debian might have hardcoded it to GID=4.

/var/lib/jenkins/.gitignore is owned by root:bacula, which I believe comes from a restore from backup we did when gallium had a disk crash. We no longer use git to manage that directory, so I went ahead and deleted it on contint2001 (it was not on the other hosts).

We thus need to reserve a UID/GID for jenkins and zuul, then migrate the files to the new UIDs (which will take a while since there are a lot of files owned by jenkins under /srv/jenkins).


The alternative is to set use chroot = no (which implies numeric ids = no, letting rsync do the name mapping).
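For illustration, a module stanza along those lines would look roughly like this (a sketch, not the actual puppetized /etc/rsyncd.conf on these hosts):

[ci--var-lib-zuul-]
    path = /var/lib/zuul
    read only = no
    # with the chroot disabled, the daemon can resolve names again
    use chroot = no
    # make the intent explicit: map by user/group name, not raw ids
    numeric ids = no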

Change 917908 had a related patch set uploaded (by Hashar; author: Hashar):

[operations/puppet@production] ci: in /srv only migrate /srv/jenkins

https://gerrit.wikimedia.org/r/917908
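Based on the change subject, the Hiera list from the comment above presumably ends up narrowed to something like this (a sketch; the authoritative content is in the Gerrit change):

hieradata/role/common/ci.yaml
profile::ci::migration::rsync_data_dirs:
  - "/var/lib/jenkins/"
  - "/var/lib/zuul/"
  - "/srv/jenkins/"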

+ @jnuche who co manages our Jenkins nowadays. This task is to migrate the Jenkins/Zuul/integration website services from contint2001.wikimedia.org to contint2002.wikimedia.org.

Change 917916 had a related patch set uploaded (by Hashar; author: Hashar):

[operations/puppet@production] admin: reserve jenkins and zuul uid/gid

https://gerrit.wikimedia.org/r/917916

Change 917918 had a related patch set uploaded (by Hashar; author: Hashar):

[operations/puppet@production] zuul: switch to fixed uid/gid 923

https://gerrit.wikimedia.org/r/917918

Change 917919 had a related patch set uploaded (by Hashar; author: Hashar):

[operations/puppet@production] jenkins: switch to fixed uid/gid 924

https://gerrit.wikimedia.org/r/917919

Change 917908 merged by Dzahn:

[operations/puppet@production] ci: in /srv only migrate /srv/jenkins

https://gerrit.wikimedia.org/r/917908

Change 917916 merged by Dzahn:

[operations/puppet@production] admin: reserve jenkins and zuul uid/gid

https://gerrit.wikimedia.org/r/917916

Change 917918 merged by Dzahn:

[operations/puppet@production] zuul: switch to fixed uid/gid 923

https://gerrit.wikimedia.org/r/917918

After carefully deploying the change above on all 3 contint* servers, stopping services, running manual chown commands, starting services, verifying, etc., the zuul user now has uid:gid 923:923 instead of the old 497:498. This is good for several reasons.

[contint2002:~] $ id zuul
uid=923(zuul) gid=923(zuul) groups=923(zuul)

[contint2001:~] $ id zuul
uid=923(zuul) gid=923(zuul) groups=923(zuul)

A find / -uid 497 and find / -gid 498 on all hosts also showed there were no files left owned by the old UID/GID.

Monitoring alerted and recovered. Triggering a "recheck" on Gerrit worked and got jenkins vote.

  • disable puppet
  • stop services
  • chown -R 923:923 /srv/zuul/git /var/lib/zuul
  • chown -R 923:923 /var/log/zuul_repack/ /var/log/zuul/
  • chown -R 923:923 /etc/zuul/
  • sudo chown root:root /var/lib/zuul/{.gitconfig,git-template*}
  • re-enable puppet, it corrects /var/log/zuul to zuul:adm
  • puppet does the actual change for the zuul user to point to uid 923 instead of 497 (and gid 923 instead of 498)
  • id zuul (to verify)
  • start service
  • find / -uid 497 (to verify)
  • find / -gid 498 (to verify)
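The two verification finds at the end can also be combined into one pass (a sketch, same intent as the separate commands above):

sudo find / \( -uid 497 -o -gid 498 \) -print 2>/dev/null | head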

Mentioned in SAL (#wikimedia-operations) [2023-05-23T23:30:05Z] <mutante> contint*, releases* - maintenance - changing UID of jenkins user - jenkins will be stopped for a little bit, releases-jenkins is first though - T324659

Change 917919 merged by Dzahn:

[operations/puppet@production] jenkins: switch to fixed uid/gid 924

https://gerrit.wikimedia.org/r/917919

After carefully deploying the patch above to change the jenkins UID/GID, following the instructions, changing file ownership, etc. (details in the Gerrit comments), we now have:

[contint2001:~] $ id jenkins
uid=924(jenkins) gid=924(jenkins) groups=924(jenkins)

[contint2002:~] $ id jenkins
uid=924(jenkins) gid=924(jenkins) groups=924(jenkins)

[contint1002:/tmp] $ id jenkins
uid=924(jenkins) gid=924(jenkins) groups=924(jenkins)

[releases1002:~] $ id jenkins
uid=924(jenkins) gid=924(jenkins) groups=924(jenkins)

[releases2002:~] $ id jenkins
uid=924(jenkins) gid=924(jenkins) groups=924(jenkins)

so no more worrying about rsync and file ownership when migrating. yay!

Please also see T334517#8904608 for a plan on how to proceed with contint* upgrades.

Also today we upgraded PHP from 7.3 to 7.4 on all contint*, old and new.

@hashar This new machine is on buster. Somehow I thought we did bullseye from the start. I suggest we reimage it. See linked comment above on other ticket as well.

@hashar This new machine is on buster. Somehow I thought we did bullseye from the start. I suggest we reimage it. See linked comment above on other ticket as well.

I commented on it at T334517#8905950 and we can't realistically upgrade to Bullseye for the reasons mentioned in that comment.

For switching the services contint2001 (Buster) to the new hardware contint2002 (still Buster):

  • the primary blocker was being able to easily switch over, which required fixed UIDs due to a limitation in how we run rsync; those steps have been accomplished above.
  • other steps to ease the switch over were implemented last time (two years ago?), so what remains is mostly rsync, updating Puppet settings and updating DNS for contint.wikimedia.org

Then we can decommission the old hardware contint2001. Daniel proposed to reimage it to Bullseye for testing the OS upgrade, but we can test the applications in Docker images and/or WMCS, so it is not necessary to delay the decommissioning.

I guess I will update the runbook draft at https://www.mediawiki.org/wiki/Continuous_integration/Data_center_switch

contint.wikimedia.org

minor correction, it's https://integration.wikimedia.org

Daniel proposed to reimage it to Bullseye for testing the OS upgrade but we can test the applications in Docker images and/or WMCS so it is not necessary to delay the decommissioning.

The suggestion was more specifically to reimage old hardware to bullseye _after_ having switched to the new hardware.

It would still give us a status where no production services run on old hardware and what is running on old hardware is NOT on buster.

That would fix everything for us because we are merely interested in not having buster machines around and not running prod services on hardware out of warranty.

Using the old hardware for testing things on bullseye would be just fine.

Change 933196 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/dns@master] switch contint.wikimedia.org from contint2001 to contint2002

https://gerrit.wikimedia.org/r/933196

https://www.mediawiki.org/wiki/Continuous_integration/Data_center_switch exists but at least one thing is outdated: it mentions rsyncing "over ssh as root", and that hasn't worked in years. But the good part is that when checking I found we had already puppetized the modern way of rsyncing.

Dzahn removed Dzahn as the assignee of this task. Jun 30 2023, 10:37 PM

First of all, sorry for the long delay; the last 2-3 weeks have pretty much been jumping from one interrupt to another.

@Jelto reached out to me yesterday about it to sync up on when we can do the switch and we talked about it during the SRE-collab / releng sync up meeting today.

Daniel has nicely improved the process (notably the DNS discovery entry) and addressed the rsync issue which messed up file ownership (by using fixed UIDs/GIDs).
Daniel wrote Puppet patches which can be found via https://gerrit.wikimedia.org/r/q/bug:T324659: one for the DNS change, another to adjust the Zuul configuration.

The preparation steps:

  • Figure out the rsync commands to move the state directories: /srv/jenkins/builds, /var/lib/jenkins and /var/lib/zuul/times (see e.g. T224591#6039192).
  • Update the runbook at https://www.mediawiki.org/wiki/Continuous_integration/Data_center_switch which will make the switch easier. The sequence can be added to this task description (similar to T224591).
  • a Puppet patch to disable Zuul/Jenkins on the old host and enable it on the new host. IIRC it is driven by Hiera settings

As for scheduling, it is easier to do it during the European morning since CI is less active. We talked about Tuesday 8am UTC / 10 am CEST which is after the automatic MediaWiki train deployment and immediately after the UTC morning backport window.

The transfer of /srv/jenkins/builds takes a bit of time since there are a few hundred megabytes of data to move around. We should do one big sync before the switch, probably at some point on Monday (the day before), and possibly an extra one on Tuesday morning before the actual switch to catch up with builds that have happened in between.

Change 935919 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/puppet@production] ci/zuul: set contint2002 as the active ci::manager_host

https://gerrit.wikimedia.org/r/935919

Thanks @hashar for the detailed summary!

Regarding rsync, the following commands should be needed (executed on contint2001, because contint2002 has rsyncd configured; see /etc/rsync.d/ on both hosts):

rsync -avpn --bwlimit=50000 /srv/jenkins/ rsync://contint2002.wikimedia.org/ci--srv-jenkins-
rsync -avpn --bwlimit=50000 /var/lib/jenkins/ rsync://contint2002.wikimedia.org/ci--var-lib-jenkins-
rsync -avpn --bwlimit=50000 /var/lib/zuul/ rsync://contint2002.wikimedia.org/ci--var-lib-zuul-

Note that the -n dry-run flag is used here; for the actual sync we have to run the commands without it.
@Dzahn recommended using a bandwidth limit so as not to saturate the 1G port.

/srv/jenkins takes the most time in the dry run. The dry run estimates an additional 95MB to be transmitted, out of a total size of 155GB.
The other two folders, /var/lib/jenkins and /var/lib/zuul, are small (2GB and 80kB in total).

Regarding disabling the services: I spoke with @Dzahn and we will most probably disable the services (jenkins, zuul, gearman) manually during the maintenance window while puppet is disabled. After that we have to find the right order and combination for merging the switchover changes https://gerrit.wikimedia.org/r/q/topic:contint2002 to make sure the services are started on contint2002 and stay stopped on contint2001.

Thanks for the rsync commands!

Some adjustments:

  • delete files on the destination with: --delete-delay
  • swap the very verbose -v with the fancier --info=progress2.
  • We only need /var/lib/zuul/times:
    • rsync --info=progress2 -apn --delete-delay --bwlimit=50000 /var/lib/zuul/times/ rsync://contint2002.wikimedia.org/ci--var-lib-zuul-/times

The bandwidth limit at 50000 would mean about an hour to transfer the 200GB, if I get it right. The transfer can then be pre-warmed ahead of the maintenance window. I think most of the time will be spent in disk I/O / crawling the disk.
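Back-of-the-envelope check for that estimate (assuming --bwlimit is in KB/s, rsync's default unit):

# 50000 KB/s ≈ 50 MB/s; 200 GB is 200,000 MB
echo $(( 200000 / 50 )) seconds    # 4000 s, i.e. roughly an hour plus overhead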

For /srv/jenkins I get a noticeable speed up by using compression (-zz).

I have done an initial transfer of /srv/jenkins since I wanted a rough estimate of how long it would take, and I surely wanted to tweak the rsync parameters.

The transfer took maybe 2 or 3 hours and was capped at roughly 10MB/second:

contint2001_rsync_srv_jenkins.png (299×916 px, 37 KB)

The reason is that the Jenkins builds are mostly small files, so there is a lot of overhead opening/reading/closing each of them. I have thus removed the --bwlimit parameter.

I tried enabling compression (-zz) but I don't think it matters. There were a few huge raw text log files which are not compressed (integration/config #936029).

I don't think there will be any delta transfer since once a file has been created by Jenkins it is unlikely to change, though the permalink files for each job do get altered. They are small enough that we can just transfer them whole, and thus I have added --whole-file, which disables the delta check.

Which gives me:

rsync -n -ap --whole-file --delete-delay --info=progress2 /srv/jenkins/ rsync://contint2002.wikimedia.org/ci--srv-jenkins-

That ran in ~ 4 minutes :-]

Ran it again on Friday morning; it took 2 minutes 30 seconds:

$ date; time sudo rsync -ap --whole-file --delete-delay --info=progress2 /srv/jenkins/ rsync://contint2002.wikimedia.org/ci--srv-jenkins-; date;
Fri 07 Jul 2023 08:36:18 AM UTC
  9,306,733,533   6%   59.67MB/s    0:02:28 (xfr#77548, to-chk=0/1562935)   

real	2m31.078s
user	0m35.668s
sys	0m13.886s
Fri 07 Jul 2023 08:38:49 AM UTC

For switching the services via Puppet, that is nowadays done via a single Hiera variable (introduced by Puppet change 908232):

hieradata/role/common/ci.yaml
profile::ci::manager_host: contint2001.wikimedia.org

Jelto wrote the patch: https://gerrit.wikimedia.org/r/c/operations/puppet/+/935919
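Once that patch merges, the variable presumably just flips to the new host (a sketch mirroring the snippet above; the actual change is on Gerrit):

hieradata/role/common/ci.yaml
profile::ci::manager_host: contint2002.wikimedia.org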

Thanks for testing and running rsync!

I created a rough checklist in the task description. Feel free to edit if I missed anything. I'm not sure if we use the --whole-file --delete-delay for all three rsync commands.

Change 936266 had a related patch set uploaded (by Hashar; author: Hashar):

[operations/puppet@production] contint: move zuul-merger from contint2001 to contint2002

https://gerrit.wikimedia.org/r/936266

Thanks for testing and running rsync!

I created a rough checklist in the task description. Feel free to edit if I missed anything.

Awesome!

I will make some changes to have the switch a little bit more driven by Puppet. There are configuration files which need to be updated, and Puppet will bring the services down/up for us.

I'm not sure if we use the --whole-file --delete-delay for all three rsync commands.

We definitely need --delete to remove files that are no longer present. For --whole-file I don't think it changes anything since the files are small in practice, or never change once written (Jenkins build results). Intuitively it sounds cheaper to transfer them entirely and save the time of doing the delta check.

Fun Friday finding: neither contint1002 (recently moved) nor contint2002 was allowed to ssh to the integration project due to a security rule. I have updated the rules :)

Change 936266 merged by Jelto:

[operations/puppet@production] contint: move zuul-merger from contint2001 to contint2002

https://gerrit.wikimedia.org/r/936266

Icinga downtime and Alertmanager silence (ID=fb9b83f1-475c-4737-a872-7868377e05ee) set by jelto@cumin1001 for 1:00:00 on 1 host(s) and their services with reason: Switch contint hosts for hardware replacement

contint2001.wikimedia.org

Icinga downtime and Alertmanager silence (ID=af763fea-db6d-494f-8c4c-8139c0ceab0c) set by jelto@cumin1001 for 1:00:00 on 1 host(s) and their services with reason: Switch contint hosts for hardware replacement

contint2002.wikimedia.org

Change 933196 merged by Jelto:

[operations/dns@master] switch contint.wikimedia.org from contint2001 to contint2002

https://gerrit.wikimedia.org/r/933196

Change 867705 merged by Jelto:

[operations/puppet@production] ci/zuul: switch gearman server from contint2001 to contint2002

https://gerrit.wikimedia.org/r/867705

Change 935919 merged by Jelto:

[operations/puppet@production] ci/zuul: set contint2002 as the active ci::manager_host

https://gerrit.wikimedia.org/r/935919

Mentioned in SAL (#wikimedia-releng) [2023-07-11T08:45:28Z] <hashar> integration: removed security rules allowing contint2001 [208.80.153.15] on integration, puppet-diffs and deployment-prep: services got moved to contint2002 | T324659

Change 867712 merged by Jelto:

[operations/puppet@production] ci: make contint2002 the new rsync source, remove contint2001

https://gerrit.wikimedia.org/r/867712

Just because I first saw that error after CI came back from maintenance: do you think there’s any chance that this caused T341556: CentralAuthExtensionJsonTest::testHookHandler with data set #11 ('securepoll') failing in Wikidata.org CI?

Just because I first saw that error after CI came back from maintenance: do you think there’s any chance that this caused T341556: CentralAuthExtensionJsonTest::testHookHandler with data set #11 ('securepoll') failing in Wikidata.org CI?

Possibly but I find it unlikely. I will follow up on T341556.

Mentioned in SAL (#wikimedia-operations) [2023-07-11T09:43:56Z] <hashar> Updating Zuul configuration which was stale at a version from March 29th after the switchover from contint2001 to contint2002 | T324659 T341556

hashar added a subscriber: daniel.

As far as I can tell, the services were successfully switched over from contint2001 to contint2002. I missed updating the Zuul scheduler configuration, which caused a few dozen builds to break (T341556), but that got caught and fixed immediately after being found.

The leftovers for later are:

Thanks @daniel for all the preliminary steps done on the CI front and @Jelto for the excellent initial runbook and pairing the switch today :-]

Change 937068 had a related patch set uploaded (by Hashar; author: Hashar):

[integration/config@master] Switch Jenkins maintenance jobs to contint2002

https://gerrit.wikimedia.org/r/937068

Change 937069 had a related patch set uploaded (by Hashar; author: Hashar):

[integration/config@master] jjb: switch jobs from contint2001 to contint2002

https://gerrit.wikimedia.org/r/937069

Change 937068 merged by jenkins-bot:

[integration/config@master] Switch Jenkins maintenance jobs to contint2002

https://gerrit.wikimedia.org/r/937068

Mentioned in SAL (#wikimedia-releng) [2023-07-11T11:12:21Z] <hashar> integration: applied label contint2001 to agent contint2002 to let it process pending jobs after the host switch over (T324659). Jobs updated by https://gerrit.wikimedia.org/r/937068 and https://gerrit.wikimedia.org/r/937069

Change 937069 merged by jenkins-bot:

[integration/config@master] jjb: switch jobs from contint2001 to contint2002

https://gerrit.wikimedia.org/r/937069