Replace deployment-fluorine02 with a Buster host
Closed, Resolved, Public

Description

deployment-fluorine02.deployment-prep.eqiad.wmflabs is a Jessie host. It should be replaced with a Buster host.

Event Timeline

Majavah triaged this task as Medium priority. Mar 4 2021, 7:55 AM
Majavah created this task.
Restricted Application added a subscriber: Aklapper.

I created deployment-mwlog01 as a Buster host. fluorine02 is a disk80 host, but the log partition holds very little data:

/dev/mapper/vd-second--local--disk   60G  656M   56G   2% /srv

I made a 2G Cinder volume for that. Given that the current logs are only about 650M, it should be large enough. I also mounted it at /srv on the new VM.
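The sizing reasoning above can be checked with a small script. This is only a minimal sketch: the helper names (`parse_size`, `parse_df_line`, `fits`) and the 50% headroom threshold are illustrative assumptions, not something used in the actual migration.

```python
# Illustrative sketch: parse one `df -h` line and check whether the used
# space would fit on a smaller volume with some headroom left over.
# All function names and the headroom factor are assumptions.

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a df -h style size such as '656M' or '60G' to bytes."""
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(float(text[:-1]) * UNITS[suffix])
    return int(text)  # plain number without a unit suffix


def parse_df_line(line):
    """Split a single df -h output line into named fields."""
    device, size, used, avail, _percent, mount = line.split()
    return {
        "device": device,
        "size": parse_size(size),
        "used": parse_size(used),
        "avail": parse_size(avail),
        "mount": mount,
    }


def fits(used_bytes, volume_bytes, headroom=0.5):
    """Return True if the used space leaves `headroom` of the volume free."""
    return used_bytes <= volume_bytes * (1 - headroom)


line = "/dev/mapper/vd-second--local--disk   60G  656M   56G   2% /srv"
info = parse_df_line(line)
print(info["mount"], fits(info["used"], parse_size("2G")))
```

With the numbers from the old host, 656M of logs stays under half of a 2G volume, which matches the estimate that the new volume should be large enough.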

I tried to apply the udp2log role to the new VM, but it's failing because the udplog package is not available for Buster yet. Since prod SRE is currently updating mwlog* hosts to Buster (T224565: Migrate mwlog/udp2log servers to Buster), I'm hopeful that it will be fixed fairly soon. For now the VM has no roles applied, to avoid puppet errors.

Please file a dedicated task for this and tag SRE, no reason it can't be done earlier.

Done, T276421: Package udplog for Buster. Thanks.

Change 668338 had a related patch set uploaded (by Majavah; owner: Majavah):
[operations/mediawiki-config@master] [DNM] Switch to deployment-mwlog01

https://gerrit.wikimedia.org/r/668338

Now that udplog is available on Buster, next I'll try live hacking https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/668338/ and check that logs are saved to deployment-mwlog01 and delivered to logstash-beta. If it works, I'll get that patch properly merged.

Mentioned in SAL (#wikimedia-releng) [2021-03-04T11:02:21Z] <Majavah> live hacking https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/668338/ on deployment-deploy01 to test new deployment-mwlog01 ref T276419

Live hacks reverted. I confirmed that mwlog01 receives events and logs are still flowing to logstash-beta.wmflabs.org. Scheduled the mediawiki-config patch for today's EU backport window.
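A check that a log host receives events could be sketched as sending one UDP datagram and looking for it on the receiving side. The sketch below is not what was run for this task: the `<channel> <message>` line framing, the function names, and the channel name are assumptions, and the demo only talks to a loopback socket because the task does not state the real host or port.

```python
# Illustrative sketch: send a single UDP test line and read it back on a
# local socket. Framing and names are assumptions, not taken from the task.
import socket


def send_test_event(host, port, channel, message):
    """Send one datagram formatted as '<channel> <message>\\n'."""
    payload = f"{channel} {message}\n".encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


def loopback_demo():
    """Bind a throwaway local receiver, send one event, return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2)
        port = receiver.getsockname()[1]
        send_test_event("127.0.0.1", port, "test-channel", "hello T276419")
        data, _addr = receiver.recvfrom(4096)
        return data.decode("utf-8")


print(loopback_demo().rstrip())
```

Against a real log host, `send_test_event` would be pointed at that host and its listening port, and receipt would be confirmed by checking the log files on the host rather than a local socket.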

Change 668338 merged by jenkins-bot:
[operations/mediawiki-config@master] betacluster: Switch udp2log to deployment-mwlog01

https://gerrit.wikimedia.org/r/668338

Change 668384 had a related patch set uploaded (by Majavah; owner: Majavah):
[operations/puppet@production] scap: Swap betacluster udplog deployment-mwlog01

https://gerrit.wikimedia.org/r/668384

Change 668386 had a related patch set uploaded (by Majavah; owner: Majavah):
[operations/puppet@production] udp2log: Swap beta cluster to deployment-mwlog01

https://gerrit.wikimedia.org/r/668386

Change 668386 merged by Jbond:
[operations/puppet@production] udp2log: Swap beta cluster to deployment-mwlog01

https://gerrit.wikimedia.org/r/668386

Change 668384 merged by Jbond:
[operations/puppet@production] scap: Swap betacluster udplog deployment-mwlog01

https://gerrit.wikimedia.org/r/668384

Mentioned in SAL (#wikimedia-releng) [2021-03-04T12:38:10Z] <Majavah> git rebase origin/production on deployment-puppetmaster04 to update few settings for T276419

Change 668392 had a related patch set uploaded (by Majavah; owner: Majavah):
[operations/puppet@production] arclamp: Switch beta cluster redis to deployment-webperf01

https://gerrit.wikimedia.org/r/668392

Change 668392 merged by Muehlenhoff:
[operations/puppet@production] arclamp: Switch beta cluster redis to deployment-webperf01

https://gerrit.wikimedia.org/r/668392

Mentioned in SAL (#wikimedia-releng) [2021-03-04T13:18:14Z] <Majavah> shutdown deployment-fluorine02 for a scream test for T276419, I believe everything has been moved to deployment-mwlog01

Mentioned in SAL (#wikimedia-releng) [2021-03-11T16:57:58Z] <Majavah> copy a tarball of deployment-fluorine02 /home to deployment-mwlog01 root home dir, delete deployment-fluorine02 T276419