@dpifke upgraded the webperf hosts in deployment-prep to Bullseye; the next step is to migrate the production instances.
Upgrade procedure for Coal hosts:
- Test the role with Bullseye in deployment-prep and fix up any changes needed because of the OS upgrade (done by @dpifke)
- Create new Bullseye VMs, initially with role(insetup)
- Add the new servers to the Kafka Ferm config (https://gerrit.wikimedia.org/r/c/operations/puppet/+/785117)
- Mask navtiming and coal on the old hosts, directly followed by
- Merge a patch to enable the webperf::processors_and_site role on the new hosts and enable them in the Scap config (https://gerrit.wikimedia.org/r/c/operations/puppet/+/785115). It is best to reboot the servers after applying the role, to rule out any corner cases and ensure everything gets started in the correct order
- Fail over the "performance" discovery records to point to the new hosts (https://gerrit.wikimedia.org/r/c/operations/dns/+/789079)
- Switch old servers to role(insetup) until their eventual decom (can wait a week just in case) (https://gerrit.wikimedia.org/r/c/operations/puppet/+/785116)
- Remove the old servers from the Kafka Ferm config
- Remove the old VMs
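
The masking step in the list above could be prepared along these lines. This is a minimal sketch, not the commands that were actually run: the unit names navtiming.service and coal.service are assumptions (the procedure only says "navtiming and coal"), and mask_commands is a hypothetical helper. It only prints the commands so they can be reviewed before being pasted on an old host.

```shell
#!/bin/sh
# Print (do not run) one "systemctl mask" command per unit given as argument.
# `mask --now` masks the unit and also stops it if it is running.
mask_commands() {
    for unit in "$@"; do
        echo "sudo systemctl mask --now $unit"
    done
}

# Hypothetical unit names; check the real ones on the host first,
# for example with `systemctl list-units`.
mask_commands navtiming.service coal.service
```

Printing rather than executing keeps the sketch safe to run anywhere; the role patch for the new hosts should be merged directly after the printed commands have been run on the old hosts.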
Upgrade procedure for Arclamp hosts:
- Create new Bullseye VMs (webperf1004/webperf2004)
- Merge rsync config to allow migration of Xenon data from old to new server: https://gerrit.wikimedia.org/r/c/operations/puppet/+/802752/
- Disable Puppet and mask the excimer-k8s-log.service, excimer-k8s-wall-log.service, excimer-log.service, excimer-wall-log.service services on webperf1002 and webperf2002
- Mask the arclamp_generate_metrics.timer, arclamp_compress_logs.timer, arclamp_generate_svgs.timer timers on webperf1002 and webperf2002
- Rsync /srv/xenon from webperf1002 to webperf1004 and from webperf2002 to webperf2004, running the command on the respective new host: rsync -avz rsync://webperf1002.eqiad.wmnet:/xenon_migrate/* /srv/xenon (on webperf1004) and rsync -avz rsync://webperf2002.codfw.wmnet:/xenon_migrate/* /srv/xenon (on webperf2004)
- Enable the arclamp role on webperf1004/webperf2004: https://gerrit.wikimedia.org/r/c/operations/puppet/+/802749
- Designate webperf1004 as the new primary arclamp node: https://gerrit.wikimedia.org/r/c/operations/puppet/+/802750 and https://gerrit.wikimedia.org/r/c/operations/puppet/+/804333
- Remove webperf1002/webperf2002
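
The "disable Puppet and mask" steps in the list above could be scripted roughly as follows, to be run on webperf1002 and webperf2002. This is a sketch under assumptions: it uses the stock `puppet agent --disable` command (the site may prefer its own wrapper), the disable message is made up, and the `run` and `quiesce_arclamp` helpers are hypothetical. By default it only prints the commands; setting DRY_RUN=0 executes them.

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        # Prints the command words; shell quoting is not preserved here.
        echo "$*"
    else
        "$@"
    fi
}

# Units named in the procedure above.
SERVICES="excimer-k8s-log.service excimer-k8s-wall-log.service excimer-log.service excimer-wall-log.service"
TIMERS="arclamp_generate_metrics.timer arclamp_compress_logs.timer arclamp_generate_svgs.timer"

quiesce_arclamp() {
    # Assumption: plain puppet CLI; the message text is illustrative only.
    run sudo puppet agent --disable "arclamp migration to Bullseye"
    for unit in $SERVICES $TIMERS; do
        # `mask --now` masks the unit and stops it if it is running.
        run sudo systemctl mask --now "$unit"
    done
}

quiesce_arclamp
```

Disabling Puppet first matters because a Puppet run could otherwise restart the units between masking and the rsync, which would change /srv/xenon while it is being copied.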