
Ensure WDQS stack works on Bullseye
Closed, ResolvedPublic5 Estimated Story Points

Description

Per parent ticket, we must migrate away from Buster by September 2023. Creating this ticket to:

  • Test operation of the current stack on a Bullseye host
  • If necessary, update Puppet and other parts of the stack to ensure the WCQS/WDQS stack works on newer versions of Debian.

Event Timeline

There are a very large number of changes, so older changes are hidden.

Mentioned in SAL (#wikimedia-operations) [2023-04-20T19:16:40Z] <inflatador> bking@cumin1001 depool wdqs2012.codfw.wmnet for data xfer T331300

Mentioned in SAL (#wikimedia-operations) [2023-04-20T21:18:27Z] <inflatador> bking@cumin1001 depool wdqs2009 for data xfer T331300

Mentioned in SAL (#wikimedia-operations) [2023-04-20T21:22:44Z] <inflatador> bking@cumin1001 repool wdqs2012 T331300

Icinga downtime and Alertmanager silence (ID=09a1e24c-01d3-42a5-8179-085a01f32aae) set by bking@cumin1001 for 1 day, 0:00:00 on 1 host(s) and their services with reason: attempting WDQS stack on bullseye

wdqs2009.codfw.wmnet

Icinga downtime and Alertmanager silence (ID=53c1ca57-f03f-405d-8e63-67add663e004) set by bking@cumin1001 for 2 days, 0:00:00 on 1 host(s) and their services with reason: attempting WDQS stack on bullseye

wdqs2006.codfw.wmnet

Icinga downtime and Alertmanager silence (ID=45b4cbdf-ff0a-48b7-9921-640f9106f396) set by bking@cumin1001 for 2 days, 0:00:00 on 1 host(s) and their services with reason: attempting WDQS stack on bullseye

wdqs2012.codfw.wmnet

We're examining wdqs2022, where we have completed the transfer of /srv/wdqs/, yet Blazegraph is not starting.

/usr/lib/libjvmquake.so is failing to load, which appears to be preventing wdqs-blazegraph from starting:

Apr 26 21:13:44 wdqs2022 wdqs-blazegraph[1710982]: (jvmquake) using options: threshold=[300s],runtime_weight=[5:1],action=[JVM OOM]
Apr 26 21:13:44 wdqs2022 wdqs-blazegraph[1710982]: agent library failed to init: /usr/lib/libjvmquake.so

Here's the package info for wdqs2022 (bullseye) compared to wdqs2004 (buster):

ryankemper@wdqs2004:~$ java -version
openjdk version "1.8.0_362"
OpenJDK Runtime Environment (build 1.8.0_362-8u362-ga-4~deb10u1-b09)
OpenJDK 64-Bit Server VM (build 25.362-b09, mixed mode)

ryankemper@wdqs2004:~$ dpkg -l | grep jvmquake
ii  jvmquake                             1.0.1-1+deb10u1              amd64        A JVMTI agent that kills your JVM when things go sideways
ryankemper@wdqs2022:~$ java -version
openjdk version "1.8.0_362"
OpenJDK Runtime Environment (build 1.8.0_362-8u362-ga-4~deb11u1-b09)
OpenJDK 64-Bit Server VM (build 25.362-b09, mixed mode)

ryankemper@wdqs2022:~$ dpkg -l | grep jvmquake
ii  jvmquake                             1.0.1-1+deb11u1                amd64        A JVMTI agent that kills your JVM when things go sideways
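A small helper (a sketch, not part of the stack) makes the version suffixes above easier to compare across hosts; the +deb10u1 vs +deb11u1 difference is the tell:

```shell
# Extract a package's version from `dpkg -l` output; the +deb10u1 vs
# +deb11u1 suffix shows which Debian release the package was built for.
pkg_version() {
  awk -v pkg="$1" '$1 == "ii" && $2 == pkg { print $3 }'
}
# On a real host: dpkg -l | pkg_version jvmquake
```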
bking added a subscriber: dcausse.

Here's what @dcausse and I did at today's pairing session:

  • Realized that the jvmquake package for Bullseye is built against Java 11, whereas we need Java 8
  • Removed the Java-11 jvmquake packages from the main bullseye-wikimedia Debian repo and copied the Java-8 jvmquake packages from main wikimedia-buster to the main wikimedia-bullseye repo.
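For the record, the repo-side steps above look roughly like this, assuming the apt repo is managed with reprepro (distribution and component names are illustrative, not verified against the real repo host); repo_cmds is a hypothetical helper that just prints the commands to run:

```shell
# Sketch only: print the reprepro commands for removing the Java-11 build
# of jvmquake from bullseye and copying the Java-8 build over from buster.
# Distribution/component names are assumptions, not verified.
repo_cmds() {
  echo 'reprepro -C main remove bullseye-wikimedia jvmquake'
  echo 'reprepro -C main copy bullseye-wikimedia buster-wikimedia jvmquake'
}
repo_cmds
```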

Our next attempt to start wdqs-blazegraph.service on wdqs2022 met with a new error:
Error: Could not find or load main class org.eclipse.jetty.runner.Runner

We believe that this is caused by an incomplete scap deploy, so fixing the deploy is our next step.

Gehel set the point value for this task to 5. May 1 2023, 3:26 PM

Icinga downtime and Alertmanager silence (ID=51e1d4cd-32ce-4dfa-ae82-91dd8c3e940b) set by bking@cumin1001 for 12 days, 0:00:00 on 1 host(s) and their services with reason: attempting WDQS stack on bullseye

wdqs2022.codfw.wmnet

Change 914381 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] wdqs: add wdqs2022 to conftool

https://gerrit.wikimedia.org/r/914381

Change 914381 merged by Bking:

[operations/puppet@production] wdqs: add wdqs2022 to conftool

https://gerrit.wikimedia.org/r/914381

Change 914384 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] wdqs: Add wdqs2022 as scap target

https://gerrit.wikimedia.org/r/914384

Change 914384 merged by Bking:

[operations/puppet@production] wdqs: Add wdqs2022 as scap target

https://gerrit.wikimedia.org/r/914384

Icinga downtime and Alertmanager silence (ID=9386abfe-15eb-44c9-befd-5bcc42b22df7) set by bking@cumin1001 for 14 days, 0:00:00 on 1 host(s) and their services with reason: attempting WDQS stack on bullseye

wdqs2022.codfw.wmnet

We were unable to deploy because the git-fat package was not available for Bullseye. We copied it from the Buster repo using these instructions.

After installing git-fat and re-running Puppet, we were able to get the wdqs services to start cleanly. Our test queries also passed.

At this point, I believe we've confirmed that the WDQS stack runs on Bullseye. But we do want to leave this open until @dcausse returns next week so he can validate as well.

Cookbook cookbooks.sre.hosts.reimage was started by bking@cumin1001 for host wdqs2021.codfw.wmnet with OS bullseye

Change 918597 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] [WIP]wdqs: Activate wdqs2021

https://gerrit.wikimedia.org/r/918597

Cookbook cookbooks.sre.hosts.reimage started by bking@cumin1001 for host wdqs2021.codfw.wmnet with OS bullseye executed with errors:

  • wdqs2021 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by bking@cumin1001 for host wdqs2021.codfw.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by bking@cumin1001 for host wdqs2021.codfw.wmnet with OS buster executed with errors:

  • wdqs2021 (FAIL)
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by bking@cumin1001 for host wdqs2021.codfw.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by bking@cumin1001 for host wdqs2021.codfw.wmnet with OS buster completed:

  • wdqs2021 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Unable to disable Puppet, the host may have been unreachable
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202305102152_bking_3857978_wdqs2021.out
    • Checked BIOS boot parameters are back to normal
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Change 918597 merged by Ryan Kemper:

[operations/puppet@production] wdqs: Activate wdqs2021

https://gerrit.wikimedia.org/r/918597

We've noticed that on the bullseye hosts, the blazegraph prometheus exporters are in a restart loop, most likely because the newer Python version breaks the current implementation of our exporter script.

Python 3 version differs between OS versions:

(buster)

ryankemper@wdqs1010:~$ python3 -V
Python 3.7.3

versus
(bullseye)

ryankemper@wdqs2022:~$ python3 -V
Python 3.9.2
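Since the exporter breakage appears version-dependent, a small guard (a sketch, not existing code) could make the mismatch explicit before the exporter does real work:

```shell
# Print the running interpreter's minor version (3.7 on buster, 3.9 on
# bullseye); an exporter script could branch or bail based on this value.
py_minor() { python3 -c 'import sys; print(sys.version_info.minor)'; }
py_minor
```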
Gehel renamed this task from Ensure WCQS/WDQS stack works on Bullseye and later to Ensure WCQS/WDQS stack works on Bullseye. May 11 2023, 6:59 PM

Icinga downtime and Alertmanager silence (ID=2ba9c32a-8bbc-4b94-9ca0-1fbeeaee45e7) set by bking@cumin1001 for 14 days, 0:00:00 on 1 host(s) and their services with reason: attempting WDQS stack on bullseye

wdqs2021.codfw.wmnet

We noticed errors deploying the latest wdqs version to Bullseye:

  File "/var/lib/scap/scap/lib/python3.9/site-packages/scap/runcmd.py", line 91, in gitcmd
    return _runcmd(["git", subcommand] + list(args), **kwargs)
  File "/var/lib/scap/scap/lib/python3.9/site-packages/scap/runcmd.py", line 78, in _runcmd
    raise FailedCommand(argv, p.returncode, stdout, stderr)
scap.runcmd.FailedCommand: Command 'git fat init' failed with exit code 1; stdout:

19:08:57 [wdqs2022.codfw.wmnet] deploy-local failed: <FailedCommand> {'exitcode': 1, 'stdout': '', 'stderr': "git: 'fat' is not a git command. See 'git --help'.\n\nThe most similar commands are\n\tfetch\n\tmktag\n\tstage\n\tstash\n\ttag\n\tvar\n"}

As previously mentioned, git-fat was missing from wdqs2022. To fix this issue, we reinstalled it, then ran git fat init and git fat pull from the WDQS deploy directory on wdqs2022. git-fat uses Python 2, and Puppet is configured to remove Python-2-related packages, so we'll have to find a way to work around this.
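The manual recovery above can be sketched as follows (the deploy path is the one from this task; has_git_fat is a hypothetical helper, not existing tooling):

```shell
# Check that the git-fat binary is on PATH before attempting the deploy.
has_git_fat() { command -v git-fat >/dev/null 2>&1 && echo yes || echo no; }
has_git_fat
# If "yes", re-initialize fat objects in the deploy tree:
#   cd /srv/deployment/wdqs/wdqs && git fat init && git fat pull
```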

Note that the Search Platform team has already been asked to replace git-fat with git-lfs. I'm not sure how quickly that will happen, so we probably want to update our Puppet code to allow Python 2 on our wdqs Bullseye hosts.

Change 920365 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] query_service: Permit python2 on bullseye

https://gerrit.wikimedia.org/r/920365

Icinga downtime and Alertmanager silence (ID=edd7680d-1c73-4a29-8687-f2061fd84b57) set by bking@cumin1001 for 12 days, 0:00:00 on 1 host(s) and their services with reason: attempting WDQS stack on bullseye

wdqs2012.codfw.wmnet

Change 920365 merged by Ryan Kemper:

[operations/puppet@production] query_service: Permit python2 on bullseye

https://gerrit.wikimedia.org/r/920365

Icinga downtime and Alertmanager silence (ID=389b7357-bed5-4b2f-8790-8d67f9ff7609) set by bking@cumin1001 for 20 days, 0:00:00 on 1 host(s) and their services with reason: attempting WDQS stack on bullseye

wdqs2021.codfw.wmnet

Icinga downtime and Alertmanager silence (ID=6de74cbd-41f5-48dd-9b28-0f5e924b361a) set by bking@cumin1001 for 20 days, 0:00:00 on 1 host(s) and their services with reason: attempting WDQS stack on bullseye

wdqs2012.codfw.wmnet

Icinga downtime and Alertmanager silence (ID=e631aacc-d9c4-4fd4-a12f-2ac8dc01ccf2) set by bking@cumin1001 for 20 days, 0:00:00 on 1 host(s) and their services with reason: attempting WDQS stack on bullseye

wdqs1016.eqiad.wmnet

Icinga downtime and Alertmanager silence (ID=e90a333a-c22a-4294-bcc7-6a7665a57f08) set by bking@cumin1001 for 20 days, 0:00:00 on 1 host(s) and their services with reason: attempting WDQS stack on bullseye

wdqs1016.eqiad.wmnet

Icinga downtime and Alertmanager silence (ID=60cdd970-8441-43c9-b77b-84c7443210ea) set by bking@cumin1001 for 20 days, 0:00:00 on 1 host(s) and their services with reason: attempting WDQS stack on bullseye

wdqs2012.codfw.wmnet

Icinga downtime and Alertmanager silence (ID=daffcf81-a026-4a44-bbd6-5cb2dda2365c) set by bking@cumin1001 for 20 days, 0:00:00 on 1 host(s) and their services with reason: attempting WDQS stack on bullseye

wdqs2012.codfw.wmnet

Icinga downtime and Alertmanager silence (ID=72d89466-5f1c-4a18-8dd7-27b6fb931b75) set by bking@cumin1001 for 20 days, 0:00:00 on 1 host(s) and their services with reason: attempting WDQS stack on bullseye

wdqs2021.codfw.wmnet

Icinga downtime and Alertmanager silence (ID=f835fbe1-5ecd-477d-9755-a6556d5b9287) set by bking@cumin1001 for 20 days, 0:00:00 on 1 host(s) and their services with reason: attempting WDQS stack on bullseye

wdqs2021.codfw.wmnet

Cookbook cookbooks.sre.hosts.reimage was started by bking@cumin1001 for host wdqs2021.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by bking@cumin1001 for host wdqs2021.codfw.wmnet with OS bullseye executed with errors:

  • wdqs2021 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by bking@cumin1001 for host wdqs2021.codfw.wmnet with OS bullseye

Icinga downtime and Alertmanager silence (ID=f4cd972b-d9a2-4ac2-866f-e100023e5f8d) set by bking@cumin1001 for 20 days, 0:00:00 on 1 host(s) and their services with reason: attempting WDQS stack on bullseye

wdqs2022.codfw.wmnet

We're still having problems with our first Bullseye host, wdqs2022. After successfully* transferring the wdqs data via the data-transfer cookbook, the wdqs-categories and wdqs-blazegraph services start, but wdqs-updater.service fails. The unit file calls a bash script with arguments as follows:

/bin/bash /srv/deployment/wdqs/wdqs/runStreamingUpdater.sh -n wdq -- --brokers kafka-main2001.codfw.wmnet:9092,kafka-main2002.codfw.wmnet:9092,kafka-main2003.codfw.wmnet:9092,kafka-main2004.codfw.wmnet:9092,kafka-main2005.codfw.wmnet:9092 --consumerGroup wdqs2022 --topic codfw.rdf-streaming-updater.mutation --batchSize 250

I ran this command manually and captured a stacktrace here. Running the test.sh script from the rdf repo also returns a 503, suggesting that the data transferred via the cookbook might be corrupt.

*"successfully," as in, "the cookbook ran without errors"

Looking at P49427, it seems that there is an issue with the logging configuration. This should not prevent the system from starting, but might affect logging.

02:05:41,152 |-ERROR in ch.qos.logback.core.model.processor.ImplicitModelHandler - Could not create component [filter] of type [org.wikidata.query.rdf.common.log.PerLoggerThrottler] java.lang.ClassNotFoundException: org.wikidata.query.rdf.common.log.PerLoggerThrottler

The message above indicates that some of the logging configuration references classes that should be available in the main Blazegraph binaries, but not in the updater (we need some additional filtering of logs in Blazegraph, as it sometimes gets too verbose, but that filtering isn't needed in the updater).

Looking at the Blazegraph logs (/var/log/wdqs/wdqs-blazegraph.log), there seems to be an issue with file permissions (or file existence?) on /srv/wdqs/wikidata.jnl:

01:33:14.910 [main] WARN  o.eclipse.jetty.webapp.WebAppContext - Failed startup of context o.e.j.w.WebAppContext@5d908d47{Bigdata,/bigdata,file:///tmp/jetty-localhost-9999-blazegraph-service-0.3.124.war-_bigdata-any-5007834296220894734.dir/webapp/,UNAVAILABLE}{file:///srv/deployment/wdqs/wdqs-cache/revs/41174d50f967ef9bf3e3d956059c075790561ed0/blazegraph-service-0.3.124.war} 
java.io.FileNotFoundException: /srv/wdqs/wikidata.jnl (Permission denied)
        at java.io.RandomAccessFile.open0(Native Method)
Wrapped by: java.lang.RuntimeException: file=/srv/wdqs/wikidata.jnl
        at com.bigdata.journal.FileMetadata.<init>(FileMetadata.java:1144)
Wrapped by: java.lang.RuntimeException: java.lang.RuntimeException: file=/srv/wdqs/wikidata.jnl
        at com.bigdata.rdf.sail.webapp.BigdataRDFServletContextListener.openIndexManager(BigdataRDFServletContextListener.java:816)
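A quick diagnosis sketch for this symptom (perms_of is a hypothetical helper; compare its output against the User=/Group= settings in the wdqs-blazegraph unit file):

```shell
# Report owner:group and mode of the Blazegraph journal; prints "missing"
# if the file does not exist at all.
perms_of() { stat -c '%U:%G %a' "$1" 2>/dev/null || echo missing; }
perms_of /srv/wdqs/wikidata.jnl
```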

Cookbook cookbooks.sre.hosts.reimage started by bking@cumin1001 for host wdqs2021.codfw.wmnet with OS bullseye executed with errors:

  • wdqs2021 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202306140150_bking_1429183_wdqs2021.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • The reimage failed, see the cookbook logs for the details

Icinga downtime and Alertmanager silence (ID=583ec76e-3af3-444b-8dcc-25a81545f149) set by bking@cumin1001 for 20 days, 0:00:00 on 1 host(s) and their services with reason: attempting WDQS stack on bullseye

wdqs2021.codfw.wmnet

Leaving some notes before I step out for the weekend and forget everything. The Bullseye hosts are still not coming up without manual intervention beyond the data-transfer.

  • Puppet is not installing git-fat (required for deployment), but it's not removing it after a manual install, either. My best guess at this point is that when Puppet chokes on the prometheus exporters, it doesn't finish installing its packages. A weak theory, but the best one I have at the moment.
  • Scap deploys targeting a single host (tested with wdqs2020) succeed, but the service can't start. Manually invoking /bin/bash /srv/deployment/wdqs/wdqs/runBlazegraph.sh -f /etc/wdqs/RWStore.categories.properties shows Could not find or load main class org.eclipse.jetty.runner.Runner, and the /srv/deployment/wdqs directory is too small. Manually deleting the entire contents of /srv/deployment/wdqs/ and re-deploying via scap seems to fix the issue.
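The "deploy directory is too small" check above can be sketched like this (dir_kb is a hypothetical helper; the path is the one from this task):

```shell
# Print a directory tree's size in KiB (0 if the path is missing); a
# suspiciously small value suggests a truncated scap deploy.
dir_kb() { { du -sk "$1" 2>/dev/null || echo "0 -"; } | awk '{ print $1 + 0; exit }'; }
dir_kb /srv/deployment/wdqs
# If too small: clear the contents of /srv/deployment/wdqs/ and re-deploy via scap
```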

The manual steps (see above) need to be documented on wiki before we close this task.

bking renamed this task from Ensure WCQS/WDQS stack works on Bullseye to Ensure WDQS stack works on Bullseye. Jul 25 2023, 9:02 PM

Documentation has been updated, but there's an important piece missing: we never verified WCQS. Creating a separate ticket for that issue.

Mentioned in SAL (#wikimedia-operations) [2023-08-08T19:28:02Z] <bking@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wcqs[1001-1003].eqiad.wmnet with reason: T331300

Mentioned in SAL (#wikimedia-operations) [2023-08-08T19:28:18Z] <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wcqs[1001-1003].eqiad.wmnet with reason: T331300

Change 947928 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] query_service: install git-fat

https://gerrit.wikimedia.org/r/947928

Change 947928 merged by Bking:

[operations/puppet@production] query_service: install git-fat

https://gerrit.wikimedia.org/r/947928

Change 947930 had a related patch set uploaded (by Bking; author: Bking):

[operations/cookbooks@master] wdqs.data-transfer: ensure data_loaded file is created

https://gerrit.wikimedia.org/r/947930

Per pairing discussion with Ryan, we believe this work is complete. The actual migration work continues in T343124.