add bullseye support to deployment server puppet role - upgrade deployment server in devtools
Closed, Resolved (Public)

Description

The puppet role deployment_server should support bullseye, at least for use in cloud VPS projects, and more generally so that we can get rid of the deployment hosts on buster in testing and production.

Currently when trying it on a bullseye VM the issues include:

  • E: Unable to locate package python-redis
  • E: Unable to locate package python-gitdb
  • E: Package 'python-git' has no installation candidate

My subteam would like this for the cloud VPS devtools project to resolve T360964, but I assume the production deployment servers should also be upgraded. They are also still on buster, so we couldn't simply copy production.

I am currently unsure whether this ticket should only be about adding support to the puppet role or whether it should also include the actual upgrade of the production machines using it. serviceops, any opinion?

(Edit: the new ticket T364656 is for the production servers. This ticket is limited to adding bullseye support to the puppet role and replacing the deployment server in cloud VPS.)

Event Timeline

Change #1023954 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] redis: use python3-redis to support bullseye

https://gerrit.wikimedia.org/r/1023954

Change #1023955 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] deployment_server: add bullseye support, python3 package names

https://gerrit.wikimedia.org/r/1023955

Change #1023954 merged by Dzahn:

[operations/puppet@production] redis: use python3-redis to support bullseye

https://gerrit.wikimedia.org/r/1023954

Change #1024447 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] deployment_server: stop including redis::client::python

https://gerrit.wikimedia.org/r/1024447

Dzahn changed the task status from Open to In Progress. Apr 26 2024, 10:52 PM
Dzahn claimed this task.

Change #1023955 merged by Dzahn:

[operations/puppet@production] deployment_server: stop installing python-gitdb, python-git

https://gerrit.wikimedia.org/r/1023955

Change #1024447 merged by Dzahn:

[operations/puppet@production] deployment_server: stop including redis::client::python

https://gerrit.wikimedia.org/r/1024447

A list of errors that show up with a fresh deployment server on bullseye so far:

error: E: Unable to locate package python-redis
fix: stop including redis::client::python

error: E: Unable to locate package python-gitdb
fix: stop installing python-gitdb, python-git

error: E: Package 'python-git' has no installation candidate
fix: stop installing python-gitdb, python-git

error: Scap::Source[releng/jenkins-deploy] - /usr/bin/scap deploy --init' returned 1 -
fix: manual: cd /srv/deployment/releng/jenkins-deploy/ ; scap deploy --init -Dblock_deployments:False

error: Scap::Source[integration/docroot] - /usr/bin/scap deploy --init' returned 1 -
fix: manual: cd /srv/deployment/integration/docroot/ ; scap deploy --init -Dblock_deployments:False

error: Scap::Source[phabricator/deployment] - /usr/bin/scap deploy --init' returned 1 -
fix: manual: cd /srv/deployment/phabricator/deployment ; scap deploy --init -Dblock_deployments:False

error: Scap_source[gervert/deploy] - /usr/bin/scap deploy --init' returned 1 -
error: RuntimeError: Did not find a file named gerrit in search path: ['/srv/deployment/gervert/deploy/scap', '/etc/dsh/group'] ✗
fix: cd /etc/dsh/group ; sudo touch gerrit (?)

error: 22:22:21 No targets selected, check limits and dsh_targets ✗
fix: ? ✗

error: systemd[1]: Starting "Mw-cgroup"... /sys/fs/cgroup/memory/release_agent: Permission denied.. Failed to start "Mw-cgroup".
fix: reboot the VM to apply grub config change -> T363957

error: Geoip::Data::Puppet/File[/usr/share/GeoIP]: Failed to generate additional resources using 'eval_generate': Error 500 on SERVER: Server Error: Not authorized to call search on /file_metadata/volatile/GeoIP
fix: ? ✗
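
The three manual `scap deploy --init` fixes in the list above follow the same pattern, so they could be scripted. The following is only a sketch under assumptions: the function name `init_scap_repos` and the `DRY_RUN` variable are hypothetical helpers made up for illustration, not part of scap or the puppet role. The repo paths and the scap command line are copied from the list above. By default it only prints what it would run.

```shell
# Sketch: repeat the manual "scap deploy --init" fix for each repo whose
# Scap::Source init failed during the first puppet run.
# init_scap_repos and DRY_RUN are hypothetical names, not scap features.
init_scap_repos() {
  for repo in \
    /srv/deployment/releng/jenkins-deploy \
    /srv/deployment/integration/docroot \
    /srv/deployment/phabricator/deployment
  do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Default: print the command instead of running it.
      echo "cd ${repo} && scap deploy --init -Dblock_deployments:False"
    else
      # Subshell so the caller's working directory is left unchanged.
      (cd "${repo}" && scap deploy --init -Dblock_deployments:False) \
        || echo "scap deploy --init failed in ${repo}" >&2
    fi
  done
}
```

Calling `init_scap_repos` prints the three commands; `DRY_RUN=0 init_scap_repos` would actually run them. Unlike the `cd ... ; scap ...` form used above, the sketch uses `&&` so that scap is not started from the wrong directory if the `cd` fails.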

Change #1026193 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] mediawiki/geoip: make it optional to load geoip data from puppetserver

https://gerrit.wikimedia.org/r/1026193

Change #1026198 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] devtools: remove gervert from deployed repos in cloud VPS

https://gerrit.wikimedia.org/r/1026198

Change #1026198 merged by Dzahn:

[operations/puppet@production] devtools: remove gervert from deployed repos in cloud VPS

https://gerrit.wikimedia.org/r/1026198

Change #1026698 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] cloud/devtools: replace deploy-1004 with deploy-1006

https://gerrit.wikimedia.org/r/1026698

Change #1026698 merged by Dzahn:

[operations/puppet@production] cloud/devtools: replace deploy-1004 with deploy-1006

https://gerrit.wikimedia.org/r/1026698

Mentioned in SAL (#wikimedia-cloud) [2024-05-02T23:10:29Z] <mutante> replacing deploy-1004 (buster) with deploy-1006 (bullseye) as new deployment server in both repo and Horizon hiera T360964 T363415

Mentioned in SAL (#wikimedia-cloud) [2024-05-02T23:52:24Z] <mutante> switching puppetmaster for deploy-1006 back to local project puppetmaster; rm -rf /var/lib/puppet/ssl that still referred to puppetmaster-1001, signing new request on puppetmaster-1003 T360470 T363415

LSobanski triaged this task as Medium priority. May 7 2024, 3:19 PM
LSobanski moved this task from Incoming to Work in Progress on the collaboration-services board.
Dzahn renamed this task from "upgrade deployment servers to bullseye / add bullseye support to puppet role" to "add bullseye support to deployment server puppet role - upgrade deployment server in devtools". May 7 2024, 4:05 PM

T364656 will be about upgrading / replacing the production deployment servers.

This ticket is about prep work to make the puppet role support bullseye and upgrading the host in our staging project in cloud VPS.

This still needs https://gerrit.wikimedia.org/r/c/operations/puppet/+/1026193 to be merged to be able to call it resolved.

It's actually not an issue with the deployment server itself. It seems to be a problem that was introduced as part of T351450 / T351451 / T360470, when the puppetmasters in cloud VPS were upgraded to puppet 7 / puppetserver. We have been missing the volatile setup since then.