Build Debian packages for Bookworm
Closed, Resolved (Public)

Description

Bitu should be deployed to production using deb packages.
Staging will remain on Git deployments.

Event Timeline

Change 956836 had a related patch set uploaded (by Slyngshede; author: Slyngshede):

[operations/puppet@production] P:idm allow for installation via Debian packages.

https://gerrit.wikimedia.org/r/956836

Change 957669 had a related patch set uploaded (by Slyngshede; author: Slyngshede):

[operations/puppet@production] WIP: P:idm switch idm2001 to Debian package

https://gerrit.wikimedia.org/r/957669

Plan for testing rollout of Debian packages:

Upgrade test to Bookworm:

Pre-update:

Reimage idm-test1001

ssh cumin1001.eqiad.wmnet
sudo cookbook sre.hosts.reimage --os bookworm -t T340721 idm-test1001

IDM2001 upgrade

ssh cumin1001.eqiad.wmnet
sudo cumin 'idm2001.wikimedia.org' "disable-puppet 'bitu deb install - slyngshede'"
sudo cookbook sre.hosts.reimage --os bookworm -t T340721 idm2001

Switch over to IDM2001

WAIT AND ALLOW ANY BUGS TO REVEAL THEMSELVES

IDM1001 Upgrade

ssh cumin1001.eqiad.wmnet
sudo cumin 'idm1001.wikimedia.org' "disable-puppet 'bitu deb install - slyngshede'"
sudo cookbook sre.hosts.reimage --os bookworm -t T340721 idm1001
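
The ordering of the plan above can be sketched as a small dry-run script. This is only an illustration: it prints the commands from the plan in sequence and does not execute anything. The function names `plan_host` and `plan_rollout` and the variables `TASK`, `OS` and `REASON` are hypothetical helpers introduced here, not part of cumin or the cookbook tooling. The command strings themselves are copied from the plan.

```shell
#!/bin/sh
# Dry-run sketch: prints the rollout commands in order, executes nothing.
# TASK, OS and REASON are illustrative variables, values taken from the plan.
TASK="T340721"
OS="bookworm"
REASON="bitu deb install - slyngshede"

# plan_host HOST FQDN (hypothetical helper):
# print the disable-puppet and reimage commands for one production host.
plan_host() {
    host="$1"
    fqdn="$2"
    printf "sudo cumin '%s' \"disable-puppet '%s'\"\n" "$fqdn" "$REASON"
    printf 'sudo cookbook sre.hosts.reimage --os %s -t %s %s\n' "$OS" "$TASK" "$host"
}

# plan_rollout (hypothetical helper): test host first, then idm2001,
# then the switchover and waiting period, then idm1001.
plan_rollout() {
    printf 'sudo cookbook sre.hosts.reimage --os %s -t %s %s\n' "$OS" "$TASK" "idm-test1001"
    plan_host idm2001 idm2001.wikimedia.org
    echo "# switch over to idm2001, then wait for bugs to surface"
    plan_host idm1001 idm1001.wikimedia.org
}

plan_rollout
```

Printing rather than running keeps the manual gate between the idm2001 and idm1001 steps explicit: the operator still decides when the waiting period is over.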

Change 957674 had a related patch set uploaded (by Slyngshede; author: Slyngshede):

[operations/dns@master] IDM Switchover

https://gerrit.wikimedia.org/r/957674

Change 957676 had a related patch set uploaded (by Slyngshede; author: Slyngshede):

[operations/puppet@production] IDM: Deploy deb to idm1001.

https://gerrit.wikimedia.org/r/957676

MoritzMuehlenhoff renamed this task from "Build Debian packages for Bookwork" to "Build Debian packages for Bookworm". Sep 14 2023, 9:25 AM

Plan looks good to me.

Change 956836 merged by Slyngshede:

[operations/puppet@production] P:idm allow for installation via Debian packages.

https://gerrit.wikimedia.org/r/956836

Cookbook cookbooks.sre.hosts.reimage was started by slyngshede@cumin1001 for host idm-test1001.wikimedia.org with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by slyngshede@cumin1001 for host idm-test1001.wikimedia.org with OS bookworm completed:

  • idm-test1001 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Set boot media to disk
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202309141210_slyngshede_31369_idm-test1001.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by slyngshede@cumin1001 for host idm-test1001.wikimedia.org with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by slyngshede@cumin1001 for host idm-test1001.wikimedia.org with OS bookworm completed:

  • idm-test1001 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Set boot media to disk
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202309151319_slyngshede_376144_idm-test1001.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Change 957669 merged by Slyngshede:

[operations/puppet@production] P:idm switch idm2001 to Debian package

https://gerrit.wikimedia.org/r/957669

Cookbook cookbooks.sre.hosts.reimage was started by slyngshede@cumin1001 for host idm2001.wikimedia.org with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by slyngshede@cumin1001 for host idm2001.wikimedia.org with OS bookworm completed:

  • idm2001 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Set boot media to disk
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run failed and logged in /var/log/spicerack/sre/hosts/reimage/202309190829_slyngshede_3921709_idm2001.out, asking the operator what to do
    • First Puppet run failed and logged in /var/log/spicerack/sre/hosts/reimage/202309190831_slyngshede_3921709_idm2001.out, asking the operator what to do
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202309190845_slyngshede_3921709_idm2001.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Change 957674 merged by Slyngshede:

[operations/dns@master] IDM Switchover

https://gerrit.wikimedia.org/r/957674

Change 957676 merged by Slyngshede:

[operations/puppet@production] P:IDM: Failover Redis

https://gerrit.wikimedia.org/r/957676

Cookbook cookbooks.sre.hosts.reimage was started by slyngshede@cumin1001 for host idm1001.wikimedia.org with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by slyngshede@cumin1001 for host idm1001.wikimedia.org with OS bookworm completed:

  • idm1001 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Set boot media to disk
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202309200724_slyngshede_4193245_idm1001.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by slyngshede@cumin1001 for host idm-test1001.wikimedia.org with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by slyngshede@cumin1001 for host idm-test1001.wikimedia.org with OS bookworm completed:

  • idm-test1001 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Set boot media to disk
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202309201331_slyngshede_78977_idm-test1001.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB