Page MenuHomePhabricator

Upgrade x1 to Bullseye
Closed, ResolvedPublic

Description

Let's upgrade x1 to Bullseye.

Event Timeline

Marostegui moved this task from Triage to In progress on the DBA board.

Change 757288 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] mariadb: x1 codfw disable notifications

https://gerrit.wikimedia.org/r/757288

Change 757288 merged by Marostegui:

[operations/puppet@production] mariadb: x1 codfw disable notifications

https://gerrit.wikimedia.org/r/757288

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db2096.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db2115.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db2096.codfw.wmnet with OS bullseye completed:

  • db2096 (WARN)
    • Downtimed on Icinga
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201260641_marostegui_11389_db2096.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db2131.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db2115.codfw.wmnet with OS bullseye completed:

  • db2115 (WARN)
    • Downtimed on Icinga
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201260643_marostegui_11573_db2115.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db2131.codfw.wmnet with OS bullseye completed:

  • db2131 (WARN)
    • Downtimed on Icinga
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201260714_marostegui_19960_db2131.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2022-01-26T08:57:33Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1120 T300099', diff saved to https://phabricator.wikimedia.org/P19249 and previous config saved to /var/cache/conftool/dbconfig/20220126-085733-marostegui.json

Change 757386 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db1120: Disable notifications

https://gerrit.wikimedia.org/r/757386

Change 757386 merged by Marostegui:

[operations/puppet@production] db1120: Disable notifications

https://gerrit.wikimedia.org/r/757386

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db1120.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db1120.eqiad.wmnet with OS bullseye completed:

  • db1120 (PASS)
    • Downtimed on Icinga
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201260859_marostegui_3666_db1120.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2022-01-26T11:42:37Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1137 T300099', diff saved to https://phabricator.wikimedia.org/P19286 and previous config saved to /var/cache/conftool/dbconfig/20220126-114236-marostegui.json

Change 757419 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db1137: Disable notifications

https://gerrit.wikimedia.org/r/757419

Change 757419 merged by Marostegui:

[operations/puppet@production] db1137: Disable notifications

https://gerrit.wikimedia.org/r/757419

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db1137.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db1137.eqiad.wmnet with OS bullseye completed:

  • db1137 (PASS)
    • Downtimed on Icinga
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201261144_marostegui_31461_db1137.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

x1 is now pending a master switchover: T300472

Change 809878 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db1103: Disable notifications

https://gerrit.wikimedia.org/r/809878

Change 809878 merged by Marostegui:

[operations/puppet@production] db1103: Disable notifications

https://gerrit.wikimedia.org/r/809878

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db1103.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db1103.eqiad.wmnet with OS bullseye completed:

  • db1103 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202206300625_marostegui_940723_db1103.out
    • Checked BIOS boot parameters are back to normal
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Marostegui updated the task description. (Show Details)

db1103 reimaged to Bullseye and host being repooled.