Upgrade es2 to Bullseye
Closed, Resolved (Public)

Description

Let's upgrade es2 to Bullseye.

  • es2033
  • es2031
  • es2026
  • es1033
  • es1030
  • es1026

Event Timeline

Marostegui moved this task from Triage to In progress on the DBA board.

Change 756537 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] es2026,2031,2033: Disable notifications

https://gerrit.wikimedia.org/r/756537

Change 756537 merged by Marostegui:

[operations/puppet@production] es2026,2031,2033: Disable notifications

https://gerrit.wikimedia.org/r/756537

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host es2031.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host es2033.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host es2031.codfw.wmnet with OS bullseye executed with errors:

  • es2031 (FAIL)
    • Downtimed on Icinga
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host es2031.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host es2033.codfw.wmnet with OS bullseye completed:

  • es2033 (PASS)
    • Downtimed on Icinga
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201241035_marostegui_27897_es2033.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
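
Each successful run records its first Puppet run under /var/log/spicerack/sre/hosts/reimage/, and the file name appears to encode a start time, the user, a number, and the short host name. A minimal sketch for pulling those fields apart when collecting the logs for this task follows. The helper name parse_reimage_log is hypothetical, and treating the numeric field as a process ID is an assumption, since the task does not say what it is.

```python
import re
from datetime import datetime
from pathlib import PurePosixPath
from typing import NamedTuple


class ReimageLog(NamedTuple):
    started: datetime  # timestamp embedded in the file name (timezone not stated)
    user: str
    run_id: int        # assumption: possibly a process ID; the task does not say
    host: str


# Layout inferred from the paths quoted in this task:
# <YYYYMMDDHHMM>_<user>_<number>_<host>.out
_LOG_NAME = re.compile(r"^(\d{12})_([^_]+)_(\d+)_([^_.]+)\.out$")


def parse_reimage_log(path: str) -> ReimageLog:
    """Split a reimage log path into its file-name components."""
    name = PurePosixPath(path).name
    match = _LOG_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognised reimage log name: {name}")
    stamp, user, run_id, host = match.groups()
    return ReimageLog(datetime.strptime(stamp, "%Y%m%d%H%M"), user, int(run_id), host)


log = parse_reimage_log(
    "/var/log/spicerack/sre/hosts/reimage/202201241035_marostegui_27897_es2033.out"
)
print(log.host, log.started.isoformat())
```

The same function should accept the other log paths quoted in this task, since they follow the same layout.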

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host es2031.codfw.wmnet with OS bullseye completed:

  • es2031 (WARN)
    • Downtimed on Icinga
    • Unable to disable Puppet, the host may have been unreachable
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201241053_marostegui_10654_es2031.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host es2026.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host es2026.codfw.wmnet with OS bullseye completed:

  • es2026 (PASS)
    • Downtimed on Icinga
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201241137_marostegui_23902_es2026.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2022-01-24T12:10:34Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool es1033 T299889', diff saved to https://phabricator.wikimedia.org/P19042 and previous config saved to /var/cache/conftool/dbconfig/20220124-121029-marostegui.json

Change 756550 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] es1033: Disable notifications

https://gerrit.wikimedia.org/r/756550

Change 756550 merged by Marostegui:

[operations/puppet@production] es1033: Disable notifications

https://gerrit.wikimedia.org/r/756550

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host es1033.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host es1033.eqiad.wmnet with OS bullseye completed:

  • es1033 (PASS)
    • Downtimed on Icinga
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201241213_marostegui_25705_es1033.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2022-01-25T06:02:41Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool es1030 T299889', diff saved to https://phabricator.wikimedia.org/P19079 and previous config saved to /var/cache/conftool/dbconfig/20220125-060241-marostegui.json

Change 756871 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] es1030: Disable notifications

https://gerrit.wikimedia.org/r/756871

Change 756871 merged by Marostegui:

[operations/puppet@production] es1030: Disable notifications

https://gerrit.wikimedia.org/r/756871

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host es1030.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host es1030.eqiad.wmnet with OS bullseye completed:

  • es1030 (PASS)
    • Downtimed on Icinga
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201250607_marostegui_3851_es1030.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2022-01-25T13:16:22Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Promote es1030 to es2 master T299889', diff saved to https://phabricator.wikimedia.org/P19146 and previous config saved to /var/cache/conftool/dbconfig/20220125-131622-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2022-01-25T13:17:28Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool es1026 T299889', diff saved to https://phabricator.wikimedia.org/P19147 and previous config saved to /var/cache/conftool/dbconfig/20220125-131727-marostegui.json
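
The SAL entries above record the order of the dbctl changes (depool, promote, depool). To reconstruct that sequence from the timeline text, a rough sketch could look like the following. The regular expression is fitted only to the entries quoted in this task, and parse_dbctl_entry is a hypothetical helper, not part of dbctl or conftool.

```python
import re
from typing import Dict, Optional

# Pattern fitted to the SAL lines quoted in this task; other entries may differ.
_DBCTL_ENTRY = re.compile(
    r"\[(?P<timestamp>[^\]]+)\] <(?P<who>[^>]+)> "
    r"dbctl commit \(dc=(?P<dc>[^)]*)\): '(?P<message>[^']*)'"
    r", diff saved to (?P<diff>\S+)"
)
_TASK_ID = re.compile(r"\bT\d+\b")


def parse_dbctl_entry(line: str) -> Optional[Dict[str, str]]:
    """Return the fields of one SAL dbctl entry, or None if it does not match."""
    match = _DBCTL_ENTRY.search(line)
    if match is None:
        return None
    entry = match.groupdict()
    task = _TASK_ID.search(entry["message"])
    entry["task"] = task.group(0) if task else ""
    return entry


entries = [
    "[2022-01-25T13:16:22Z] <marostegui@cumin1001> dbctl commit (dc=all): "
    "'Promote es1030 to es2 master T299889', diff saved to "
    "https://phabricator.wikimedia.org/P19146 and previous config saved to "
    "/var/cache/conftool/dbconfig/20220125-131622-marostegui.json",
    "[2022-01-25T13:17:28Z] <marostegui@cumin1001> dbctl commit (dc=all): "
    "'Depool es1026 T299889', diff saved to "
    "https://phabricator.wikimedia.org/P19147 and previous config saved to "
    "/var/cache/conftool/dbconfig/20220125-131727-marostegui.json",
]

for line in entries:
    entry = parse_dbctl_entry(line)
    if entry is not None:
        print(entry["timestamp"], entry["message"], entry["diff"])
```

Sorting the parsed entries by timestamp shows that es1030 was promoted to es2 master about a minute before es1026 was depooled.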

Change 756976 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] es1026: Disable notifications

https://gerrit.wikimedia.org/r/756976

Change 756976 merged by Marostegui:

[operations/puppet@production] es1026: Disable notifications

https://gerrit.wikimedia.org/r/756976

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host es1026.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host es1026.eqiad.wmnet with OS bullseye completed:

  • es1026 (PASS)
    • Downtimed on Icinga
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201251343_marostegui_23973_es1026.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Marostegui updated the task description.

es2 is now running Bullseye

===== NODE GROUP =====
(6) es[2026,2031,2033].codfw.wmnet,es[1026,1030,1033].eqiad.wmnet
----- OUTPUT of 'lsb_release -a' -----
Distributor ID: Debian
Description:    Debian GNU/Linux 11 (bullseye)
Release:        11
Codename:       bullseye
No LSB modules are available.
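
The final check above can also be done programmatically. The sketch below expands the node group expression into individual host names and parses the lsb_release output into fields. Both helpers (expand_nodes, parse_lsb_release) are hypothetical and written only for this illustration. The expander handles just the comma-separated bracket form seen here, not ranges or the full syntax the real tooling accepts.

```python
import re
from typing import Dict, List


def expand_nodes(expr: str) -> List[str]:
    """Expand 'prefix[a,b,c]suffix' groups; ranges such as [1-3] are not handled."""
    hosts: List[str] = []
    # Split only on commas that sit outside square brackets.
    for group in re.split(r",(?![^\[]*\])", expr):
        match = re.fullmatch(r"([^\[\]]*)\[([^\]]+)\]([^\[\]]*)", group)
        if match is None:
            hosts.append(group)  # plain host name without brackets
            continue
        prefix, ids, suffix = match.groups()
        hosts.extend(f"{prefix}{item}{suffix}" for item in ids.split(","))
    return hosts


def parse_lsb_release(output: str) -> Dict[str, str]:
    """Turn 'Key: value' lines into a dict, skipping lines without a colon."""
    fields: Dict[str, str] = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


hosts = expand_nodes(
    "es[2026,2031,2033].codfw.wmnet,es[1026,1030,1033].eqiad.wmnet"
)
fields = parse_lsb_release(
    "Distributor ID: Debian\n"
    "Description:    Debian GNU/Linux 11 (bullseye)\n"
    "Release:\t11\n"
    "Codename:\tbullseye\n"
    "No LSB modules are available.\n"
)
print(len(hosts), fields["Codename"])
```

With the values from this task, the expression expands to the six es2 hosts, and the parsed codename is bullseye.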