Migrate es1 section to Debian Trixie
Closed, Resolved (Public)

Description

  • es2055
  • es2053
  • es2051
  • es1055
  • es1052
  • es1050

Event Timeline

Completed depooling of es2055 by marostegui@cumin1003: Upgrading es2055.codfw.wmnet

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host es2055.codfw.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host es2055.codfw.wmnet with OS trixie completed:

  • es2055 (WARN)
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202605270604_marostegui_972444_es2055.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Skipping waiting for Icinga optimal status and not removing the downtime, --no-check-icinga was set
    • Updated Netbox data from PuppetDB
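
Each reimage run above records its first Puppet run in a log file under /var/log/spicerack/sre/hosts/reimage/. As a minimal sketch for correlating those logs with hosts, the snippet below splits such a file name into its parts. The layout `<YYYYMMDDHHMM>_<user>_<number>_<host>.out` is an assumption inferred only from the paths in this task; the meaning of the numeric field is not stated here (it may be a process ID), and `parse_reimage_log_name` is a hypothetical helper, not part of Spicerack.

```python
import re
from datetime import datetime
from pathlib import PurePosixPath

# Assumed layout, inferred from the paths in this task:
#   <YYYYMMDDHHMM>_<user>_<number>_<host>.out
_LOG_NAME = re.compile(
    r"^(?P<stamp>\d{12})_(?P<user>[^_]+)_(?P<number>\d+)_(?P<host>[^_.]+)\.out$"
)


def parse_reimage_log_name(path):
    """Split a reimage log path into timestamp, user, number and host.

    Hypothetical helper: raises ValueError if the file name does not
    match the assumed layout.
    """
    name = PurePosixPath(path).name
    match = _LOG_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected reimage log name: {name!r}")
    return {
        "started": datetime.strptime(match["stamp"], "%Y%m%d%H%M"),
        "user": match["user"],
        # Meaning of this field is not documented in the task; kept as-is.
        "number": int(match["number"]),
        "host": match["host"],
    }


info = parse_reimage_log_name(
    "/var/log/spicerack/sre/hosts/reimage/"
    "202605270604_marostegui_972444_es2055.out"
)
print(info["host"], info["started"].isoformat())
```

Sorting the parsed entries by `started` would give the order in which the hosts were reimaged, provided the assumed name layout holds.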

Completed depooling of es2051 by marostegui@cumin1003: Upgrading es2051.codfw.wmnet

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host es2051.codfw.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host es2051.codfw.wmnet with OS trixie completed:

  • es2051 (WARN)
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202605270818_marostegui_992610_es2051.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Skipping waiting for Icinga optimal status and not removing the downtime, --no-check-icinga was set
    • Updated Netbox data from PuppetDB

Completed depooling of es1050 by marostegui@cumin1003: Upgrading es1050.eqiad.wmnet

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host es1050.eqiad.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host es1050.eqiad.wmnet with OS trixie completed:

  • es1050 (WARN)
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202605270946_marostegui_1071117_es1050.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Skipping waiting for Icinga optimal status and not removing the downtime, --no-check-icinga was set
    • Updated Netbox data from PuppetDB

Completed depooling of es1055 by marostegui@cumin1003: Upgrading es1055.eqiad.wmnet

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host es1055.eqiad.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host es1055.eqiad.wmnet with OS trixie completed:

  • es1055 (WARN)
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202606010958_marostegui_3573007_es1055.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Skipping waiting for Icinga optimal status and not removing the downtime, --no-check-icinga was set
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2026-06-01T10:47:39Z] <marostegui@cumin1003> dbctl commit (dc=all): 'Promote es2055 to es1 codfw primary T427032', diff saved to https://phabricator.wikimedia.org/P93424 and previous config saved to /var/cache/conftool/dbconfig/20260601-104739-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2026-06-01T10:48:38Z] <marostegui@cumin1003> dbctl commit (dc=all): 'Promote es1050 to es1 eqiad primary T427032', diff saved to https://phabricator.wikimedia.org/P93425 and previous config saved to /var/cache/conftool/dbconfig/20260601-104837-marostegui.json
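
The two SAL entries above show that the already reimaged hosts es2055 and es1050 were promoted to primary before the remaining hosts were reimaged. As a rough sketch for extracting that information from such entries, the snippet below pulls the host, section, datacenter and task ID out of the commit message. The message wording is taken from the two entries in this task; `parse_promotion` and its regular expression are hypothetical and not part of dbctl.

```python
import re

# Matches commit messages shaped like the two in this task, e.g.
#   'Promote es2055 to es1 codfw primary T427032'
_PROMOTE = re.compile(
    r"Promote (?P<host>\S+) to (?P<section>\S+) (?P<dc>\S+) primary (?P<task>T\d+)"
)


def parse_promotion(sal_line):
    """Return host, section, dc and task from a SAL line, or None.

    Hypothetical helper: only understands the 'Promote ... primary'
    wording seen in this task.
    """
    match = _PROMOTE.search(sal_line)
    return match.groupdict() if match else None


line = (
    "dbctl commit (dc=all): "
    "'Promote es1050 to es1 eqiad primary T427032', diff saved to ..."
)
print(parse_promotion(line))
```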

Change #1295872 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/dns@master] wmnet: Update es1-master CNAME

https://gerrit.wikimedia.org/r/1295872

Change #1295872 merged by Marostegui:

[operations/dns@master] wmnet: Update es1-master CNAME

https://gerrit.wikimedia.org/r/1295872

Completed depooling of es1052 by marostegui@cumin1003: Upgrading es1052.eqiad.wmnet

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host es1052.eqiad.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host es1052.eqiad.wmnet with OS trixie completed:

  • es1052 (WARN)
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202606020529_marostegui_3861657_es1052.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Skipping waiting for Icinga optimal status and not removing the downtime, --no-check-icinga was set
    • Updated Netbox data from PuppetDB

Completed depooling of es2053 by marostegui@cumin1003: Upgrading es2053.codfw.wmnet

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host es2053.codfw.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host es2053.codfw.wmnet with OS trixie completed:

  • es2053 (WARN)
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202606020704_marostegui_3876624_es2053.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Skipping waiting for Icinga optimal status and not removing the downtime, --no-check-icinga was set
    • Updated Netbox data from PuppetDB

Marostegui updated the task description.

All done