Migrate SSH bastions to Bullseye
Closed, Resolved (Public)

Description

Migrations of the bastions to Bullseye:

  • bast1003 (reimaged)
  • bast2002 (reimaged)
  • bast3005 (replaced by new bast3006 VM)
  • bast4003 (replaced by new bast4004 VM)
  • bast5002 (replaced by new bast5003 VM)
  • bast6001 (replaced by new bast6002 VM)
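The list above can also be kept as data when scripting follow-up checks. The following is a minimal sketch only: `BastionMigration`, `MIGRATIONS`, and `hosts_to_decommission` are hypothetical names introduced for illustration and are not part of the Puppet repository or the cookbooks.

```python
from typing import List, NamedTuple, Optional


class BastionMigration(NamedTuple):
    """One entry of the migration list (hypothetical helper type)."""

    old_host: str
    action: str  # "reimaged" (same host, new OS) or "replaced" (new VM)
    new_host: Optional[str] = None


# Transcribed from the task description.
MIGRATIONS = [
    BastionMigration("bast1003", "reimaged"),
    BastionMigration("bast2002", "reimaged"),
    BastionMigration("bast3005", "replaced", "bast3006"),
    BastionMigration("bast4003", "replaced", "bast4004"),
    BastionMigration("bast5002", "replaced", "bast5003"),
    BastionMigration("bast6001", "replaced", "bast6002"),
]


def hosts_to_decommission(migrations: List[BastionMigration]) -> List[str]:
    """Old hosts that were replaced by a new VM and therefore need removal."""
    return [m.old_host for m in migrations if m.action == "replaced"]


if __name__ == "__main__":
    print(hosts_to_decommission(MIGRATIONS))
```

Reimaged hosts keep their name, so only the replaced hosts appear in the decommission list.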

Event Timeline

MoritzMuehlenhoff triaged this task as Medium priority.

Change 878945 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Add bast3006/bast4004/bast6002 to Puppet

https://gerrit.wikimedia.org/r/878945

Change 878945 merged by Muehlenhoff:

[operations/puppet@production] Add bast3006/bast4004/bast6002 to Puppet

https://gerrit.wikimedia.org/r/878945

Change 879273 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Add bast5003

https://gerrit.wikimedia.org/r/879273

Change 879273 merged by Muehlenhoff:

[operations/puppet@production] Add bast5003

https://gerrit.wikimedia.org/r/879273

Change 880433 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Add new bastions

https://gerrit.wikimedia.org/r/880433

Change 880477 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Add new bastions in esams/eqsin/drmrs

https://gerrit.wikimedia.org/r/880477

Change 880477 merged by Muehlenhoff:

[operations/puppet@production] Add new bastions in esams/eqsin/drmrs

https://gerrit.wikimedia.org/r/880477

Change 880433 merged by Muehlenhoff:

[operations/puppet@production] Add new bastions

https://gerrit.wikimedia.org/r/880433

I updated https://wikitech.wikimedia.org/wiki/Template:BastionMap and created pages for the new bastions, but as discussed on ops@lists.wikimedia.org, the ssh fingerprints are missing.

Thanks for that! I've backfilled the fingerprint wikitech entries now.
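
For anyone checking a published fingerprint against a host key, here is a minimal sketch of how an OpenSSH-style SHA256 fingerprint is derived from a public key line. It is not how the wikitech entries were actually generated, which this task does not say. The function name `ssh_sha256_fingerprint` and the demo key are made up for illustration; the demo blob is a placeholder, not a real bastion key.

```python
import base64
import hashlib


def ssh_sha256_fingerprint(public_key_line: str) -> str:
    """Return the SHA256 fingerprint of one '<type> <base64-blob> [comment]' line."""
    fields = public_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<key-type> <base64-blob> [comment]'")
    # The fingerprint covers the decoded key blob, not the text of the line.
    blob = base64.b64decode(fields[1], validate=True)
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest base64-encoded with the '=' padding stripped.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


if __name__ == "__main__":
    # Hypothetical placeholder blob, NOT a real host key.
    demo_blob = base64.b64encode(b"not-a-real-host-key").decode("ascii")
    print(ssh_sha256_fingerprint(f"ssh-ed25519 {demo_blob} bast-example"))
```

On a real key file, the result should match what `ssh-keygen -l -f <keyfile>` reports for the same key.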

Change 884225 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Disable old bastions

https://gerrit.wikimedia.org/r/884225

Change 884225 merged by Muehlenhoff:

[operations/puppet@production] Disable old bastions

https://gerrit.wikimedia.org/r/884225

Change 884832 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Remove previos bastions from bastion_host list

https://gerrit.wikimedia.org/r/884832

Change 884832 merged by Muehlenhoff:

[operations/puppet@production] Remove previos bastions from bastion_host list

https://gerrit.wikimedia.org/r/884832

cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: bast4003.wikimedia.org

  • bast4003.wikimedia.org (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster ulsfo to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster ulsfo to Netbox

Change 884845 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Remove bast4003

https://gerrit.wikimedia.org/r/884845

Change 884845 merged by Muehlenhoff:

[operations/puppet@production] Remove bast4003

https://gerrit.wikimedia.org/r/884845

cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: bast6001.wikimedia.org

  • bast6001.wikimedia.org (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster drmrs01 to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster drmrs01 to Netbox

Change 885810 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Remove bast3005/bast5002 from Puppet

https://gerrit.wikimedia.org/r/885810

Change 885810 merged by Muehlenhoff:

[operations/puppet@production] Remove bast3005/bast5002 from Puppet

https://gerrit.wikimedia.org/r/885810

cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: bast5002.wikimedia.org

  • bast5002.wikimedia.org (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster eqsin to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster eqsin to Netbox

cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: bast3005.wikimedia.org

  • bast3005.wikimedia.org (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster esams to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster esams to Netbox

Cookbook cookbooks.sre.hosts.reimage was started by jmm@cumin2002 for host bast2002.wikimedia.org with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by jmm@cumin2002 for host bast2002.wikimedia.org with OS bullseye completed:

  • bast2002 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202302070902_jmm_408503_bast2002.out
    • Checked BIOS boot parameters are back to normal
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by jmm@cumin2002 for host bast1003.wikimedia.org with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by jmm@cumin2002 for host bast1003.wikimedia.org with OS bullseye completed:

  • bast1003 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202302070944_jmm_416713_bast1003.out
    • Checked BIOS boot parameters are back to normal
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

MoritzMuehlenhoff updated the task description.

This is complete.