Site: eqiad 3 VM request for staging-eqiad kube-apiserver
Closed, Resolved (Public)

Description

Cloud VPS Project Tested:
Site/Location: eqiad
Number of systems: 3
Service: kube-apiserver and etcd
Networking Requirements: internal
Processor Requirements: 4
Memory: 5GB
Disks: 30GB
Other Requirements: No DRBD

These will replace the 5 VMs we currently use as etcd servers and kube-apiservers for staging-eqiad:

  • kubestagetcd1*
  • kubestagemaster100[12]
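
The bracket notation here, and in the patch title `kubestagemaster100[345]`, is shorthand for a set of hosts. Below is a minimal Python sketch of that expansion, assuming each character inside the brackets is one single-digit alternative. The helper `expand_hosts` is hypothetical and not part of any Wikimedia tooling; glob patterns such as `kubestagetcd1*` are returned unchanged because they cannot be expanded without an inventory.

```python
import re


def expand_hosts(pattern: str) -> list[str]:
    """Expand one [..] character class into one hostname per character.

    Hypothetical helper for illustration only: it handles a single
    bracket group and treats every character in it as an alternative.
    """
    match = re.search(r"\[([^\]]+)\]", pattern)
    if match is None:
        # No bracket group (e.g. a glob such as "kubestagetcd1*"): leave as-is.
        return [pattern]
    prefix = pattern[:match.start()]
    suffix = pattern[match.end():]
    return [f"{prefix}{ch}{suffix}" for ch in match.group(1)]


if __name__ == "__main__":
    for name in expand_hosts("kubestagemaster100[345]"):
        print(name)
```

For `kubestagemaster100[345]` this yields the three hostnames that appear in the reimage entries in the event timeline.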

Event Timeline

Change #1030996 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] Add kubestagemaster100[345]

https://gerrit.wikimedia.org/r/1030996

Change #1030996 merged by JMeybohm:

[operations/puppet@production] Add kubestagemaster100[345]

https://gerrit.wikimedia.org/r/1030996

Cookbook cookbooks.sre.hosts.reimage was started by jayme@cumin1002 for host kubestagemaster1003.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage was started by jayme@cumin1002 for host kubestagemaster1004.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage was started by jayme@cumin1002 for host kubestagemaster1005.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by jayme@cumin1002 for host kubestagemaster1003.eqiad.wmnet with OS bullseye completed:

  • kubestagemaster1003 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Set boot media to disk
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405140904_jayme_4192355_kubestagemaster1003.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage started by jayme@cumin1002 for host kubestagemaster1004.eqiad.wmnet with OS bullseye completed:

  • kubestagemaster1004 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Set boot media to disk
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405140906_jayme_4192644_kubestagemaster1004.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage started by jayme@cumin1002 for host kubestagemaster1005.eqiad.wmnet with OS bullseye completed:

  • kubestagemaster1005 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Set boot media to disk
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405140931_jayme_4192981_kubestagemaster1005.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

VM kubestagemaster1003.eqiad.wmnet switching disk type to plain

VM kubestagemaster1004.eqiad.wmnet switching disk type to plain

VM kubestagemaster1005.eqiad.wmnet switching disk type to plain