
(Need By: TBD) rack/setup/install ganeti102[34]
Closed, Resolved (Public)

Description

This task will track the racking, setup, and OS installation of ganeti102[34]

Hostname / Racking / Installation Details

Hostnames: ganeti1023, ganeti1024
Racking Proposal: Can you add one to row A and one to row C?
Networking/Subnet/VLAN/IP: 1G
Partitioning/Raid: ganeti-raid5.cfg
OS Distro: Stretch
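
The task title uses the shorthand ganeti102[34] for the two hostnames listed above. As a minimal sketch of how that shorthand maps to individual hosts, the hypothetical helper `expand_hosts` below expands a single bracketed character set. It is an illustration only, not the tooling used on the task, and it does not handle ranges such as [3-5].

```python
import re
from typing import List


def expand_hosts(pattern: str) -> List[str]:
    """Expand one bracketed character set, e.g. 'ganeti102[34]'.

    Hypothetical helper for illustration: each character inside the
    brackets yields one hostname. Patterns without brackets are
    returned unchanged as a one-element list.
    """
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", pattern)
    if match is None:
        return [pattern]
    prefix, chars, suffix = match.groups()
    return [f"{prefix}{c}{suffix}" for c in chars]


print(expand_hosts("ganeti102[34]"))  # ['ganeti1023', 'ganeti1024']
```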

Per host setup checklist

Each host should have its own setup checklist copied and pasted into the list below.

ganeti1023:

  • - receive in system on procurement task T279173 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware update (idrac, bios, network, raid controller)
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and site.pp role(insetup) or cp systems use role(insetup::nofirm).
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

ganeti1024:

  • - receive in system on procurement task T279173 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware update (idrac, bios, network, raid controller)
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and site.pp role(insetup) or cp systems use role(insetup::nofirm).
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

Once the system(s) above have had all checkbox steps completed, this task can be resolved.
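
The resolution rule above (every checkbox ticked for every host) can be sketched as a small check. The step labels in `STEPS` are hypothetical short names paraphrasing the nine checklist items, and the `progress` data is made up for the example; it does not reflect the actual state of either host at any point.

```python
# Hypothetical short labels for the nine per-host checklist items.
STEPS = (
    "receive",
    "rack",
    "bios_drac_serial",
    "dns",
    "network_port",
    "firmware",
    "puppet",
    "os_install",
    "netbox_staged",
)


def pending_steps(done):
    """Return the checklist steps not yet ticked, in checklist order."""
    return [step for step in STEPS if step not in done]


def can_resolve(progress):
    """True only if at least one host is tracked and none has pending steps."""
    return bool(progress) and all(
        not pending_steps(done) for done in progress.values()
    )


# Example data (invented): one host fully done, one missing the last step.
progress = {
    "ganeti1023": set(STEPS),
    "ganeti1024": set(STEPS) - {"netbox_staged"},
}
print(pending_steps(progress["ganeti1024"]))  # ['netbox_staged']
print(can_resolve(progress))                  # False
```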

Related Objects

Status: Resolved
Assigned: Cmjohnson

Event Timeline

RobH created this task.
RobH moved this task from Backlog to Racking Tasks on the ops-eqiad board.
RobH mentioned this in Unknown Object (Task).
RobH added a parent task: Unknown Object (Task).
RobH added a subscriber: MoritzMuehlenhoff.
RobH unsubscribed.

ganeti1023 A8. u8 port21 cableId#23000024
ganeti1024 C5 u27 port33 cableId#1963
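
The two racking notes above follow a loose "host rack unit port cable" layout. A rough sketch of turning them into records is shown below. The field meanings (rack, rack unit, switch port, cable ID) are an assumption read off the notes, and `parse_racking` is a hypothetical helper, not part of any Netbox tooling. The pattern tolerates the stray period after "A8".

```python
import re

# Assumed layout: <host> <rack>[.] u<unit> port<port> cableId#<id>
RACKING = re.compile(
    r"(?P<host>\S+)\s+(?P<rack>[A-Z]\d+)\.?\s+u(?P<unit>\d+)"
    r"\s+port(?P<port>\d+)\s+cableId#(?P<cable>\d+)"
)


def parse_racking(line):
    """Parse one racking note into a dict; raise ValueError if it does not fit."""
    m = RACKING.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"unrecognised racking note: {line!r}")
    return {
        "host": m["host"],
        "rack": m["rack"],
        "unit": int(m["unit"]),
        "port": int(m["port"]),
        "cable_id": m["cable"],  # kept as a string; it is an identifier
    }


for note in (
    "ganeti1023 A8. u8 port21 cableId#23000024",
    "ganeti1024 C5 u27 port33 cableId#1963",
):
    print(parse_racking(note))
```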

iDRACs are set up; the on-site work is complete.

Firmware updated: BIOS was current, but iDRAC needed updating. Changed root password.

Change 711183 had a related patch set uploaded (by Cmjohnson; author: Cmjohnson):

[operations/puppet@production] ganetia1023-24 setup netboot.cfg, dhcpd file, site.pp

https://gerrit.wikimedia.org/r/711183

Change 711183 merged by Cmjohnson:

[operations/puppet@production] ganetia1023-24 setup netboot.cfg, dhcpd file, site.pp

https://gerrit.wikimedia.org/r/711183

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

ganeti1023.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202108101850_cmjohnson_16170_ganeti1023_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

ganeti1024.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202108101851_cmjohnson_16243_ganeti1024_eqiad_wmnet.log.
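
The two wmf-auto-reimage log names above appear to share one layout. A minimal sketch of splitting such a name into its parts follows. The interpretation is an assumption inferred from these two examples only: a 12-digit start time, the launching user, a numeric field whose meaning is not stated on the task, and the hostname with dots written as underscores. `parse_log_name` is a hypothetical helper.

```python
import re
from datetime import datetime

# Assumed layout: <YYYYmmddHHMM>_<user>_<number>_<host with underscores>.log
LOG_NAME = re.compile(
    r"(?P<ts>\d{12})_(?P<user>[^_]+)_(?P<num>\d+)_(?P<host>.+)\.log"
)


def parse_log_name(name):
    """Split a wmf-auto-reimage log file name into its assumed fields."""
    m = LOG_NAME.fullmatch(name)
    if m is None:
        raise ValueError(f"unrecognised log name: {name!r}")
    return {
        "started": datetime.strptime(m["ts"], "%Y%m%d%H%M"),
        "user": m["user"],
        "number": int(m["num"]),  # meaning not stated in the task
        "host": m["host"].replace("_", "."),  # assumes '_' stands for '.'
    }


info = parse_log_name("202108101850_cmjohnson_16170_ganeti1023_eqiad_wmnet.log")
print(info["started"], info["user"], info["host"])
```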

Completed auto-reimage of hosts:

['ganeti1024.eqiad.wmnet']

and were ALL successful.

Completed auto-reimage of hosts:

['ganeti1023.eqiad.wmnet']

and were ALL successful.

Cmjohnson updated the task description.

All tasks completed

Cookbook cookbooks.sre.hosts.reimage was started by jmm@cumin2002 for host ganeti1023.eqiad.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by jmm@cumin2002 for host ganeti1023.eqiad.wmnet with OS buster completed:

  • ganeti1023 (PASS)
    • Downtimed on Icinga
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201141117_jmm_2215208_ganeti1023.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by jmm@cumin2002 for host ganeti1024.eqiad.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by jmm@cumin2002 for host ganeti1024.eqiad.wmnet with OS buster completed:

  • ganeti1024 (PASS)
    • Downtimed on Icinga
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202201141222_jmm_2223552_ganeti1024.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Change 755346 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Make ganeti1023 a Ganeti node

https://gerrit.wikimedia.org/r/755346

Change 755346 merged by Muehlenhoff:

[operations/puppet@production] Make ganeti1023 a Ganeti node

https://gerrit.wikimedia.org/r/755346

Change 755440 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Make ganeti1024 a Ganeti node

https://gerrit.wikimedia.org/r/755440

Change 755440 merged by Muehlenhoff:

[operations/puppet@production] Make ganeti1024 a Ganeti node

https://gerrit.wikimedia.org/r/755440

Mentioned in SAL (#wikimedia-operations) [2022-01-20T11:49:50Z] <moritzm> add ganeti1024 to Ganeti eqiad cluster T283036

Mentioned in SAL (#wikimedia-operations) [2022-01-20T14:03:36Z] <moritzm> enabled hardware virtualisation in BIOS for ganeti1024 T283036

Mentioned in SAL (#wikimedia-operations) [2022-01-20T14:20:53Z] <moritzm> enabled hardware virtualisation in BIOS for ganeti1023 T283036
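
The SAL mentions above share a "[timestamp] <nick> message" shape. As a rough sketch, the hypothetical helper `parse_sal` below pulls out the UTC timestamp, the nick, the message, and any task IDs of the form T followed by digits. The pattern is inferred from the three entries on this task and is not an official SAL format definition.

```python
import re
from datetime import datetime, timezone

# Assumed shape: "... [<ISO timestamp>Z] <nick> message"
SAL_ENTRY = re.compile(r"\[(?P<ts>[^\]]+)\] <(?P<nick>[^>]+)> (?P<msg>.*)")


def parse_sal(entry):
    """Extract timestamp, nick, message and task IDs from one SAL mention."""
    m = SAL_ENTRY.search(entry)
    if m is None:
        raise ValueError(f"unrecognised SAL entry: {entry!r}")
    when = datetime.strptime(m["ts"], "%Y-%m-%dT%H:%M:%SZ")
    return {
        "when": when.replace(tzinfo=timezone.utc),
        "nick": m["nick"],
        "message": m["msg"],
        "tasks": re.findall(r"\bT\d+\b", m["msg"]),
    }


entry = (
    "Mentioned in SAL (#wikimedia-operations) [2022-01-20T11:49:50Z] "
    "<moritzm> add ganeti1024 to Ganeti eqiad cluster T283036"
)
parsed = parse_sal(entry)
print(parsed["nick"], parsed["tasks"])
```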