Q3:(Need By: TBD) rack/setup/install restbase2027
Closed, Resolved, Public

Description

This task will track the racking, setup, and OS installation of restbase2027.

Hostname / Racking / Installation Details

Hostnames: restbase2027
Racking Proposal: row d
Networking/Subnet/VLAN/IP: match other restbase, (1) 1g private1 vlan - use restbase networking config in netbox - 3 Cassandra instances
Partitioning/Raid: match other restbase (cassandrahosts-3ssd-jbod)
OS Distro: buster

Per host setup checklist

Each host should have its own setup checklist copied and pasted into the list below.

restbase2027: D5/U10 ge-5/0/9
  • receive in system on procurement task T300287 & in coupa
  • rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • network port setup via netbox, run homer from an active cumin host to commit
  • bios/drac/serial setup/testing, see Lifecycle Steps & Automatic BIOS setup details
  • firmware update (idrac, bios, network, raid controller)
  • operations/puppet update - this should include updates to netboot.pp, and site.pp role(insetup) or cp systems use role(insetup::nofirm).
  • OS installation & initial puppet run via sre.hosts.reimage cookbook.

Related Objects

Status: Resolved
Assigned: Papaul

Event Timeline

RobH mentioned this in Unknown Object (Task).
RobH added a parent task: Unknown Object (Task).
RobH unsubscribed.

Change 773651 had a related patch set uploaded (by Papaul; author: Papaul):

[operations/puppet@production] Add restbase2027 to site.pp and netboot.cfg

https://gerrit.wikimedia.org/r/773651

Change 773651 merged by Papaul:

[operations/puppet@production] Add restbase2027 to site.pp and netboot.cfg

https://gerrit.wikimedia.org/r/773651

Cookbook cookbooks.sre.hosts.reimage was started by pt1979@cumin2002 for host restbase2027.codfw.wmnet with OS buster

Getting the message below during install:

reuse-parts: Recipe device matching failed
ERROR: =dev=md0 matches zero devices

All devices:
=dev=sda
=dev=sdb
=dev=sdc

<Go Back>    <Continue>

Cookbook cookbooks.sre.hosts.reimage started by pt1979@cumin2002 for host restbase2027.codfw.wmnet with OS buster executed with errors:

  • restbase2027 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by pt1979@cumin2002 for host restbase2027.codfw.wmnet with OS buster

Papaul subscribed.

@hnowlan can you please check the partman recipe? When done, assign the task back to me.

Thanks

Cookbook cookbooks.sre.hosts.reimage started by pt1979@cumin2002 for host restbase2027.codfw.wmnet with OS buster executed with errors:

  • restbase2027 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • The reimage failed, see the cookbook logs for the details

For new hosts, it seems the reuse profile won't work as it expects an existing array. The non-reuse cassandrahosts-3ssd-jbod config is required.
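
The failure above can be pictured with a small sketch. It is only an illustration of the matching idea, not the reuse-parts implementation: the function name `unmatched_devices` and the device lists are assumptions, with the lists modelled on the installer dialog in this task (sda, sdb, sdc present, md0 expected by the reuse recipe).

```python
def unmatched_devices(recipe_devices, present_devices):
    """Return the devices a recipe expects that are not present on the host."""
    present = set(present_devices)
    return [dev for dev in recipe_devices if dev not in present]


# Devices the installer listed on the freshly racked host.
present = ["/dev/sda", "/dev/sdb", "/dev/sdc"]

# Hypothetical device lists: the reuse recipe expects an existing md array,
# while the non-reuse recipe only needs the raw disks.
reuse_recipe = ["/dev/md0", "/dev/sda", "/dev/sdb", "/dev/sdc"]
fresh_recipe = ["/dev/sda", "/dev/sdb", "/dev/sdc"]

missing_for_reuse = unmatched_devices(reuse_recipe, present)
missing_for_fresh = unmatched_devices(fresh_recipe, present)

print("reuse recipe unmatched:", missing_for_reuse)
print("non-reuse recipe unmatched:", missing_for_fresh)
```

On a new host no md array exists yet, so the reuse-style list leaves /dev/md0 unmatched, while the non-reuse list matches every disk.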

Change 779494 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/puppet@production] install_server: use non-reuse partition for new host

https://gerrit.wikimedia.org/r/779494

Change 779494 merged by Hnowlan:

[operations/puppet@production] install_server: use non-reuse partition for new host

https://gerrit.wikimedia.org/r/779494

@hnowlan
"use non-reuse partition for new host"
So if you want to re-image this host later down the road, you will have to change it again in netboot.cfg to use reuse-parts.cfg?

Yep - once the imaging is complete I'll change it to the reuse profile like the other existing hosts and it'll be ready to reimage for future upgrades.

Cookbook cookbooks.sre.hosts.reimage was started by pt1979@cumin2002 for host restbase2027.codfw.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by pt1979@cumin2002 for host restbase2027.codfw.wmnet with OS buster completed:

  • restbase2027 (PASS)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204121630_pt1979_2141565_restbase2027.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> staged
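
The cookbook reports in this task share one shape: a host line ending in (PASS) or (FAIL), followed by one line per step. A minimal sketch of reading that shape is below. It assumes the plain-text layout exactly as pasted in this task and is not part of the cookbook itself; `parse_report` is a hypothetical helper name.

```python
import re

# Host line as it appears in this task, e.g. "restbase2027 (FAIL)".
HOST_LINE = re.compile(r"^(\S+) \((PASS|FAIL)\)$")


def parse_report(text):
    """Return (host, status, steps) from a pasted reimage report."""
    host = status = None
    steps = []
    for raw in text.splitlines():
        # Drop indentation and the leading bullet character, if any.
        line = raw.strip().lstrip("•").strip()
        if not line:
            continue
        match = HOST_LINE.match(line)
        if match and host is None:
            host, status = match.group(1), match.group(2)
        elif host is not None:
            steps.append(line)
    return host, status, steps


sample = """\
  • restbase2027 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Host up (Debian installer)
    • The reimage failed, see the cookbook logs for the details
"""

host, status, steps = parse_report(sample)
print(host, status, "last step:", steps[-1])
```

The last step line is usually the useful one: in the failed runs it points at the cookbook logs, and in the successful run it records the Netbox status change.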
Papaul updated the task description.

@hnowlan complete