
(Need By: TBD) rack/setup/install ganeti202[56]
Closed, Resolved (Public)

Description

This task will track the racking, setup, and OS installation of ganeti202[56].
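The `ganeti202[56]` shorthand reads as a glob-style character class covering ganeti2025 and ganeti2026. A minimal sketch of that expansion follows; `expand_char_class` is a hypothetical helper written for illustration, not part of any WMF tooling, and the `.codfw.wmnet` suffix is taken from the reimage messages later in this task.

```python
import re

def expand_char_class(pattern):
    """Expand a single glob-style character class, e.g. 'host[12]' -> host1, host2.

    Hypothetical helper: handles exactly one [...] group of literal characters.
    Patterns without brackets are returned unchanged as a one-element list.
    """
    m = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", pattern)
    if m is None:
        return [pattern]
    prefix, chars, suffix = m.groups()
    return [f"{prefix}{c}{suffix}" for c in chars]

hosts = expand_char_class("ganeti202[56]")
fqdns = [f"{h}.codfw.wmnet" for h in hosts]
print(fqdns)
```

Note that range-oriented tools may read `[56]` as the number 56 rather than as the two digits 5 and 6, so the bracket form here is best treated as human shorthand.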

Hostname / Racking / Installation Details

Hostnames: ganeti202[56]
Racking Proposal: Ideally each host shares a rack and row with as few other Ganeti hosts as possible. Current Ganeti host breakdown in codfw: A5:4, B1:2, B5:2, C1:2, C5:2, C6:2, D1:1, D3:1, D5:1, D8:1. Avoid any rack that already has 2 or more hosts; ideally each new host shares a rack with no other Ganeti host, or with just 1. Ideally they end up in rows A and B if capacity allows.
Networking/Subnet/VLAN/IP: Single 1G connection for production, but with a more complex networking setup (see ganeti2024)
Partitioning/Raid: no hw raid, partman/custom/ganeti-raid5.cfg
OS Distro: Stretch
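The racking rule above can be checked mechanically. The following is a rough sketch, assuming the breakdown string lists every rack that currently holds a Ganeti host and that any unlisted rack holds none; the function names and the `limit` parameter are illustrative only.

```python
from collections import Counter

# Breakdown copied from the racking proposal (rack: number of Ganeti hosts).
BREAKDOWN = "A5:4, B1:2, B5:2, C1:2, C5:2, C6:2, D1:1, D3:1, D5:1, D8:1"

def parse_breakdown(text):
    """Turn 'A5:4, B1:2, ...' into {'A5': 4, 'B1': 2, ...}."""
    counts = {}
    for item in text.split(","):
        rack, count = item.strip().split(":")
        counts[rack] = int(count)
    return counts

def acceptable(rack, counts, limit=2):
    """A rack is acceptable if it holds fewer than `limit` Ganeti hosts.

    Racks missing from `counts` are assumed to hold no Ganeti host.
    """
    return counts.get(rack, 0) < limit

counts = parse_breakdown(BREAKDOWN)

# Hosts per row, using the first character of the rack name as the row.
per_row = Counter()
for rack, n in counts.items():
    per_row[rack[0]] += n

print("racks to avoid:", sorted(r for r, n in counts.items() if n >= 2))
print("hosts per row:", dict(per_row))
```

Under these assumptions, every listed rack in rows A and B is already at the limit, so a placement in those rows would need a rack that does not yet hold a Ganeti host.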

Per host setup checklist

Each host should have its own setup checklist copied and pasted into the list below.

ganeti2025: rack D1 U8 ge-1/0/20

  • - receive in system on procurement task T279174 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware update (idrac, bios, network, raid controller)
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and to site.pp with role(insetup); cp systems use role(insetup::nofirm) instead.
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

ganeti2026: rack D6 U7 ge-6/0/6

  • - receive in system on procurement task T279174 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware update (idrac, bios, network, raid controller)
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and to site.pp with role(insetup); cp systems use role(insetup::nofirm) instead.
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

Once the system(s) above have had all checkbox steps completed, this task can be resolved.

Event Timeline

RobH added a parent task: Unknown Object (Task). May 11 2021, 6:45 PM

@MoritzMuehlenhoff,

You approved the quote/spec for this, but we didn't get updated racking details on the procurement request T279174, so we'll need to confirm them here. I filled out the racking details with information generated from past experience on these, but I would appreciate a reality check on the details in the task description. Once they are all correct, please reassign this task to @Papaul.

Thanks!

RobH mentioned this in Unknown Object (Task). May 11 2021, 6:47 PM
RobH reassigned this task from Papaul to Jclark-ctr.
RobH reassigned this task from Jclark-ctr to Papaul.
RobH added a subscriber: MoritzMuehlenhoff.
RobH added a subscriber: Jclark-ctr.
RobH removed a subscriber: RobH.
RobH added a subscriber: RobH.
RobH removed a subscriber: RobH.
This comment was removed by Papaul.

Change 700678 had a related patch set uploaded (by Papaul; author: Papaul):

[operations/puppet@production] DHCP, Site.pp: Add ganeti202[56] to site.pp and it's MAC address

https://gerrit.wikimedia.org/r/700678

Change 700678 merged by Papaul:

[operations/puppet@production] DHCP, Site.pp: Add ganeti202[56] to site.pp and it's MAC address

https://gerrit.wikimedia.org/r/700678

Script wmf-auto-reimage was launched by pt1979 on cumin2002.codfw.wmnet for hosts:

ganeti2025.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202106212103_pt1979_2224285_ganeti2025_codfw_wmnet.log.

Completed auto-reimage of hosts:

['ganeti2025.codfw.wmnet']

Of which those FAILED:

['ganeti2025.codfw.wmnet']

Change 700720 had a related patch set uploaded (by Papaul; author: Papaul):

[operations/puppet@production] Add ganeti202[56] to partman

https://gerrit.wikimedia.org/r/700720

Change 700720 merged by Papaul:

[operations/puppet@production] Add ganeti202[56] to partman

https://gerrit.wikimedia.org/r/700720

Script wmf-auto-reimage was launched by pt1979 on cumin2002.codfw.wmnet for hosts:

ganeti2025.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202106220017_pt1979_2246312_ganeti2025_codfw_wmnet.log.

Completed auto-reimage of hosts:

['ganeti2025.codfw.wmnet']

and were ALL successful.

Script wmf-auto-reimage was launched by pt1979 on cumin2002.codfw.wmnet for hosts:

ganeti2026.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202106220046_pt1979_2250535_ganeti2026_codfw_wmnet.log.

Completed auto-reimage of hosts:

['ganeti2026.codfw.wmnet']

and were ALL successful.
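The three reimage runs above can be put in order from their log file names. This sketch assumes the names follow a `<YYYYMMDDHHMM>_<user>_<id>_<host with dots replaced by underscores>.log` layout, which is inferred only from the three paths in this task; the meaning of the numeric third field is not stated here, so it is kept as an opaque `run_id`.

```python
from datetime import datetime
from pathlib import PurePosixPath

def parse_log_name(path):
    """Split a wmf-auto-reimage log path into its apparent components.

    The field layout is an assumption inferred from the paths in this task.
    """
    stem = PurePosixPath(path).stem  # file name without the .log suffix
    stamp, user, run_id, *host_parts = stem.split("_")
    return {
        "started": datetime.strptime(stamp, "%Y%m%d%H%M"),
        "user": user,
        "run_id": run_id,
        "host": ".".join(host_parts),
    }

LOGS = [
    "/var/log/wmf-auto-reimage/202106212103_pt1979_2224285_ganeti2025_codfw_wmnet.log",
    "/var/log/wmf-auto-reimage/202106220017_pt1979_2246312_ganeti2025_codfw_wmnet.log",
    "/var/log/wmf-auto-reimage/202106220046_pt1979_2250535_ganeti2026_codfw_wmnet.log",
]

runs = [parse_log_name(p) for p in LOGS]
for run in runs:
    print(run["started"].isoformat(timespec="minutes"), run["host"])
```

Read this way, the failed first attempt on ganeti2025 precedes the partman change, and both successful runs follow it.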

Papaul updated the task description.

@MoritzMuehlenhoff this is ready for service.