(Need By: ASAP) rack/setup/install ms-be20[62-65]
Closed, Resolved (Public)

Description

This task will track the racking, setup, and OS installation of ms-be20[62-65].

Please note these were originally needed by the end of July, but a shortage of network card chipsets has resulted in a 60+ day lead time for these hosts. As soon as they arrive, they should be racked with priority, as @fgiunchedi will push them into service once they are online.

Hostname / Racking / Installation Details

Hostnames: ms-be20[62-65]
Racking Proposal: One host per row
Networking/Subnet/VLAN/IP: 10G private VLAN
Partitioning/Raid: Same as existing ms-be
OS Distro: Stretch
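
The bracket notation used for the hostnames above can be expanded mechanically. The following is a minimal sketch, not part of any WMF tooling: the helper name `expand_host_range` is hypothetical, and the `codfw.wmnet` domain is assumed from the reimage log entries further down this task.

```python
import re
from typing import List

# Matches patterns such as "ms-be20[62-65]": a prefix followed by a numeric range.
HOST_RANGE = re.compile(r"(?P<prefix>[A-Za-z0-9-]+)\[(?P<lo>\d+)-(?P<hi>\d+)\]")


def expand_host_range(pattern: str, domain: str = "codfw.wmnet") -> List[str]:
    """Expand 'prefix[lo-hi]' into a list of fully qualified hostnames.

    Hypothetical helper for illustration only; the default domain is an
    assumption taken from the reimage log lines in this task.
    """
    match = HOST_RANGE.fullmatch(pattern)
    if match is None:
        raise ValueError(f"unsupported host pattern: {pattern!r}")
    lo, hi = match["lo"], match["hi"]
    width = len(lo)  # keep zero padding if the range uses it
    return [
        f"{match['prefix']}{number:0{width}d}.{domain}"
        for number in range(int(lo), int(hi) + 1)
    ]


hosts = expand_host_range("ms-be20[62-65]")
for host in hosts:
    print(host)
```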

Per host setup checklist

Each host should have its own setup checklist copied and pasted into the list below.

ms-be2062: Rack A3 xe-4/0/2

  • - receive in system on procurement task T284952 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware update (idrac, bios, network, raid controller)
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and site.pp role(insetup) or cp systems use role(insetup::nofirm).
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

ms-be2063: Rack B4 xe-4/0/4

  • - receive in system on procurement task T284952 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware update (idrac, bios, network, raid controller)
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and site.pp role(insetup) or cp systems use role(insetup::nofirm).
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

ms-be2064: Rack C4 xe-4/0/4

  • - receive in system on procurement task T284952 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware update (idrac, bios, network, raid controller)
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and site.pp role(insetup) or cp systems use role(insetup::nofirm).
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

ms-be2065: Rack D2 xe-2/0/2

  • - receive in system on procurement task T284952 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware update (idrac, bios, network, raid controller)
  • [x] - operations/puppet update - this should include updates to install_server dhcp and netboot, and site.pp role(insetup) or cp systems use role(insetup::nofirm).
  • [x] - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

Once the system(s) above have had all checkbox steps completed, this task can be resolved.
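
The "one host per row" racking proposal can be checked against the rack assignments listed in the checklists above. This is a rough sketch only: it assumes the row is the leading letter of the rack name (A3 is row A), and the `RACKING` mapping and `hosts_per_row` helper are illustrative names, not Netbox data structures.

```python
from collections import Counter
from typing import Dict, Tuple

# Rack and switch port per host, copied from the checklist headings above.
RACKING: Dict[str, Tuple[str, str]] = {
    "ms-be2062": ("A3", "xe-4/0/2"),
    "ms-be2063": ("B4", "xe-4/0/4"),
    "ms-be2064": ("C4", "xe-4/0/4"),
    "ms-be2065": ("D2", "xe-2/0/2"),
}


def hosts_per_row(racking: Dict[str, Tuple[str, str]]) -> Counter:
    """Count hosts per row, assuming the row is the rack name's first letter."""
    return Counter(rack[0] for rack, _port in racking.values())


rows = hosts_per_row(RACKING)
# The proposal holds if no row contains more than one of the new hosts.
one_host_per_row = all(count == 1 for count in rows.values())
print(one_host_per_row)
```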

Event Timeline

RobH added a parent task: Unknown Object (Task).
RobH mentioned this in Unknown Object (Task). Jun 29 2021, 9:02 PM

Change 710352 had a related patch set uploaded (by Papaul; author: Papaul):

[operations/puppet@production] Add ms-be206[2345] to DHCP file and site.pp

https://gerrit.wikimedia.org/r/710352

Change 710352 merged by Papaul:

[operations/puppet@production] Add ms-be206[2345] to DHCP file and site.pp

https://gerrit.wikimedia.org/r/710352

Script wmf-auto-reimage was launched by pt1979 on cumin2002.codfw.wmnet for hosts:

ms-be2062.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202108052031_pt1979_1089453_ms-be2062_codfw_wmnet.log.

Change 710362 had a related patch set uploaded (by Papaul; author: Papaul):

[operations/puppet@production] Remove role insetup for ms-be206[2345] since there is already a role for ms-be* nodes

https://gerrit.wikimedia.org/r/710362

Change 710362 merged by Papaul:

[operations/puppet@production] Remove role insetup for ms-be206[2345] from site.pp

https://gerrit.wikimedia.org/r/710362

Completed auto-reimage of hosts:

['ms-be2062.codfw.wmnet']

Of which those FAILED:

['ms-be2062.codfw.wmnet']

Script wmf-auto-reimage was launched by pt1979 on cumin2002.codfw.wmnet for hosts:

ms-be2062.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202108052203_pt1979_1100973_ms-be2062_codfw_wmnet.log.

Completed auto-reimage of hosts:

['ms-be2062.codfw.wmnet']

Of which those FAILED:

['ms-be2062.codfw.wmnet']

Script wmf-auto-reimage was launched by pt1979 on cumin2002.codfw.wmnet for hosts:

ms-be2062.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202108052241_pt1979_1106515_ms-be2062_codfw_wmnet.log.

Completed auto-reimage of hosts:

['ms-be2062.codfw.wmnet']

and were ALL successful.

Script wmf-auto-reimage was launched by pt1979 on cumin2002.codfw.wmnet for hosts:

ms-be2063.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202108052321_pt1979_1111831_ms-be2063_codfw_wmnet.log.

Completed auto-reimage of hosts:

['ms-be2063.codfw.wmnet']

and were ALL successful.

Script wmf-auto-reimage was launched by pt1979 on cumin2002.codfw.wmnet for hosts:

ms-be2064.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202108052356_pt1979_1116791_ms-be2064_codfw_wmnet.log.

Completed auto-reimage of hosts:

['ms-be2064.codfw.wmnet']

and were ALL successful.

Script wmf-auto-reimage was launched by pt1979 on cumin2002.codfw.wmnet for hosts:

ms-be2065.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202108060043_pt1979_1124119_ms-be2065_codfw_wmnet.log.

Completed auto-reimage of hosts:

['ms-be2065.codfw.wmnet']

and were ALL successful.
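
The reimage attempts logged above can be tallied from the log file names alone. The sketch below assumes the file name encodes a start timestamp (YYYYMMDDHHMM), the launching user, a numeric run identifier (possibly a process ID), and the host name with dots replaced by underscores. That layout is inferred from the paths in this task, not from wmf-auto-reimage documentation, and `parse_log_path` is a hypothetical helper.

```python
import re
from collections import Counter
from datetime import datetime
from pathlib import PurePosixPath

# Assumed layout: <YYYYMMDDHHMM>_<user>_<numeric id>_<host with underscores>.log
LOG_NAME = re.compile(
    r"(?P<started>\d{12})_(?P<user>[^_]+)_(?P<run_id>\d+)_(?P<host>.+)\.log"
)

# Log paths copied verbatim from the timeline above.
LOG_PATHS = [
    "/var/log/wmf-auto-reimage/202108052031_pt1979_1089453_ms-be2062_codfw_wmnet.log",
    "/var/log/wmf-auto-reimage/202108052203_pt1979_1100973_ms-be2062_codfw_wmnet.log",
    "/var/log/wmf-auto-reimage/202108052241_pt1979_1106515_ms-be2062_codfw_wmnet.log",
    "/var/log/wmf-auto-reimage/202108052321_pt1979_1111831_ms-be2063_codfw_wmnet.log",
    "/var/log/wmf-auto-reimage/202108052356_pt1979_1116791_ms-be2064_codfw_wmnet.log",
    "/var/log/wmf-auto-reimage/202108060043_pt1979_1124119_ms-be2065_codfw_wmnet.log",
]


def parse_log_path(path: str) -> dict:
    """Split a reimage log path into its assumed components."""
    match = LOG_NAME.fullmatch(PurePosixPath(path).name)
    if match is None:
        raise ValueError(f"unexpected log file name: {path!r}")
    return {
        "started": datetime.strptime(match["started"], "%Y%m%d%H%M"),
        "user": match["user"],
        "run_id": int(match["run_id"]),
        "host": match["host"].replace("_", "."),
    }


# Number of reimage runs launched per host.
attempts = Counter(parse_log_path(path)["host"] for path in LOG_PATHS)
for host, count in sorted(attempts.items()):
    print(host, count)
```

Counting this way shows three runs for ms-be2062 (two failed, one succeeded) and a single run for each of the other hosts, which matches the timeline.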

Papaul updated the task description.

@fgiunchedi this is complete

@Papaul thank you so much for the speedy action on this!