(Need By: TBD) rack/setup/install drmrs non-cp-hosts
Closed, Resolved, Public

Description

This task will track the racking, setup, and OS installation of 9 hosts in drmrs, all using the same hardware configuration for differing roles:

  • dns600[12]
  • ganeti600[1234]
  • lvs600[123]
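
The bracket notation above is shorthand for individual hostnames. As a minimal sketch (the helper name `expand_hosts` is hypothetical and not part of any tooling mentioned in this task), the three patterns can be expanded and counted like this:

```python
import re


def expand_hosts(pattern: str) -> list[str]:
    """Expand one bracket group, e.g. 'dns600[12]' -> ['dns6001', 'dns6002']."""
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", pattern)
    if match is None:
        # No bracket group: treat the pattern as a literal hostname.
        return [pattern]
    prefix, choices, suffix = match.groups()
    # Each character inside the brackets yields one hostname.
    return [f"{prefix}{choice}{suffix}" for choice in choices]


# Patterns copied from the task description.
PATTERNS = ["dns600[12]", "ganeti600[1234]", "lvs600[123]"]
HOSTS = [host for pattern in PATTERNS for host in expand_hosts(pattern)]

print(len(HOSTS))
print(HOSTS)
```

The count should match the 9 hosts stated in the description.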

Hostname / Racking / Installation Details

Use the master hardware tracking Google Sheet for drmrs, which lists the racking elevation for each host. (It is shared directly with those who need it.)

Networking Details:
All hosts except lvs get a single production network connection.
lvs hosts need a primary connection to the switch in their own rack and a secondary connection to the switch in the adjacent rack.
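
The connection rule can be sketched as a small lookup. This is an illustration only: the function name `planned_links` and the link labels are assumptions, and the real port assignments live in netbox and the tracking sheet.

```python
def planned_links(hostname: str) -> list[str]:
    """Return the network connections a host should get under the rule above."""
    if hostname.startswith("lvs"):
        # lvs hosts: primary to the switch in their own rack,
        # secondary to the switch in the adjacent rack.
        return ["primary: switch in own rack", "secondary: switch in adjacent rack"]
    # Every other host gets a single production connection.
    return ["primary: production network"]


for name in ("dns6001", "ganeti6003", "lvs6002"):
    print(name, planned_links(name))
```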

Per host setup checklist

Each host should have its own setup checklist copied and pasted into the list below.

dns6001:

  • - receive in system on procurement task T281501 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware update (idrac & bios updated, network already newest)
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and to site.pp with role(insetup); cp systems use role(insetup::nofirm) instead.
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

dns6002:

  • - receive in system on procurement task T281501 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware update (idrac & bios updated, network already newest)
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and to site.pp with role(insetup); cp systems use role(insetup::nofirm) instead.
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

ganeti6001:

  • - receive in system on procurement task T281501 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware update (idrac & bios updated, network already newest)
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and to site.pp with role(insetup); cp systems use role(insetup::nofirm) instead.
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host

ganeti6002:

  • - receive in system on procurement task T281501 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware update (idrac & bios updated, network already newest)
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and to site.pp with role(insetup); cp systems use role(insetup::nofirm) instead.
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host

ganeti6003:

  • - receive in system on procurement task T281501 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware update (idrac & bios updated, network already newest)
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and to site.pp with role(insetup); cp systems use role(insetup::nofirm) instead.
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host

ganeti6004:

  • - receive in system on procurement task T281501 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware update (idrac & bios updated, network already newest)
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and to site.pp with role(insetup); cp systems use role(insetup::nofirm) instead.
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host

lvs6001:

  • - receive in system on procurement task T281501 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware update (idrac & bios updated, network already newest)
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and to site.pp with role(insetup); cp systems use role(insetup::nofirm) instead.
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host

lvs6002:

  • - receive in system on procurement task T281501 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware update (idrac & bios updated, network already newest)
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and to site.pp with role(insetup); cp systems use role(insetup::nofirm) instead.
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host

lvs6003:

  • - receive in system on procurement task T281501 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware update (idrac & bios updated, network already newest)
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and to site.pp with role(insetup); cp systems use role(insetup::nofirm) instead.
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host

Once the system(s) above have had all checkbox steps completed, this task can be resolved.

Related Objects

Status: Resolved | Assigned: RobH

Event Timeline

RobH added a parent task: Unknown Object (Task). Jul 12 2021, 6:46 PM
RobH moved this task from Backlog to Racking Tasks on the ops-drmrs board.
RobH added a subscriber: MMandere.

@MMandere is handling these installations, so I'm reassigning this to him.

Once these hosts are installed and calling into puppet, please update and resolve this task.

RobH changed the task status from Open to In Progress. Dec 3 2021, 6:41 PM
RobH claimed this task.

stealing to update firmware on these hosts, with the following notes:

  • dns600[12] are the only hosts in service. Ensure updates happen to these one at a time, with a clean shutdown of the OS and a full return to service before working on the next dns host.
  • ganeti600[1234] have the OS installed but are not in service
  • bast6001 has not been imaged under that name; it is currently installed as ganeti6004
  • lvs600[123] staged, not in service.
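
The one-at-a-time rule for the in-service dns hosts amounts to a rolling update that stops as soon as a host fails to come back. A minimal sketch, assuming the caller supplies the three actions (`shutdown`, `update`, and `is_healthy` are hypothetical callables, not real cookbook or tool names):

```python
from typing import Callable, Iterable


def rolling_update(
    hosts: Iterable[str],
    shutdown: Callable[[str], None],
    update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> list[str]:
    """Update hosts strictly one at a time, halting on the first unhealthy host."""
    completed = []
    for host in hosts:
        shutdown(host)  # clean OS shutdown before touching firmware
        update(host)    # firmware update for this host only
        if not is_healthy(host):
            # Do not move on until the host has fully returned to service.
            raise RuntimeError(f"{host} did not return to service; stopping")
        completed.append(host)
    return completed


# Dry run with stand-in actions that only record what would happen.
log = []
completed = rolling_update(
    ["dns6002", "dns6001"],
    shutdown=lambda host: log.append(("shutdown", host)),
    update=lambda host: log.append(("update", host)),
    is_healthy=lambda host: True,
)
print(completed)
```

Raising on the first failed health check keeps the second dns host untouched, so at least one remains in service while the problem is investigated.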

Mentioned in SAL (#wikimedia-sre) [2021-12-13T14:54:47Z] <robh> dns6002 rebooting for firmware updates via T286507

Mentioned in SAL (#wikimedia-sre) [2021-12-13T15:15:28Z] <robh> dns6002 bios update done, returned to green in icinga, dns6001 coming down next for firmware update via T286507

Mentioned in SAL (#wikimedia-sre) [2021-12-13T15:25:06Z] <robh> dns6001 returned to service (icinga checks going green) via T286507

RobH updated the task description.