Q2:rack/setup/install cephosd100[1-5]
Closed, Resolved, Public

Description

This task will track the racking, setup, and OS installation of cephosd100[1-5]

Hostname / Racking / Installation Details

Hostnames: cephosd100[1-5]
Racking Proposal: 3 in eqiad row E, 2 in eqiad row F
Networking Setup: 1 connection, Speed: 10G. Although the NIC and switch are capable of either 10 Gbps or 25 Gbps, we have decided to limit this to 10 Gbps by some means to begin with. Vlan: Private, AAAA records: Y
Partitioning/Raid: HW Raid: N. A custom partman recipe will be created for these hosts, due to their complex storage configuration.
OS Distro: Bullseye
Sub-team Technical Contact: Ben Tullis - Data Engineering
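
For anyone scripting against this task, the bracketed hostname range above can be expanded into individual hostnames. The following is a minimal sketch in plain Python; `expand_hosts` is a hypothetical helper written for this illustration, not part of any existing tooling, and it only handles a single `[start-end]` numeric range.

```python
import re


def expand_hosts(pattern: str) -> list[str]:
    """Expand one numeric range such as 'cephosd100[1-5]' into hostnames.

    Hypothetical helper for illustration only; patterns without a
    bracketed range are returned unchanged as a one-element list.
    """
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", pattern)
    if match is None:
        return [pattern]
    prefix, start, end, suffix = match.groups()
    width = len(start)  # keep any zero padding used in the range
    return [
        f"{prefix}{number:0{width}d}{suffix}"
        for number in range(int(start), int(end) + 1)
    ]


hosts = expand_hosts("cephosd100[1-5]")
print(hosts)
```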

In T311869#8379710, @BTullis wrote:

When you make the racking/setup ticket from this, could you please bear in mind that I'll need to do some work on the partman recipe before the hosts get imaged for the first time?

Since we're using the HBA instead of a RAID card and we have so many different storage devices installed, I'd like to log into the iDRAC and check it over to see in what order everything has been discovered before writing the recipe.
Perhaps if you could get them to the point of being cabled, the management interfaces configured, firmware updated etc. and then let me know; I'll then be able to check out one of them in detail and write the custom partman recipe for them.

Per host setup checklist

cephosd1001:
  • - receive in system on procurement task <enter task # here> & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer from an active cumin host to commit
  • - bios/drac/serial setup/testing, see Lifecycle Steps & Automatic BIOS setup details
  • - firmware update (idrac, bios, network, raid controller)
  • - do not image, merely ensure imaging will run (network works, bios works, etc) and handoff to @BTullis to complete installation
cephosd1002:
  • - receive in system on procurement task <enter task # here> & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer from an active cumin host to commit
  • - bios/drac/serial setup/testing, see Lifecycle Steps & Automatic BIOS setup details
  • - firmware update (idrac, bios, network, raid controller)
  • - do not image, merely ensure imaging will run (network works, bios works, etc) and handoff to @BTullis to complete installation
cephosd1003:
  • - receive in system on procurement task <enter task # here> & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer from an active cumin host to commit
  • - bios/drac/serial setup/testing, see Lifecycle Steps & Automatic BIOS setup details
  • - firmware update (idrac, bios, network, raid controller)
  • - do not image, merely ensure imaging will run (network works, bios works, etc) and handoff to @BTullis to complete installation
cephosd1004:
  • - receive in system on procurement task <enter task # here> & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer from an active cumin host to commit
  • - bios/drac/serial setup/testing, see Lifecycle Steps & Automatic BIOS setup details
  • - firmware update (idrac, bios, network, raid controller)
  • - do not image, merely ensure imaging will run (network works, bios works, etc) and handoff to @BTullis to complete installation
cephosd1005:
  • - receive in system on procurement task <enter task # here> & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer from an active cumin host to commit
  • - bios/drac/serial setup/testing, see Lifecycle Steps & Automatic BIOS setup details
  • - firmware update (idrac, bios, network, raid controller)
  • - do not image, merely ensure imaging will run (network works, bios works, etc) and handoff to @BTullis to complete installation
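
The checklist above is the same ordered sequence for every host, ending in a handoff rather than an image. As a rough sketch of how that ordering could be tracked, the snippet below models the steps as short keys and reports the next outstanding one. The step keys and the `next_step` function are hypothetical names chosen for this illustration; they do not correspond to any existing cookbook or tool.

```python
from typing import Optional

# Hypothetical short keys for the seven checklist items, in checklist order.
STEPS = (
    "receive",   # receive in system on procurement task & in coupa
    "rack",      # rack system & update netbox
    "dns",       # add mgmt and production dns entries
    "network",   # network port setup via netbox
    "bios",      # bios/drac/serial setup and testing
    "firmware",  # firmware update
    "handoff",   # do not image; hand off to complete installation
)


def next_step(done: set[str]) -> Optional[str]:
    """Return the first checklist step not yet completed, or None if all are done."""
    for step in STEPS:
        if step not in done:
            return step
    return None


# Example: a host that has been received and racked still needs DNS entries.
print(next_step({"receive", "rack"}))
```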

Related Objects

Status     Subtype  Assigned   Task
Duplicate           None
Duplicate           None
Resolved            EChetty
Resolved            Cmjohnson

Event Timeline

RobH added a parent task: Unknown Object (Task).
RobH moved this task from Backlog to Racking Tasks on the ops-eqiad board.
RobH mentioned this in Unknown Object (Task).

cephosd1001 E1 U3 Port 3 Cableid# 20220225
cephosd1002 E2 U3 Port 3 Cableid# 20220237
cephosd1003 E3 U3 Port 3 Cableid# 20220238
cephosd1004 F1 U3 Port 3 Cableid# 20220236
cephosd1005 F2 U3 Port 3 Cableid# 20220235
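
The racking lines above can be checked against the racking proposal with a short script. This is a sketch only: it assumes that a location such as `E1` means row E, rack 1, and that `U3` is the rack unit, which is an interpretation of the comment rather than something the task states. `RackEntry` and `parse_rack_line` are hypothetical names introduced for this illustration.

```python
import re
from collections import Counter
from typing import NamedTuple


class RackEntry(NamedTuple):
    hostname: str
    row: str       # assumed: the letter in "E1" is the row
    rack: int      # assumed: the digit in "E1" is the rack number
    unit: int
    port: int
    cable_id: str


LINE_RE = re.compile(r"(\S+) ([A-Z])(\d+) U(\d+) Port (\d+) Cableid# (\d+)")


def parse_rack_line(line: str) -> RackEntry:
    """Parse one 'host location unit port cable' line from the comment above."""
    match = LINE_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unrecognised racking line: {line!r}")
    host, row, rack, unit, port, cable = match.groups()
    return RackEntry(host, row, int(rack), int(unit), int(port), cable)


RACKING = """
cephosd1001 E1 U3 Port 3 Cableid# 20220225
cephosd1002 E2 U3 Port 3 Cableid# 20220237
cephosd1003 E3 U3 Port 3 Cableid# 20220238
cephosd1004 F1 U3 Port 3 Cableid# 20220236
cephosd1005 F2 U3 Port 3 Cableid# 20220235
"""

entries = [parse_rack_line(line) for line in RACKING.strip().splitlines()]
per_row = Counter(entry.row for entry in entries)
print(dict(per_row))
```

Under those assumptions, the per-row count is three hosts in row E and two in row F, which agrees with the racking proposal in the description.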

Installed the 6.4 TB NVMe drives into the servers.

Change 865186 had a related patch set uploaded (by Cmjohnson; author: Cmjohnson):

[operations/puppet@production] Adding cephosd servers to site.pp insetup role

https://gerrit.wikimedia.org/r/865186

Change 865186 merged by Cmjohnson:

[operations/puppet@production] Adding cephosd servers to site.pp insetup role

https://gerrit.wikimedia.org/r/865186

Cmjohnson updated the task description.

@BTullis, these servers are ready for you to image. The BIOS, network, and firmware have been updated. I updated site.pp as well.