(Need by Aug 1) rack/setup/install dumpsdata1003.eqiad.wmnet
Closed, ResolvedPublic

Description

This task will track the racking/setup/installation of dumpsdata1003.eqiad.wmnet. This is an expansion of the service cluster, so no servers will be decommissioned as a result of this deployment.

Hostnames: dumpsdata1003
Racking Proposal: Avoid sharing a rack/row with the existing hosts, so place in row A or row C (as dumpsdata1001 is in B8 and dumpsdata1002 is in D1).
Networking/Subnet/IP: internal private subnet for the row, no 10G required at this time.
Partitioning/Raid: Setup identical to existing dumpsdata systems, OS on dual small disks and large raid10 array for the rest.
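
The racking constraint above can be sketched as a small check. This is only an illustration: the helper name `candidate_rows` is hypothetical, and the list of rows (A through D) is an assumption; only the two existing rack locations come from this task.

```python
# Existing hosts and their racks, as stated in the racking proposal.
EXISTING = {"dumpsdata1001": "B8", "dumpsdata1002": "D1"}

# Assumption: the site has rows A-D; adjust to the real layout.
ALL_ROWS = ["A", "B", "C", "D"]


def candidate_rows(existing, all_rows):
    """Return the rows that do not already hold one of the existing hosts."""
    # The row is the leading letter of the rack name (e.g. "B8" -> "B").
    used = {rack[0] for rack in existing.values()}
    return [row for row in all_rows if row not in used]


print(candidate_rows(EXISTING, ALL_ROWS))  # ['A', 'C']
```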

dumpsdata1003:

  • - receive in system on procurement task T226540
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned) - seems this is already in netbox on https://netbox.wikimedia.org/dcim/devices/2255/ but needs the racking location updated when it is racked.
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged
  • - handoff for service implementation
  • - service implementer changes from 'staged' status to 'active' status in netbox
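
The checklist moves the host through three netbox states: planned, then staged, then active. A minimal sketch of that ordering follows, assuming the states are strictly sequential. The names `LIFECYCLE`, `next_state`, and `can_transition` are hypothetical and are not part of the netbox API.

```python
# Assumed order of the netbox states mentioned in the checklist.
LIFECYCLE = ["planned", "staged", "active"]


def next_state(current):
    """Return the state that follows `current`, or None if it is the last one."""
    idx = LIFECYCLE.index(current)  # raises ValueError for unknown states
    return LIFECYCLE[idx + 1] if idx + 1 < len(LIFECYCLE) else None


def can_transition(current, target):
    """Allow only a single step forward, e.g. staged -> active."""
    return next_state(current) == target


print(next_state("planned"))                 # staged
print(can_transition("staged", "active"))    # True
print(can_transition("planned", "active"))   # False
```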

Event Timeline

RobH triaged this task as Medium priority.Sep 27 2019, 5:45 PM
RobH created this task.
RobH added a parent task: Unknown Object (Task).Sep 27 2019, 5:45 PM
RobH moved this task from Backlog to Racking Tasks on the ops-eqiad board.
RobH updated the task description.

I'd like to request that both eth interfaces be cabled, as I'd like to try to set up bonding for this host.

Racked and cabled the host; updated netbox.

As for the base install, I'd like buster on it, and just one interface active. It's more important to get this done soon (and with buster) than to have bonding worked out. The raid setup should look like the raid on the existing dumpsdata servers.

wiki_willy renamed this task from rack/setup/install dumpsdata1003.eqiad.wmnet to (Need by Aug 1) rack/setup/install dumpsdata1003.eqiad.wmnet.Oct 17 2019, 10:54 PM

Updating the Need by Date in the subject line, based on the procurement task. @Cmjohnson - can you provide an ETA on when this can be completed? Thanks, Willy

Change 545268 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Adding mgmt dns for dumpsdata1003

https://gerrit.wikimedia.org/r/545268

Change 545268 merged by Cmjohnson:
[operations/dns@master] Adding mgmt dns for dumpsdata1003

https://gerrit.wikimedia.org/r/545268

Change 545823 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Adding production dns for dumpsdata1003

https://gerrit.wikimedia.org/r/545823

Change 545823 merged by Cmjohnson:
[operations/dns@master] Adding production dns for dumpsdata1003

https://gerrit.wikimedia.org/r/545823

Cmjohnson subscribed.

@ArielGlenn the onsite work has been completed, and I did add the production dns.

Awesome. Do you need instructions for the raid setup or is that already taken care of?

Reassigning to @Cmjohnson for Ariel's RAID question

Why was this assigned to me (which I didn't even notice)? Doesn't it get handed off after role::spare is put on the box, somewhere around "handoff for service implementation"?

Change 548840 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/puppet@production] Adding dhcp file and netboot cfg file for dumpsdata1003

https://gerrit.wikimedia.org/r/548840

Change 548840 merged by Cmjohnson:
[operations/puppet@production] Adding dhcp file and netboot cfg file for dumpsdata1003

https://gerrit.wikimedia.org/r/548840

Getting this error during initial OS installation:

  Network autoconfiguration failed

  Your network is probably not using the DHCP protocol. Alternatively, the DHCP server may be slow or some network hardware is not working properly.

@Cmjohnson: According to the DHCP logs on install1002, the server correctly assigned the IP address, but I suspect the error is caused by the OS here; it's currently configured for jessie, which probably doesn't support the new hardware. These hosts are intended to run buster anyway, so please adjust the config to use buster and retry: https://phabricator.wikimedia.org/T224563#5636773

Cmjohnson removed a project: ops-eqiad.

@ArielGlenn All yours! If you have any issues please add the ops-eqiad tag back. Thanks!

I wonder why this is still open. Whoops! The host has been doing its job for quite some time...