(Need By: TBD) setup/install deploy1002
Closed, Resolved | Public

Description

This task will track the racking, setup, and OS installation of deploy1002. This host was on the budget for this year, but was instead allocated from the spare pool.

Hostname / Racking / Installation Details

This is already racked, so it will only need minimal updates from on-site.

Per host setup checklist

Each host should have its own setup checklist copied and pasted into the list below.

deploy1002/wmf5177:

  • - ops-eqiad apply hostname labels to front/back of host
  • - bios/drac/serial setup/testing/firmware updates (this host still had the old ilom user info and outdated firmware)
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run
  • - host state in netbox set to staged

Once the system(s) above have had all checkbox steps completed, this task can be resolved.
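
The resolution rule above can be sketched as a small data model. This is only an illustration: the step names are paraphrased from the checklist, while `HostChecklist` and `can_resolve` are hypothetical names and not part of any existing tooling.

```python
from dataclasses import dataclass, field

# Step names paraphrased from the checklist above; the model itself is hypothetical.
STEPS = [
    "hostname labels applied",
    "bios/drac/serial setup and firmware updates",
    "mgmt dns entries added",
    "network port setup",
    "production dns entries added",
    "operations/puppet update",
    "OS installation",
    "puppet accept/initial run",
    "host state in netbox set to staged",
]


@dataclass
class HostChecklist:
    hostname: str
    done: set = field(default_factory=set)

    def complete(self, step: str) -> None:
        """Mark one checklist step as done, rejecting unknown step names."""
        if step not in STEPS:
            raise ValueError(f"unknown step: {step}")
        self.done.add(step)

    def pending(self) -> list:
        """Return the steps still open, in checklist order."""
        return [s for s in STEPS if s not in self.done]


def can_resolve(checklists) -> bool:
    """The task can be resolved only when no host has a pending step."""
    return all(not c.pending() for c in checklists)


# Example: everything is done except the on-site hostname labels.
host = HostChecklist("deploy1002")
for step in STEPS[1:]:
    host.complete(step)

print(host.pending())
print(can_resolve([host]))
```

In this sketch a single open step on any host keeps the task unresolved, which matches the wording of the rule.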

Related Objects

Status    Subtype           Assigned                Task
Resolved  -                 None
Resolved  -                 Jdforrester-WMF
Resolved  -                 Jdforrester-WMF
Resolved  -                 Jdforrester-WMF
Resolved  -                 Jdforrester-WMF
Resolved  -                 toan
Resolved  -                 Lucas_Werkmeister_WMDE
Resolved  -                 Joe
Resolved  -                 Jdforrester-WMF
Resolved  -                 Ladsgroup
Invalid   -                 None
Resolved  -                 Reedy
Open      -                 None
Resolved  -                 tstarling
Resolved  -                 Jdforrester-WMF
Stalled   -                 None
Resolved  -                 None
Resolved  PRODUCTION ERROR  Legoktm
Resolved  -                 tstarling
Resolved  -                 Joe
Resolved  -                 Krinkle
Resolved  -                 hashar
Resolved  -                 Jdforrester-WMF
Resolved  -                 Dzahn
Resolved  -                 Dzahn
Resolved  -                 Cmjohnson

Event Timeline

RobH added a parent task: Unknown Object (Task).

Change 634333 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] deploy1002 mac info

https://gerrit.wikimedia.org/r/634333

RobH moved this task from Backlog to Racking Tasks on the ops-eqiad board.
RobH added a subscriber: Cmjohnson.

setup notes:

  • this had the old idrac login info, perhaps due to it being 'inventory' in netbox and not included in ilom user updates months ago?
  • old idrac firmware 3.30.30.30, updated to 4.22.55.53
  • bios firmware was 1.7.0, updated to 2.8.2
  • updated puppet lease file with mac address
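
As a rough illustration of the firmware check in the notes above, dotted version strings can be compared numerically rather than as text. This is a minimal sketch: `parse_version` and `needs_update` are hypothetical helpers, and treating the "updated to" versions as the target is an assumption made for the example.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '3.30.30.30' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def needs_update(installed: str, target: str) -> bool:
    """True when the installed version is numerically older than the target."""
    return parse_version(installed) < parse_version(target)


# Versions taken from the setup notes; the targets are the versions updated to.
print(needs_update("3.30.30.30", "4.22.55.53"))  # idrac firmware
print(needs_update("1.7.0", "2.8.2"))            # bios firmware
```

Comparing tuples of integers avoids the usual pitfall of plain string comparison, where "1.10.0" would sort before "1.9.0".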

pending question before imaging script is run:

Please note even after imaging is done, this task will need to be reassigned to @Cmjohnson so he can apply hostname labels.

@RobH Please leave it on stretch for now. It relies on Mediawiki classes that are not ready for buster just yet.

Change 634333 merged by RobH:
[operations/puppet@production] deploy1002 mac info

https://gerrit.wikimedia.org/r/634333

Change 634583 had a related patch set uploaded (by RobH; owner: RobH):
[operations/dns@master] deploy1002 prod dns

https://gerrit.wikimedia.org/r/634583

Change 634583 merged by RobH:
[operations/dns@master] deploy1002 prod dns

https://gerrit.wikimedia.org/r/634583

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

['deploy1002.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202010161924_robh_31297.log.

Completed auto-reimage of hosts:

['deploy1002.eqiad.wmnet']

Of which those FAILED:

['deploy1002.eqiad.wmnet']

Change 634598 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] adding deploy1002 to site.pp

https://gerrit.wikimedia.org/r/634598

Change 634598 merged by RobH:
[operations/puppet@production] adding deploy1002 to site.pp

https://gerrit.wikimedia.org/r/634598

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

['deploy1002.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202010162012_robh_7688.log.

Completed auto-reimage of hosts:

['deploy1002.eqiad.wmnet']

Of which those FAILED:

['deploy1002.eqiad.wmnet']

The reimage fails because the initial puppet run fails. Not sure if we should apply a different role, or if you want to take over and reimage from here.

Please note that when the imaging is done, this needs to be reassigned to @Cmjohnson to apply the hostname label.

Volans raised the priority of this task from Medium to High.
Volans subscribed.

The IPs were allocated manually outside of Netbox, so Netbox could allocate them to a different host in any upcoming provisioning, causing conflicts.

Please always follow the procedure for provisioning as described in the wikitech pages below (all interlinked):

In this specific case I think that if you manage to run the reimage script successfully, it will fix Netbox too, but I will need to double-check the end result to make sure it's all correct.
In any case the data must be fixed ASAP and before any new provisioning in eqiad happens.
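
To illustrate why addresses assigned outside the source of truth are risky, here is a toy allocator. It is only a sketch under stated assumptions: the `Ipam` class, its methods, and the host names are hypothetical, the prefix is a documentation range, and nothing here reflects Netbox's actual API or the real eqiad addressing.

```python
import ipaddress


class Ipam:
    """Toy address registry: hands out the first address it has no record of."""

    def __init__(self, prefix: str):
        self.network = ipaddress.ip_network(prefix)
        self.assigned = {}  # address -> hostname

    def allocate(self, hostname: str):
        """Assign the first address that is not recorded as in use."""
        for addr in self.network.hosts():
            if addr not in self.assigned:
                self.assigned[addr] = hostname
                return addr
        raise RuntimeError("prefix exhausted")

    def record(self, address: str, hostname: str) -> None:
        """Register an address that is already in use on a host."""
        addr = ipaddress.ip_address(address)
        if addr not in self.network:
            raise ValueError(f"{addr} is outside {self.network}")
        owner = self.assigned.get(addr)
        if owner is not None and owner != hostname:
            raise ValueError(f"{addr} already belongs to {owner}")
        self.assigned[addr] = hostname


# Case 1: an address is configured by hand on host-b and never recorded.
ipam = Ipam("192.0.2.0/29")
first = ipam.allocate("host-a")
manual = ipaddress.ip_address("192.0.2.2")  # in use on host-b, unknown to the registry
second = ipam.allocate("host-c")
print(second == manual)  # the registry hands out the address host-b already uses

# Case 2: the same address is recorded first, so the allocator skips it.
fixed = Ipam("192.0.2.0/29")
fixed.allocate("host-a")
fixed.record("192.0.2.2", "host-b")
print(fixed.allocate("host-c"))
```

The registry can only avoid addresses it knows about, which is why the missing records need to be fixed before the next provisioning run.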

Re-assigning to @RobH

@RobH Please do not assign service puppet roles on new hosts. That will almost never work. Just add new hosts in site.pp with the "insetup" role and hand over. Service implementation will then apply the role along with other changes in Hiera etc.

Change 635011 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] updating deploy1002 to insetup

https://gerrit.wikimedia.org/r/635011

Change 635011 merged by RobH:
[operations/puppet@production] updating deploy1002 to insetup

https://gerrit.wikimedia.org/r/635011

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

['deploy1002.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202010191603_robh_1113.log.

Completed auto-reimage of hosts:

['deploy1002.eqiad.wmnet']

Of which those FAILED:

['deploy1002.eqiad.wmnet']

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

['deploy1002.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202010191632_robh_29964.log.

Completed auto-reimage of hosts:

['deploy1002.eqiad.wmnet']

and were ALL successful.

RobH reopened this task as Open.
RobH reassigned this task from RobH to Cmjohnson.
RobH removed a project: Patch-For-Review.
RobH updated the task description.

I shouldn't have resolved this; the hostname label still has to go on.

@Cmjohnson: Once the hostname label is applied to deploy1002, this can be resolved: https://netbox.wikimedia.org/dcim/devices/2139/

C8 - U17 - WMF5177 - deploy1002