
(Need by: TBD) rack/setup/install restbase1028, restbase1029, restbase1030
Closed, Resolved (Public)

Description

This task tracks the racking and setup of 3 new restbase nodes in eqiad: restbase1028, restbase1029, restbase1030

Hostnames: restbase1028, restbase1029, restbase1030
Racking Proposal: Place one each in a 1G rack in rows A, B, and D, per T238580#5710739.
Networking/Subnet/VLAN/IP: single 1G production network connection, match other restbase nodes.
Partitioning/Raid: match existing restbase nodes

restbase1028:

  • - receive in system on procurement task T238580
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - add to site.pp role(insetup)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged
  • - handoff for service implementation

restbase1029:

  • - receive in system on procurement task T238580
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - add to site.pp role(insetup)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged
  • - handoff for service implementation

restbase1030:

  • - receive in system on procurement task T238580
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - add to site.pp role(insetup)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged
  • - handoff for service implementation

Event Timeline

RobH triaged this task as Medium priority. Jan 2 2020, 9:31 PM
RobH created this task.
wiki_willy renamed this task from rack/setup/install restbase1029, restbase1029, restbase1030 to (No Need By Date) rack/setup/install restbase1029, restbase1029, restbase1030. Jan 2 2020, 11:33 PM
In T238580#5710739, @Eevans wrote:
In T238580#5709953, @RobH wrote:

Also note I assumed details for the racking/hostnames and would appreciate confirmation of those details in task description, thanks!

This cluster uses a replication count of 3 (per-DC), and for eqiad we have machines evenly distributed over a, b, and d. This replica-to-row affinity makes it very nice to reason about where data will be moving from/to on topology changes and it would be a shame if we lost that now. Will there be a problem keeping these to the same 3 rows currently in-use?

So the racking plan needs to adjust, one in A, B, and D.

RobH renamed this task from (No Need By Date) rack/setup/install restbase1029, restbase1029, restbase1030 to (Need by: TBD) rack/setup/install restbase1029, restbase1029, restbase1030. Feb 24 2020, 9:10 PM
Jclark-ctr renamed this task from (Need by: TBD) rack/setup/install restbase1029, restbase1029, restbase1030 to (Need by: TBD) rack/setup/install restbase1028, restbase1029, restbase1030. Apr 14 2020, 3:00 PM
Jclark-ctr added a subscriber: Jclark-ctr.

restbase1028: A5 U18 WMF4802 port 21
restbase1029: B5 U26 WMF4803 port 22
restbase1030: D4 U25 WMF4804 port 20
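
The row constraint from T238580#5710739 (one host in each of rows A, B, and D) can be checked against the rack locations above with a minimal sketch. The `RACKING` dict, `EXPECTED_ROWS`, and the `rows_match_plan` helper are hypothetical names for illustration; they are not part of Netbox or any WMF tooling.

```python
# Hypothetical check: rack locations copied from the comment above.
RACKING = {
    "restbase1028": "A5",
    "restbase1029": "B5",
    "restbase1030": "D4",
}

# Rows currently used by the eqiad restbase cluster, per T238580#5710739.
EXPECTED_ROWS = {"A", "B", "D"}


def rows_match_plan(racking, expected_rows):
    """Return True if each expected row holds exactly one of the hosts."""
    # The row is the leading letter of the rack name (e.g. "A5" -> "A").
    rows = [rack[0].upper() for rack in racking.values()]
    return sorted(rows) == sorted(expected_rows)


print(rows_match_plan(RACKING, EXPECTED_ROWS))
```

The check fails if two hosts share a row or if any host lands outside A, B, or D, which is the replica-to-row affinity the quoted comment asks to preserve.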

Change 589667 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Adding mgmt dns for restbase1028-1030

https://gerrit.wikimedia.org/r/589667

Change 589667 merged by Cmjohnson:
[operations/dns@master] Adding mgmt dns for restbase1028-1030

https://gerrit.wikimedia.org/r/589667

Change 592725 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Adding prodcution dns for restbase1028-1030

https://gerrit.wikimedia.org/r/592725

Change 592725 merged by Cmjohnson:
[operations/dns@master] Adding prodcution dns for restbase1028-1030

https://gerrit.wikimedia.org/r/592725

Change 592730 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/puppet@production] Adding new restbases to dhcpd file and netboot.cfg

https://gerrit.wikimedia.org/r/592730

Change 592730 merged by Cmjohnson:
[operations/puppet@production] Adding new restbases to dhcpd file and netboot.cfg

https://gerrit.wikimedia.org/r/592730

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

restbase1029.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202004271849_cmjohnson_27938_restbase1029_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

restbase1029.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202004271850_cmjohnson_28035_restbase1029_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

restbase1030.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202004271851_cmjohnson_28162_restbase1030_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['restbase1029.eqiad.wmnet']

Of which those FAILED:

['restbase1029.eqiad.wmnet']

Completed auto-reimage of hosts:

['restbase1029.eqiad.wmnet']

Of which those FAILED:

['restbase1029.eqiad.wmnet']

Completed auto-reimage of hosts:

['restbase1030.eqiad.wmnet']

Of which those FAILED:

['restbase1030.eqiad.wmnet']

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

restbase1030.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202004271905_cmjohnson_30692_restbase1030_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

restbase1029.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202004271905_cmjohnson_30732_restbase1029_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

restbase1029.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202004271905_cmjohnson_30768_restbase1029_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['restbase1029.eqiad.wmnet']

Of which those FAILED:

['restbase1029.eqiad.wmnet']

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

restbase1028.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202004271906_cmjohnson_30851_restbase1028_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['restbase1030.eqiad.wmnet']

Of which those FAILED:

['restbase1030.eqiad.wmnet']

Completed auto-reimage of hosts:

['restbase1028.eqiad.wmnet']

Of which those FAILED:

['restbase1028.eqiad.wmnet']

Completed auto-reimage of hosts:

['restbase1029.eqiad.wmnet']

Of which those FAILED:

['restbase1029.eqiad.wmnet']

Change 594942 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/puppet@production] Adding new restbase1028-1030 servers to site.pp role insetup

https://gerrit.wikimedia.org/r/594942

Change 594942 merged by Cmjohnson:
[operations/puppet@production] Adding new restbase1028-1030 servers to site.pp role insetup

https://gerrit.wikimedia.org/r/594942

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

restbase1028.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005071610_cmjohnson_173746_restbase1028_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

restbase1029.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005071611_cmjohnson_173872_restbase1029_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

restbase1030.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005071611_cmjohnson_173949_restbase1030_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['restbase1028.eqiad.wmnet']

Of which those FAILED:

['restbase1028.eqiad.wmnet']

Completed auto-reimage of hosts:

['restbase1029.eqiad.wmnet']

Of which those FAILED:

['restbase1029.eqiad.wmnet']

Completed auto-reimage of hosts:

['restbase1030.eqiad.wmnet']

Of which those FAILED:

['restbase1030.eqiad.wmnet']

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

restbase1028.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005081110_cmjohnson_46988_restbase1028_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['restbase1028.eqiad.wmnet']

Of which those FAILED:

['restbase1028.eqiad.wmnet']

Change 595475 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] site: fix typo in role for new restbase hosts

https://gerrit.wikimedia.org/r/595475

Change 595475 merged by Dzahn:
[operations/puppet@production] site: fix typo in role for new restbase hosts

https://gerrit.wikimedia.org/r/595475

Fixed the typo in the change above; this was most likely why the reimage runs above failed. You can either run the script again or skip ahead to the install-console step, and it should work.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

restbase1028.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005111127_cmjohnson_7299_restbase1028_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

restbase1029.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005111128_cmjohnson_7374_restbase1029_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

restbase1030.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005111128_cmjohnson_7476_restbase1030_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['restbase1029.eqiad.wmnet']

and were ALL successful.

Completed auto-reimage of hosts:

['restbase1028.eqiad.wmnet']

and were ALL successful.

Completed auto-reimage of hosts:

['restbase1030.eqiad.wmnet']

and were ALL successful.
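
To summarize the reimage attempts recorded above, the bot notices can be tallied per host. This is only a sketch: it assumes the message wording exactly as it appears in this timeline and single-host notices, and `TIMELINE`, `NOTICE`, and `tally_reimages` are hypothetical names, not part of wmf-auto-reimage.

```python
import re
from collections import Counter

# Sample of bot messages, copied from the timeline above.
TIMELINE = """\
Completed auto-reimage of hosts:

['restbase1028.eqiad.wmnet']

Of which those FAILED:

['restbase1028.eqiad.wmnet']

Completed auto-reimage of hosts:

['restbase1029.eqiad.wmnet']

and were ALL successful.
"""

# One "Completed" notice: the host list, then either a FAILED header or the
# success line. Only single-host notices, as seen in this task, are handled.
NOTICE = re.compile(
    r"Completed auto-reimage of hosts:\s*"
    r"\['(?P<host>[^']+)'\]\s*"
    r"(?P<outcome>Of which those FAILED:|and were ALL successful\.)"
)


def tally_reimages(text):
    """Count failed and successful reimage notices per host."""
    counts = Counter()
    for match in NOTICE.finditer(text):
        failed = match.group("outcome").startswith("Of which")
        counts[(match.group("host"), "failed" if failed else "ok")] += 1
    return counts


for (host, outcome), n in sorted(tally_reimages(TIMELINE).items()):
    print(host, outcome, n)
```

Pasting the full timeline text into `TIMELINE` would give a per-host count of failed and successful runs.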

@mobrovac these servers are ready for service implementation. I am resolving the racking task.