(Due By: 2020-07-25) rack/setup/install alert1001
Closed, Resolved (Public)

Description

This task will track the racking, setup, and OS installation of alert1001

Hostname / Racking / Installation Details

Hostnames: What are the hostnames, and have you updated https://wikitech.wikimedia.org/wiki/Infrastructure_naming_conventions ?

  • alert1001.wikimedia.org (name convention updated)

Racking Proposal: Where should these systems be racked? Can they share with any existing systems or should they avoid any other systems sharing their rack or row?

  • Same location as icinga1001 please

Networking/Subnet/VLAN/IP: What are the network details? 1G or 10G? Only one network port connection, or more? Subnet/vlan and IP requirements per connect?

  • 1G, prod public

Partitioning/Raid: Is this hardware or software raid and what raid levels should be applied to each disk? What are the partitioning requirements and is there an existing partman recipe?

  • SW raid, existing icinga recipe should do the right thing

OS Distro: Stretch or Buster?

  • Buster -- might as well handle OS upgrade during this HW refresh

Per host setup checklist

Each host should have its own setup checklist copied and pasted into the list below.

alert1001:

  • - receive in system on procurement task T253040
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

Once the system(s) above have had all checkbox steps completed, this task can be resolved.
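The resolution rule above (every checkbox step done for every host) can be sketched as a small tracker. This is only an illustration, not tooling used on this task: the step labels paraphrase the checklist, and `HostChecklist` and `can_resolve` are hypothetical names.

```python
from dataclasses import dataclass, field

# Step labels paraphrase the per-host checklist above (illustrative only).
CHECKLIST_STEPS = [
    "receive in system",
    "rack system and update netbox",
    "bios/drac/serial setup/testing",
    "mgmt dns entries",
    "network port setup",
    "production dns entries",
    "operations/puppet update",
    "OS installation",
    "puppet accept/initial run",
    "netbox state set to staged",
]


@dataclass
class HostChecklist:
    """Tracks which checklist steps are done for one host (hypothetical helper)."""
    hostname: str
    done: set = field(default_factory=set)

    def complete(self, step: str) -> None:
        # Reject labels that are not part of the checklist.
        if step not in CHECKLIST_STEPS:
            raise ValueError(f"unknown step: {step}")
        self.done.add(step)

    def remaining(self) -> list:
        # Steps still open, in checklist order.
        return [s for s in CHECKLIST_STEPS if s not in self.done]


def can_resolve(hosts: list) -> bool:
    """True only when at least one host is tracked and none has open steps."""
    return bool(hosts) and all(not h.remaining() for h in hosts)
```

For example, a `HostChecklist("alert1001")` with every step except the netbox state change marked complete still reports one remaining step, so `can_resolve` returns `False` until that last box is checked.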

Event Timeline

RobH added a parent task: Unknown Object (Task). (Jun 10 2020, 7:56 PM)
RobH updated the task description.
RobH moved this task from Backlog to Racking Tasks on the ops-eqiad board.
fgiunchedi renamed this task from (Need By: TBD) rack/setup/install icinga1002 to (Need By: TBD) rack/setup/install alert1001. (Jun 11 2020, 3:03 PM)
fgiunchedi updated the task description.
wiki_willy renamed this task from (Need By: TBD) rack/setup/install alert1001 to (Due By: 2020-07-25) rack/setup/install alert1001. (Jun 26 2020, 6:59 PM)

@fgiunchedi icinga1001 is in rack C8, that is now a 10G rack. Do you still want this server there or can we move to another rack that is 1G only and eventually migrate icinga1001 to the same rack?

Thanks for the heads up; any 1G rack will do in this case. We'll eventually decom icinga1001 once alert1001 is in service, so there is no need for icinga1001 to move.

Server      Asset tag  Rack  Switch port
icinga1001  WMF5405    c6    36

Hello, friendly ping -- when should we expect alert1001 to be online?

Change 617096 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Adding mgmt dns for alert1001 to dns file, netbox aleady updated

https://gerrit.wikimedia.org/r/617096

Change 617096 merged by Cmjohnson:
[operations/dns@master] Adding mgmt dns for alert1001 to dns file, netbox aleady updated

https://gerrit.wikimedia.org/r/617096

Change 617101 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Adding production dns alert1001, public ip with ipv6

https://gerrit.wikimedia.org/r/617101

Change 617101 merged by Cmjohnson:
[operations/dns@master] Adding production dns alert1001, public ip with ipv6

https://gerrit.wikimedia.org/r/617101

Change 617106 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/puppet@production] Adding alert1001 to site.pp and dhpd file

https://gerrit.wikimedia.org/r/617106

Change 617106 merged by Cmjohnson:
[operations/puppet@production] Adding alert1001 to site.pp and dhpd file

https://gerrit.wikimedia.org/r/617106

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

alert1001.wikimedia.org

The log can be found in /var/log/wmf-auto-reimage/202007291250_cmjohnson_20501_alert1001_wikimedia_org.log.

Cmjohnson updated the task description.

Completed auto-reimage of hosts:

['alert1001.wikimedia.org']

and were ALL successful.