(Need By: 2020-11-29) rack/setup/install db214[234]
Closed, Resolved · Public

Description

This task will track the racking, setup, and OS installation of db214[234]

Hostname / Racking / Installation Details

Hostnames: db2142, db2143, db2144
Racking Proposal: Any 1G rack that works for DC-Ops as long as we have one host per row
Networking/Subnet/VLAN/IP: 1G private VLAN as any other database
Partitioning/Raid: RAID10 with 256 KB stripe size and writeback as documented at https://wikitech.wikimedia.org/wiki/Raid_and_MegaCli#Raid_setup_at_Wikimedia
OS Distro: Buster
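
The task title uses the shorthand db214[234] for the three hostnames listed above. As a minimal sketch, assuming the brackets are a glob-style character class (one host per character in the brackets), the expansion could look like this. The function name `expand_hosts` is hypothetical and not part of any Wikimedia tooling.

```python
import re


def expand_hosts(pattern: str) -> list[str]:
    """Expand one glob-style digit class, e.g. 'db214[234]'."""
    match = re.fullmatch(r"([^\[\]]*)\[([0-9]+)\]([^\[\]]*)", pattern)
    if match is None:
        # No bracket group: treat the pattern as a literal hostname.
        return [pattern]
    prefix, chars, suffix = match.groups()
    # Each character inside the brackets yields one hostname.
    return [f"{prefix}{char}{suffix}" for char in chars]


print(expand_hosts("db214[234]"))
```

The sketch handles a single bracket group only; ranges such as [2-4] are deliberately left unsupported.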

Per host setup checklist

Each host should have its own setup checklist copied and pasted into the list below.

db2142: Rack A3 U17 ge-3/0/16

  • - receive in system on procurement task T264583 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

db2143: Rack B1 U12 ge-1/0/23

  • - receive in system on procurement task T264583 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

db2144: Rack C3 U18 ge-3/0/17

  • - receive in system on procurement task T264583 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

Once the system(s) above have had all checkbox steps completed, this task can be resolved.
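
The racking proposal asks for one host per row. A minimal sketch of that check against the rack assignments recorded in the checklists above, assuming the leading letter of the rack name (A3, B1, C3) identifies the row. The helper names are hypothetical.

```python
from collections import Counter

# Rack assignments copied from the per-host checklist headings.
RACKS = {"db2142": "A3", "db2143": "B1", "db2144": "C3"}


def hosts_per_row(racks: dict[str, str]) -> Counter:
    # Assumption: the first character of the rack name is the row.
    return Counter(rack[0] for rack in racks.values())


def one_host_per_row(racks: dict[str, str]) -> bool:
    # True when no row holds more than one of the given hosts.
    return all(count == 1 for count in hosts_per_row(racks).values())


print(hosts_per_row(RACKS))
print(one_host_per_row(RACKS))
```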

Event Timeline

Restricted Application added a subscriber: Aklapper.
RobH added a parent task: Unknown Object (Task). (Nov 2 2020, 4:28 PM)
RobH moved this task from Backlog to Racking Tasks on the ops-codfw board.
RobH unsubscribed.

Change 639669 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Initial setup for db214[234]

https://gerrit.wikimedia.org/r/639669

Change 639669 merged by Marostegui:
[operations/puppet@production] mariadb: Initial setup for db214[234]

https://gerrit.wikimedia.org/r/639669

I have merged the puppet changes needed for the initial installation (puppet for insetup and the partman recipe).
Pending merges from DC-Ops:

  • DNS
  • DHCP entries

Change 644391 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/puppet@production] Add db214[234] and logstash203[345] to site.pp

https://gerrit.wikimedia.org/r/644391

Change 644391 merged by Papaul:
[operations/puppet@production] Add db214[234] and logstash203[345] to site.pp

https://gerrit.wikimedia.org/r/644391

Change 644632 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/puppet@production] DHCP: Add MAC address for db214[234]

https://gerrit.wikimedia.org/r/644632

Change 644632 merged by Papaul:
[operations/puppet@production] DHCP: Add MAC address for db214[234]

https://gerrit.wikimedia.org/r/644632

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

db2142.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202012012329_pt1979_12608_db2142_codfw_wmnet.log.

Completed auto-reimage of hosts:

['db2142.codfw.wmnet']

Of which those FAILED:

['db2142.codfw.wmnet']

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

db2142.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202012012329_pt1979_12644_db2142_codfw_wmnet.log.

Completed auto-reimage of hosts:

['db2142.codfw.wmnet']

Of which those FAILED:

['db2142.codfw.wmnet']

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

db2143.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202012021736_pt1979_30823_db2143_codfw_wmnet.log.

Completed auto-reimage of hosts:

['db2143.codfw.wmnet']

Of which those FAILED:

['db2143.codfw.wmnet']

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

db2144.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202012021951_pt1979_27054_db2144_codfw_wmnet.log.

Completed auto-reimage of hosts:

['db2144.codfw.wmnet']

Of which those FAILED:

['db2144.codfw.wmnet']

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

db2144.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202012022033_pt1979_2688_db2144_codfw_wmnet.log.

Completed auto-reimage of hosts:

['db2144.codfw.wmnet']

Of which those FAILED:

['db2144.codfw.wmnet']
Papaul updated the task description.

@Marostegui all yours

Thank you Papaul.

  • Memory looks good
  • CPUs look good
  • Disk space looks good
  • RAID level looks good
  • pvs looks good (we need to add the last 1TB there)
PV         VG   Fmt  Attr PSize  PFree
/dev/sda3  tank lvm2 a--  <8.69t <1.13t
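
To see how much of the physical volume is still unallocated, the pvs output above can be parsed. This is a rough sketch, assuming lowercase LVM unit suffixes (binary units, so t is TiB) and that the leading < only marks a rounded value. The function names are hypothetical.

```python
PVS_OUTPUT = """\
PV         VG   Fmt  Attr PSize  PFree
/dev/sda3  tank lvm2 a--  <8.69t <1.13t
"""

# Lowercase LVM suffixes are powers of 1024; values are converted to TiB.
UNIT_TO_TIB = {"m": 1 / 1024**2, "g": 1 / 1024, "t": 1.0, "p": 1024.0}


def parse_size_tib(field: str) -> float:
    """Parse a size such as '<8.69t' into TiB (approximate)."""
    value = field.lstrip("<")  # '<' means the real value is slightly smaller
    return float(value[:-1]) * UNIT_TO_TIB[value[-1]]


def parse_pvs(output: str) -> list[dict[str, str]]:
    """Turn the whitespace-separated pvs table into one dict per PV."""
    header, *rows = [line.split() for line in output.strip().splitlines()]
    return [dict(zip(header, row)) for row in rows]


for pv in parse_pvs(PVS_OUTPUT):
    size = parse_size_tib(pv["PSize"])
    free = parse_size_tib(pv["PFree"])
    print(f"{pv['PV']} ({pv['VG']}): {free:.2f} TiB free of {size:.2f} TiB")
```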