(Need By: 31st May) rack/setup/install db114[1-9]
Closed, Resolved (Public)

Description

This task will track the racking, setup, and OS installation of 9 new database hosts in eqiad.

Hostname / Racking / Installation Details

Hostnames: Depends on the number of hosts, but if we order 9: db1141-db1149
Racking Proposal: Ideally 2 per row, in separate racks. The remaining host can go in any row, but in a separate rack from the others.
Networking/Subnet/VLAN/IP: Private VLAN like the rest of the databases, in a 1G rack (we don't need 10G here, but we want to order them with 10G just in case, for the future)
Partitioning/Raid: RAID10 + 256k stripe size. Normal db RAID recipe; @Marostegui will take care of this puppet part.
OS Distro: Buster
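
For reference, the task title uses the shorthand db114[1-9] for the host list. A minimal sketch of how that shorthand expands, assuming a single one-digit range in square brackets (the helper name expand_hosts is hypothetical and not part of any tooling mentioned in this task):

```python
import re

def expand_hosts(pattern):
    """Expand one [a-b] single-digit range, e.g. 'db114[1-9]'."""
    match = re.fullmatch(r"(.*)\[(\d)-(\d)\](.*)", pattern)
    if match is None:
        return [pattern]  # no range present: treat as a literal hostname
    prefix, start, end, suffix = match.groups()
    return [f"{prefix}{n}{suffix}" for n in range(int(start), int(end) + 1)]

hosts = expand_hosts("db114[1-9]")
print(len(hosts), hosts[0], hosts[-1])  # prints: 9 db1141 db1149
```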

Need by info

In T245137#6099414, @Marostegui wrote:

@wiki_willy Can we bump the priority for racking these hosts? They are needed for the expansion of commonswiki, which will also allow us to move servers to wikidatawiki, which is in need of more CPU power.
If someone creates the racking task (I believe there is not one yet; CCing @RobH just in case, for T245137#6073133), we can follow up there, but I am happy to install the hosts myself once they have the RAID created, the DNS entries added and the network configured at the switch level.
Once all that is done, just give me the MAC addresses and I can take care of that.

Thank you.

Per host setup checklist

Each host should have its own setup checklist copied and pasted into the list below.

db1141:

  • - receive in system on procurement task T245137
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

db1142:

  • - receive in system on procurement task T245137
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

db1143:

  • - receive in system on procurement task T245137
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

db1144:

  • - receive in system on procurement task T245137
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

db1145:

  • - receive in system on procurement task T245137
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

db1146:

  • - receive in system on procurement task T245137
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

db1147:

  • - receive in system on procurement task T245137
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

db1148:

  • - receive in system on procurement task T245137
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

db1149:

  • - receive in system on procurement task T245137
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

Once the system(s) above have had all checkbox steps completed, this task can be resolved.

Event Timeline

RobH renamed this task from (<enter due date here>) rack/setup/install <insert FQDN/hostname of hardware here> to (Need By: ASAP) rack/setup/install db114[1-9]. May 1 2020, 4:13 PM
RobH added a parent task: Unknown Object (Task).
RobH moved this task from Backlog to Racking Tasks on the ops-eqiad board.
Marostegui renamed this task from (Need By: ASAP) rack/setup/install db114[1-9] to (Need By: 31st May) rack/setup/install db114[1-9]. May 4 2020, 5:16 AM
Marostegui added a project: DBA.
Marostegui updated the task description.

Change 594616 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] install_server: Allow reimage db114[1-9]

https://gerrit.wikimedia.org/r/594616

Change 594616 merged by Marostegui:
[operations/puppet@production] install_server: Allow reimage db114[1-9]

https://gerrit.wikimedia.org/r/594616

The puppet part is done, so the hosts will come up as spare.
As I said, I can do the install myself. What is still pending from the DCOps side (apart from racking and cabling):

  • Production DNS entries
  • Switches configuration
  • Add the hosts to dhcp in puppet (or send me the MAC addresses and I can do that patch too).
Jclark-ctr updated the task description.
Jclark-ctr added a subscriber: RobH.
Jclark-ctr added a subscriber: Jclark-ctr.

name    rack_name  position  switchport
db1141  A3         1         7
db1142  A5         36        36
db1143  B3         32        26
db1144  B8         7         13
db1145  C5         9         8
db1146  C5         33        32
db1147  C6         38        37
db1148  D1         9         6
db1149  D6         31        30
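
A quick way to compare this allocation with the racking proposal in the description (2 per row, each in a separate rack) is to count hosts per row and per rack. This is only a sketch: the pairs are copied from the table above, and it assumes the first letter of rack_name is the row.

```python
from collections import Counter

# (hostname, rack_name) pairs copied from the table above.
# Assumption: the first character of rack_name identifies the row.
ALLOCATION = [
    ("db1141", "A3"), ("db1142", "A5"), ("db1143", "B3"),
    ("db1144", "B8"), ("db1145", "C5"), ("db1146", "C5"),
    ("db1147", "C6"), ("db1148", "D1"), ("db1149", "D6"),
]

hosts_per_row = Counter(rack[0] for _, rack in ALLOCATION)
hosts_per_rack = Counter(rack for _, rack in ALLOCATION)
# Racks that hold more than one of the new hosts.
shared_racks = {rack: n for rack, n in hosts_per_rack.items() if n > 1}

print(dict(hosts_per_row))  # prints: {'A': 2, 'B': 2, 'C': 3, 'D': 2}
print(shared_racks)         # prints: {'C5': 2}
```

For the table as pasted, row C holds the ninth host, and db1145 and db1146 share rack C5.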

Change 595563 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Adding mgmt dns for db1141-48

https://gerrit.wikimedia.org/r/595563

Change 595563 merged by Cmjohnson:
[operations/dns@master] Adding mgmt dns for db1141-49

https://gerrit.wikimedia.org/r/595563

Change 595594 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Adding production dns for db1141-1149

https://gerrit.wikimedia.org/r/595594

Change 595595 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/puppet@production] Adding db1141-1149 mac addresses to dhcpd file

https://gerrit.wikimedia.org/r/595595

Change 595594 merged by Cmjohnson:
[operations/dns@master] Adding production dns for db1141-1149

https://gerrit.wikimedia.org/r/595594

Change 595595 merged by Cmjohnson:
[operations/puppet@production] Adding db1141-1149 mac addresses to dhcpd file

https://gerrit.wikimedia.org/r/595595

Change 595600 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] install_server: db114[1-9] need to be installed with buster

https://gerrit.wikimedia.org/r/595600

Change 595600 merged by Marostegui:
[operations/puppet@production] install_server: db114[1-9] need to be installed with buster

https://gerrit.wikimedia.org/r/595600

@Cmjohnson I have amended your DHCP patch to make sure they use the Buster installer.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

db1148.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005111748_cmjohnson_57444_db1148_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

db1149.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005111755_cmjohnson_59257_db1149_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

db1141.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005111801_cmjohnson_59798_db1141_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

db1142.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005111801_cmjohnson_59887_db1142_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

db1148.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005111802_cmjohnson_59949_db1148_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

db1143.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005111812_cmjohnson_63495_db1143_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['db1149.eqiad.wmnet']

and were ALL successful.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

db1144.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005111817_cmjohnson_65176_db1144_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['db1141.eqiad.wmnet']

and were ALL successful.

Completed auto-reimage of hosts:

['db1142.eqiad.wmnet']

and were ALL successful.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

db1145.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005111823_cmjohnson_68326_db1145_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

db1146.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005111824_cmjohnson_68370_db1146_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['db1148.eqiad.wmnet']

and were ALL successful.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

db1147.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005111825_cmjohnson_68571_db1147_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['db1143.eqiad.wmnet']

and were ALL successful.

Completed auto-reimage of hosts:

['db1144.eqiad.wmnet']

and were ALL successful.

Completed auto-reimage of hosts:

['db1146.eqiad.wmnet']

and were ALL successful.

Completed auto-reimage of hosts:

['db1147.eqiad.wmnet']

and were ALL successful.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

db1145.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005111851_cmjohnson_76723_db1145_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['db1145.eqiad.wmnet']

Of which those FAILED:

['db1145.eqiad.wmnet']

Completed auto-reimage of hosts:

['db1145.eqiad.wmnet']

and were ALL successful.
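
The log file names in the messages above appear to encode the launch time, the user and the host. A minimal sketch for pulling those fields out when ordering the runs, under stated assumptions: the field meanings are inferred from the names in this task (not from wmf-auto-reimage documentation), the numeric field is treated as an opaque run identifier, and user names are assumed to contain no underscores.

```python
import re
from datetime import datetime

# Assumed layout: <YYYYMMDDHHMM>_<user>_<number>_<host with dots as underscores>.log
LOG_NAME = re.compile(
    r"(?P<stamp>\d{12})_(?P<user>[^_]+)_(?P<run_id>\d+)_(?P<host>.+)\.log$"
)

def parse_log_name(path):
    """Split a wmf-auto-reimage log path into its (inferred) fields."""
    match = LOG_NAME.search(path)
    if match is None:
        raise ValueError(f"unrecognised log name: {path}")
    return {
        "started": datetime.strptime(match["stamp"], "%Y%m%d%H%M"),
        "user": match["user"],
        "run_id": int(match["run_id"]),
        # Hostnames cannot contain underscores, so map them back to dots.
        "host": match["host"].replace("_", "."),
    }

info = parse_log_name(
    "/var/log/wmf-auto-reimage/202005111851_cmjohnson_76723_db1145_eqiad_wmnet.log"
)
print(info["started"], info["user"], info["host"])
```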

Cmjohnson updated the task description.

These are all yours @Marostegui

Thank you! They look good:

_____FORMATTED_OUTPUT_____
db1141.eqiad.wmnet: Filesystem            Type  Size  Used Avail Use% Mounted on
db1141.eqiad.wmnet: /dev/mapper/tank-data xfs   7.6T  8.2G  7.6T   1% /srv
db1141.eqiad.wmnet:               total        used        free      shared  buff/cache   available
db1141.eqiad.wmnet: Mem:            502           0         501           0           0         499
db1141.eqiad.wmnet: Swap:             7           0           7
db1142.eqiad.wmnet: Filesystem            Type  Size  Used Avail Use% Mounted on
db1142.eqiad.wmnet: /dev/mapper/tank-data xfs   7.6T  8.2G  7.6T   1% /srv
db1142.eqiad.wmnet:               total        used        free      shared  buff/cache   available
db1142.eqiad.wmnet: Mem:            502           0         501           0           0         499
db1142.eqiad.wmnet: Swap:             7           0           7
db1143.eqiad.wmnet: Filesystem            Type  Size  Used Avail Use% Mounted on
db1143.eqiad.wmnet: /dev/mapper/tank-data xfs   7.6T  8.2G  7.6T   1% /srv
db1143.eqiad.wmnet:               total        used        free      shared  buff/cache   available
db1143.eqiad.wmnet: Mem:            502           0         501           0           0         499
db1143.eqiad.wmnet: Swap:             7           0           7
db1144.eqiad.wmnet: Filesystem            Type  Size  Used Avail Use% Mounted on
db1144.eqiad.wmnet: /dev/mapper/tank-data xfs   7.6T  8.2G  7.6T   1% /srv
db1144.eqiad.wmnet:               total        used        free      shared  buff/cache   available
db1144.eqiad.wmnet: Mem:            502           0         501           0           0         499
db1144.eqiad.wmnet: Swap:             7           0           7
db1145.eqiad.wmnet: Filesystem            Type  Size  Used Avail Use% Mounted on
db1145.eqiad.wmnet: /dev/mapper/tank-data xfs   7.6T  8.2G  7.6T   1% /srv
db1145.eqiad.wmnet:               total        used        free      shared  buff/cache   available
db1145.eqiad.wmnet: Mem:            502           0         501           0           0         499
db1145.eqiad.wmnet: Swap:             7           0           7
db1146.eqiad.wmnet: Filesystem            Type  Size  Used Avail Use% Mounted on
db1146.eqiad.wmnet: /dev/mapper/tank-data xfs   7.6T  8.2G  7.6T   1% /srv
db1146.eqiad.wmnet:               total        used        free      shared  buff/cache   available
db1146.eqiad.wmnet: Mem:            502           0         501           0           0         499
db1146.eqiad.wmnet: Swap:             7           0           7
db1147.eqiad.wmnet: Filesystem            Type  Size  Used Avail Use% Mounted on
db1147.eqiad.wmnet: /dev/mapper/tank-data xfs   7.6T  8.2G  7.6T   1% /srv
db1147.eqiad.wmnet:               total        used        free      shared  buff/cache   available
db1147.eqiad.wmnet: Mem:            502           0         501           0           0         499
db1147.eqiad.wmnet: Swap:             7           0           7
db1148.eqiad.wmnet: Filesystem            Type  Size  Used Avail Use% Mounted on
db1148.eqiad.wmnet: /dev/mapper/tank-data xfs   7.6T  8.2G  7.6T   1% /srv
db1148.eqiad.wmnet:               total        used        free      shared  buff/cache   available
db1148.eqiad.wmnet: Mem:            502           0         501           0           0         499
db1148.eqiad.wmnet: Swap:             7           0           7
db1149.eqiad.wmnet: Filesystem            Type  Size  Used Avail Use% Mounted on
db1149.eqiad.wmnet: /dev/mapper/tank-data xfs   7.6T  8.2G  7.6T   1% /srv
db1149.eqiad.wmnet:               total        used        free      shared  buff/cache   available
db1149.eqiad.wmnet: Mem:            502           0         501           0           0         499
db1149.eqiad.wmnet: Swap:             7           0           7
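
One way to confirm that all hosts report the same disk and memory figures, rather than comparing by eye, is to group hosts by their output. A minimal sketch, assuming each line has the form "hostname: output" as in the paste above (the sample below is a subset of those lines):

```python
from collections import defaultdict

def group_by_output(text):
    """Map each distinct block of per-host output to the hosts that produced it."""
    per_host = defaultdict(list)
    for line in text.splitlines():
        host, sep, rest = line.partition(": ")
        if sep:  # skip lines without a "hostname: " prefix
            per_host[host].append(rest)
    groups = defaultdict(list)
    for host, lines in per_host.items():
        groups[tuple(lines)].append(host)
    return dict(groups)

SAMPLE = """\
db1141.eqiad.wmnet: /dev/mapper/tank-data xfs   7.6T  8.2G  7.6T   1% /srv
db1141.eqiad.wmnet: Mem:            502           0         501           0           0         499
db1142.eqiad.wmnet: /dev/mapper/tank-data xfs   7.6T  8.2G  7.6T   1% /srv
db1142.eqiad.wmnet: Mem:            502           0         501           0           0         499
"""

groups = group_by_output(SAMPLE)
for hosts in groups.values():
    print(len(hosts), "host(s) share identical output")
```

A single group means every host in the input reported identical values.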

_____FORMATTED_OUTPUT_____
db1141.eqiad.wmnet: RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
db1141.eqiad.wmnet: Strip Size          : 256 KB
db1142.eqiad.wmnet: RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
db1142.eqiad.wmnet: Strip Size          : 256 KB
db1143.eqiad.wmnet: RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
db1143.eqiad.wmnet: Strip Size          : 256 KB
db1144.eqiad.wmnet: RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
db1144.eqiad.wmnet: Strip Size          : 256 KB
db1145.eqiad.wmnet: RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
db1145.eqiad.wmnet: Strip Size          : 256 KB
db1146.eqiad.wmnet: RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
db1146.eqiad.wmnet: Strip Size          : 256 KB
db1147.eqiad.wmnet: RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
db1147.eqiad.wmnet: Strip Size          : 256 KB
db1148.eqiad.wmnet: RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
db1148.eqiad.wmnet: Strip Size          : 256 KB
db1149.eqiad.wmnet: RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
db1149.eqiad.wmnet: Strip Size          : 256 KB
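
The same check can be scripted for the RAID settings. A minimal sketch that flags any host whose reported values differ from the ones pasted above; the expected strings are simply copied from this output, and whether the RAID Level string alone distinguishes RAID 10 from RAID 1 depends on the controller tooling and is not checked here.

```python
# Expected values, copied verbatim from the output above.
EXPECTED = {
    "RAID Level": "Primary-1, Secondary-0, RAID Level Qualifier-0",
    "Strip Size": "256 KB",
}

def check_raid(output):
    """Return {host: [keys that differ from EXPECTED]} for mismatching hosts."""
    seen = {}
    for line in output.splitlines():
        if ": " not in line:
            continue  # e.g. the _____FORMATTED_OUTPUT_____ marker
        host, rest = line.split(": ", 1)
        key, _, value = rest.partition(":")
        seen.setdefault(host.strip(), {})[key.strip()] = value.strip()
    problems = {}
    for host, values in seen.items():
        wrong = [k for k, v in EXPECTED.items() if values.get(k) != v]
        if wrong:
            problems[host] = wrong
    return problems

SAMPLE = """\
db1141.eqiad.wmnet: RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
db1141.eqiad.wmnet: Strip Size          : 256 KB
"""

print(check_raid(SAMPLE))  # prints: {}
```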