
(Need by:TBD) rack/setup/install backup2002
Closed, Resolved · Public

Description

This task will track the racking, setup, and OS installation of backup2002.

Hostname / Racking / Installation Details

Hostnames: backup2002 for the host; backup2002-array1 for the array (labeling only).
Racking Proposal: Two rules: rack it in a 10G-capable rack and, if possible, avoid proximity to backup2001 (D2), e.g. not in row D, for redundancy. Other than that, it can go anywhere (there are only 2 backup hosts in each DC).
Networking/Subnet/VLAN/IP: 10G, only one network, production-codfw subnetwork.
Partitioning/Raid: The 2 SSDs will be in software RAID1. The array disks will be in hardware RAID6 (RAID controller). Because this will have 100TB, we may want to create 2 virtual disks, each in RAID6. The DBAs can take care of most of this (this is a special host), as long as remote access to partitioning is available.
Partman: backup-format.cfg (was raid1-lvm-ext4-srv-plus-hwraid.cfg before https://gerrit.wikimedia.org/r/c/operations/puppet/+/584559)
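
To make the trade-off behind "2 virtual RAID disks, each with RAID6" concrete, here is a minimal sketch of the usable-capacity arithmetic. The disk count and disk size are hypothetical and not taken from this task, and the calculation ignores hot spares and filesystem overhead; it only shows that each RAID6 group gives up two disks' worth of space to parity.

```python
def raid6_usable_tb(disks: int, disk_tb: float) -> float:
    """Usable capacity of one RAID6 group: two disks' worth of space holds parity."""
    if disks < 4:
        raise ValueError("RAID6 needs at least 4 disks")
    return (disks - 2) * disk_tb


# Hypothetical shelf, for illustration only: 12 disks of 10 TB each.
DISKS, DISK_TB = 12, 10.0

# One virtual disk spanning all disks vs. two virtual disks of half the disks each.
single = raid6_usable_tb(DISKS, DISK_TB)
split = 2 * raid6_usable_tb(DISKS // 2, DISK_TB)

print(f"one RAID6 virtual disk:  {single:.0f} TB usable")
print(f"two RAID6 virtual disks: {split:.0f} TB usable")
```

Under these assumed numbers, splitting costs two extra disks of capacity, in exchange for smaller groups in which each virtual disk tolerates two disk failures of its own.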

Per host setup checklist

Each host should have its own setup checklist copied and pasted into the list below.

backup2002:

  • - receive in system on procurement task T238601
  • - receive in disk shelf on procurement task T238601
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

Once the system(s) above have had all checkbox steps completed, this task can be resolved.

Event Timeline

Restricted Application added a subscriber: Aklapper.
RobH moved this task from Backlog to Hardware Failure / Troubleshoot on the ops-codfw board.

This is handled by https://phabricator.wikimedia.org/T248934 and is not linked to the parent task.