(Need By: TBD) rack/setup/install payments100[5-8]
Closed, Resolved (Public)

Description

This task will track the racking, setup, and OS installation of payments100[5-8]

Hostname / Racking / Installation Details

Hostnames: payments100[5-8]
Racking Proposal: frack
Networking/Subnet/VLAN/IP: connect both 1GBE ports, one per frack asw

  • payments1005.frack.eqiad.wmnet/frack-payments-eqiad/10.64.40.9
  • payments1006.frack.eqiad.wmnet/frack-payments-eqiad/10.64.40.10
  • payments1007.frack.eqiad.wmnet/frack-payments-eqiad/10.64.40.11
  • payments1008.frack.eqiad.wmnet/frack-payments-eqiad/10.64.40.12
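For reference, the bracket shorthand used in task titles (payments100[5-8], frqueue100[34]) and the host/IP plan above can be reproduced with a short script. This is only a sketch: the helper names expand_hosts and assign_ips are hypothetical, and the sequential IP assignment starting at 10.64.40.9 is an assumption read off the list above, not a documented allocation rule.

```python
import ipaddress
import re


def expand_hosts(pattern):
    """Expand one bracket group: 'payments100[5-8]' (range) or 'frqueue100[34]' (set)."""
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", pattern)
    if match is None:
        return [pattern]  # no bracket group, nothing to expand
    prefix, body, suffix = match.groups()
    digit_range = re.fullmatch(r"(\d)-(\d)", body)
    if digit_range:
        start, end = int(digit_range.group(1)), int(digit_range.group(2))
        digits = [str(d) for d in range(start, end + 1)]
    elif body.isdigit():
        digits = list(body)  # '[34]' means hosts ending in 3 and 4
    else:
        raise ValueError(f"unsupported bracket group: [{body}]")
    return [f"{prefix}{d}{suffix}" for d in digits]


def assign_ips(hosts, first_ip):
    """Assumption: IPs are handed out sequentially starting at first_ip."""
    base = ipaddress.ip_address(first_ip)
    return {host: str(base + offset) for offset, host in enumerate(hosts)}


hosts = [f"{name}.frack.eqiad.wmnet" for name in expand_hosts("payments100[5-8]")]
plan = assign_ips(hosts, "10.64.40.9")
for host, ip in plan.items():
    print(f"{host}/frack-payments-eqiad/{ip}")
```

The printed lines follow the same hostname/VLAN/IP layout as the list above, which makes it easy to diff a planned allocation against what was entered in the task.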

Partitioning/Raid:
OS Distro:

Per host setup checklist

Each host should have its own setup checklist copied and pasted into the list below.

  • payments1005:
    • - receive in system on procurement task T265064 & in coupa
    • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
    • - bios/drac/serial setup/testing
    • - mgmt dns entries added for both asset tag and hostname
    • - network port setup (description, enable, vlan)
    • end on-site specific steps
    • - DRAC password fixed
    • - DRAC console redirection after boot fixed
    • - production dns entries added
    • - No OS installation, hand off to fundraising to install - fundraising to set to active in netbox when they complete installation.
  • payments1006:
    • - receive in system on procurement task T265064 & in coupa
    • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
    • - bios/drac/serial setup/testing
    • - mgmt dns entries added for both asset tag and hostname
    • - network port setup (description, enable, vlan)
    • end on-site specific steps
    • - DRAC password fixed
    • - DRAC console redirection after boot fixed
    • - production dns entries added
    • - No OS installation, hand off to fundraising to install - fundraising to set to active in netbox when they complete installation.
  • payments1007:
    • - receive in system on procurement task T265064 & in coupa
    • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
    • - bios/drac/serial setup/testing
    • - mgmt dns entries added for both asset tag and hostname
    • - network port setup (description, enable, vlan)
    • end on-site specific steps
    • - DRAC password fixed
    • - DRAC console redirection after boot fixed
    • - production dns entries added
    • - No OS installation, hand off to fundraising to install - fundraising to set to active in netbox when they complete installation.
  • payments1008:
    • - receive in system on procurement task T265064 & in coupa
    • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
    • - bios/drac/serial setup/testing
    • - mgmt dns entries added for both asset tag and hostname
    • - network port setup (description, enable, vlan)
    • end on-site specific steps
    • - DRAC password fixed
    • - DRAC console redirection after boot fixed
    • - production dns entries added
    • - No OS installation, hand off to fundraising to install - fundraising to set to active in netbox when they complete installation.

Once the system(s) above have had all checkbox steps completed, this task can be resolved.

Event Timeline

RobH added a parent task: Unknown Object (Task).
RobH moved this task from Backlog to Racking Tasks on the ops-eqiad board.

Hi @Jgreen - it looks like we're running a bit tight on space in the Fundraising rack. In order for us to rack the servers for this install, do you have 1-2 existing servers that can be decommissioned in eqiad? Thanks, Willy

There are a couple DC Ops tasks in the works that will lead to removing old hardware:

T271739: decommission frdb1001.frack.eqiad.wmnet
T266365: (Need By: TBD) rack/setup/install frqueue100[34]

Let's do those first, and then do the payments servers two at a time?

Thanks @Jgreen (cc'ing @Jclark-ctr as a fyi)

@wiki_willy @Jclark-ctr we're done with frqueue1002, and it can be decommed and removed (T274671: decommission frqueue1002.frack.eqiad.wmnet). When you're ready to start on the payments boxes, we can also shut down payments1004, which is a standby box and not in active service.

Thanks @Jgreen, do you have a decom task for payments1004 as well?

@Cmjohnson
4 servers racked, netbox updated. Servers use the same ports on both switches.
payments1005: U2, P30
payments1006: U26, P28
payments1007: U14, P30
payments1008: U28, P29

I see frqueue1002 was decomm'ed but the sre.dns.netbox cookbook was not run for its mgmt records, so Icinga started alerting because of:

Uncommitted DNS changes in Netbox on netbox1001 is CRITICAL: Netbox has uncommitted DNS changes
https://wikitech.wikimedia.org/wiki/Monitoring/Netbox_DNS_uncommitted_changes

I'm running the cookbook now to clear the mgmt records and the alert.
Please make sure the documentation reflects that the cookbook must also be run when decommissioning frack hosts, because the management records are automatically managed.

Thanks. I mistakenly thought network DNS was handled by fundraising.

Updated port locations:

payments1005: port 28
payments1006: port 29
payments1007: port 30
payments1008: port 31

Cmjohnson edited projects, added fundraising-tech-ops; removed ops-eqiad, DC-Ops.

Assigning this to @Jgreen to complete the installs. All the on-site work has been completed, network ports are set up and enabled so please install quickly. Please resolve this task once completed and let me know if you need anything.
Removing ops-eqiad and DC Ops tag, adding fundraising tech-ops.

Change 673342 had a related patch set uploaded (by Jgreen; owner: Jgreen):
[operations/dns@master] add A/PTR records for payments100[5-8].frack.eqiad.wmnet

https://gerrit.wikimedia.org/r/673342

Change 673342 merged by Jgreen:
[operations/dns@master] add A/PTR records for payments100[5-8].frack.eqiad.wmnet

https://gerrit.wikimedia.org/r/673342
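As an illustration of what "A/PTR records" means for these four hosts, the sketch below derives a forward (A) and reverse (PTR) record for each host/IP pair listed in the task description. The output is generic BIND-style text and is an assumption for illustration only; it is not the actual layout or templating of the operations/dns repository, and dns_records is a hypothetical helper name.

```python
import ipaddress

# Host/IP pairs taken from the task description.
HOSTS = {
    "payments1005.frack.eqiad.wmnet": "10.64.40.9",
    "payments1006.frack.eqiad.wmnet": "10.64.40.10",
    "payments1007.frack.eqiad.wmnet": "10.64.40.11",
    "payments1008.frack.eqiad.wmnet": "10.64.40.12",
}


def dns_records(hosts):
    """Return generic zone-file style A and PTR lines (illustrative format only)."""
    lines = []
    for fqdn, addr in hosts.items():
        ip = ipaddress.ip_address(addr)
        # Forward record: name -> address.
        lines.append(f"{fqdn}. IN A {ip}")
        # Reverse record: reversed octets under in-addr.arpa -> name.
        lines.append(f"{ip.reverse_pointer}. IN PTR {fqdn}.")
    return lines


for line in dns_records(HOSTS):
    print(line)
```

Generating both directions from one mapping keeps the forward and reverse entries from drifting apart, which is the usual reason to add them in a single change.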

Jgreen updated the task description.

@Cmjohnson @Jclark-ctr Can you take a look at the serial settings for payments1006? Console redirection isn't working at all. I tried to fix it using racadm, but although it seems to accept the settings, there's still no DRAC access on com2. Maybe there's something else going on?

Jclark-ctr closed subtask Unknown Object (Task) as Resolved. (Apr 16 2021, 6:05 PM)
Dwisehaupt subscribed.

Re-adding the ops tags to get payments1006 back on the radar for the console redirection.

Backing that out and removing the ops tags as @Jgreen created T280527 this morning specifically about the console issue.

Change 682186 had a related patch set uploaded (by Dwisehaupt; author: Dwisehaupt):

[operations/puppet@production] Add new payments hosts to monitoring

https://gerrit.wikimedia.org/r/682186

Change 682186 merged by Jgreen:

[operations/puppet@production] Add new payments hosts to monitoring

https://gerrit.wikimedia.org/r/682186

Jgreen moved this task from Blocked to Done on the fundraising-tech-ops board.