Decommission db2042
Closed, Resolved · Public

Description

db2042 can be decommissioned

db2042

Decommission Checklist

  • - all system services confirmed offline from production use - should be done by DBA team
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration - should be done by DBA team
  • - any service group puppet/hiera/dsh config removed - should be done by DBA team
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.) - should be done by DBA team: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/514646/

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update status in netbox (inventory for decom, planned for spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal): asw-d-codfw:ge-3/0/10
  • - remove all remaining puppet references (include role::spare)
  • - remove production dns entries
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS
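
For reference, the DebMonitor removal step in the block above can be sketched as a small shell script. This is only an illustration, assuming the host is db2042.codfw.wmnet (the FQDN used by the decommission cookbook on this task). The `debmonitor_url` helper and the `DRY_RUN` switch are hypothetical names added for this sketch and are not part of wmf-decommission-host, which normally handles this step.

```shell
#!/bin/sh
# Hypothetical helper: build the DebMonitor URL for one host FQDN.
debmonitor_url() {
    printf 'https://debmonitor.discovery.wmnet/hosts/%s\n' "$1"
}

HOST_FQDN="db2042.codfw.wmnet"

# DRY_RUN is an assumption of this sketch; it defaults to 1 so that
# nothing is deleted unless explicitly requested with DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

if [ "$DRY_RUN" = "1" ]; then
    echo "would run: curl -X DELETE $(debmonitor_url "$HOST_FQDN")"
else
    # Same command as in the checklist, with the URL quoted.
    sudo curl -X DELETE "$(debmonitor_url "$HOST_FQDN")" \
        --cert /etc/debmonitor/ssl/cert.pem \
        --key /etc/debmonitor/ssl/server.key
fi
```

Run as written, the script only prints the command it would execute; set DRY_RUN=0 on neodymium/sarin to issue the actual DELETE.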

  • - system disks wiped (by onsite); use hdparm for SSDs and wipe for HDDs
  • - label BBU as broken so it doesn't get re-used (T214264)
  • - system unracked and decommissioned (by onsite), update status in netbox to offline
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
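
As a rough aid for the disk-wipe step above, the following sketch reports which tool the checklist's rule (hdparm for SSDs, wipe for HDDs) would select for each disk. It assumes disks appear as /dev/sd* and that the kernel's sysfs `rotational` flag reflects the physical media, which may not hold behind a hardware RAID controller. `wipe_tool_for` is a hypothetical helper name. The script only reports; it erases nothing.

```shell
#!/bin/sh
# Map the kernel's "rotational" flag to the tool named in the checklist.
# 0 means non-rotational (SSD), 1 means rotational (HDD).
wipe_tool_for() {
    case "$1" in
        0) echo "hdparm" ;;
        1) echo "wipe" ;;
        *) echo "unknown" ;;
    esac
}

# Report only: list each sd* disk with the tool the rule would pick.
for flag in /sys/block/sd*/queue/rotational; do
    # Skip when the glob matched nothing or the file is unreadable.
    [ -r "$flag" ] || continue
    disk=$(basename "$(dirname "$(dirname "$flag")")")
    echo "$disk: $(wipe_tool_for "$(cat "$flag")")"
done
```

The actual erase commands are left to onsite staff, since the checklist does not spell them out.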

Event Timeline

Marostegui moved this task from Triage to In progress on the DBA board.
Marostegui updated the task description.
Marostegui triaged this task as Medium priority. · Jun 5 2019, 1:02 PM
Marostegui updated the task description.

Mentioned in SAL (#wikimedia-operations) [2019-06-06T05:11:41Z] <marostegui> Disable notifications db2042 - T225090

Marostegui updated the task description. · Jun 6 2019, 5:12 AM

Mentioned in SAL (#wikimedia-operations) [2019-06-06T05:14:15Z] <marostegui> Stop MySQL on db2042 to copy its content to dbprov2001 as a temporary backup - T225090

Mentioned in SAL (#wikimedia-operations) [2019-06-06T05:18:56Z] <marostegui> Remove db2042 from tendril and zarcillo T225090

Change 514646 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Prepare db2042 for decommission

https://gerrit.wikimedia.org/r/514646

Change 514646 merged by Marostegui:
[operations/puppet@production] mariadb: Prepare db2042 for decommission

https://gerrit.wikimedia.org/r/514646

Marostegui updated the task description. · Jun 6 2019, 5:30 AM
Marostegui reassigned this task from Marostegui to RobH. · Jun 6 2019, 5:33 AM
Marostegui added a subscriber: Papaul.

db2042 is ready for DCOPs to take over.

Restricted Application added a project: Operations. · Jun 6 2019, 5:33 AM
RobH moved this task from Backlog to Decommission on the ops-codfw board. · Jun 12 2019, 1:10 PM
RobH updated the task description. · Jul 15 2019, 11:08 PM

Change 523350 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] decom db2042

https://gerrit.wikimedia.org/r/523350

Change 523353 had a related patch set uploaded (by RobH; owner: RobH):
[operations/dns@master] decom db2042

https://gerrit.wikimedia.org/r/523353

Change 523350 merged by RobH:
[operations/puppet@production] decom db2042

https://gerrit.wikimedia.org/r/523350

Change 523353 merged by RobH:
[operations/dns@master] decom db2042

https://gerrit.wikimedia.org/r/523353

cookbooks.sre.hosts.decommission executed by robh@cumin1001 for hosts: db2042.codfw.wmnet

  • db2042.codfw.wmnet
    • Removed from Puppet master and PuppetDB
    • Downtimed host on Icinga
    • Downtimed management interface on Icinga
    • Removed from DebMonitor
RobH updated the task description.
RobH reassigned this task from RobH to Papaul. · Jul 15 2019, 11:34 PM
RobH updated the task description.
Papaul updated the task description. · Jul 23 2019, 5:24 PM
Papaul updated the task description. · Jul 31 2019, 7:52 PM

Change 526762 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/dns@master] DNS: Remove DNS entires for db2042

https://gerrit.wikimedia.org/r/526762

Change 526762 merged by Marostegui:
[operations/dns@master] DNS: Remove DNS entires for db2042

https://gerrit.wikimedia.org/r/526762

Papaul closed this task as Resolved. · Aug 1 2019, 1:20 PM
Papaul updated the task description.

Complete