Decommission db2043.codfw.wmnet
Closed, Resolved (Public)

Description

This task will track the decommission-hardware of server db2043.codfw.wmnet

The first 5 steps should be completed by the service owner who is returning the server to DC-Ops (for reclaim to spare or decommissioning, depending on server configuration and age).

db2043
Steps for service owner:

Steps for DC-Ops:

The following steps must not be interrupted, as stopping partway will leave the system in an unfinished state.

Start non-interrupt steps:

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare)
  • - remove production dns entries
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

End non-interrupt steps.
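
For reference, the DebMonitor removal command in the checklist above can be assembled per host as in the minimal sketch below. The URL and certificate paths are copied from the checklist; the function name `debmonitor_delete_argv` and the basic FQDN validation are illustrative assumptions and are not part of wmf-decommission-host. The sketch only builds the command line and does not execute it.

```python
import shlex
from typing import List

# Values copied from the checklist step above.
DEBMONITOR_HOSTS_URL = "https://debmonitor.discovery.wmnet/hosts/"
CERT_PATH = "/etc/debmonitor/ssl/cert.pem"
KEY_PATH = "/etc/debmonitor/ssl/server.key"


def debmonitor_delete_argv(host_fqdn: str) -> List[str]:
    """Return the argv for the DELETE call (hypothetical helper, builds only)."""
    # Assumption: reject obviously malformed names before building the URL.
    if not host_fqdn or "/" in host_fqdn or any(c.isspace() for c in host_fqdn):
        raise ValueError("invalid host FQDN: %r" % host_fqdn)
    return [
        "sudo", "curl", "-X", "DELETE",
        DEBMONITOR_HOSTS_URL + host_fqdn,
        "--cert", CERT_PATH,
        "--key", KEY_PATH,
    ]


if __name__ == "__main__":
    # Print the command for review instead of running it.
    print(shlex.join(debmonitor_delete_argv("db2043.codfw.wmnet")))
```

Returning an argument list rather than a shell string keeps the host name out of shell interpolation if the command is later passed to a process runner.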

  • - label disk #3 as broken so it doesn't get re-used
  • - system disks wiped (by onsite)
  • - IF DECOM: system unracked and decommissioned (by onsite), update racktables with result
  • - IF DECOM: switch port configuration removed from switch once system is unracked.
  • - IF DECOM: add system to decommission tracking google sheet
  • - IF DECOM: mgmt dns entries removed.
  • - IF RECLAIM: system added back to spares tracking (by onsite)

Event Timeline

Marostegui triaged this task as Medium priority.
Marostegui moved this task from Triage to In progress on the DBA board.

Change 529716 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Decommission db2043

https://gerrit.wikimedia.org/r/529716

Change 529717 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad,db-codfw.php: Remove db2043 from config

https://gerrit.wikimedia.org/r/529717

Change 529717 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad,db-codfw.php: Remove db2043 from config

https://gerrit.wikimedia.org/r/529717

Mentioned in SAL (#wikimedia-operations) [2019-08-12T09:21:23Z] <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Remove db2043 from config T230311 (duration: 00m 48s)

Mentioned in SAL (#wikimedia-operations) [2019-08-12T09:22:02Z] <marostegui> Remove db2043 from tendril and zarcillo T230311

Mentioned in SAL (#wikimedia-operations) [2019-08-12T09:22:19Z] <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Remove db2043 from config T230311 (duration: 00m 47s)

Change 529716 merged by Marostegui:
[operations/puppet@production] mariadb: Decommission db2043

https://gerrit.wikimedia.org/r/529716

Marostegui updated the task description.
Marostegui edited projects, added ops-codfw; removed Patch-For-Review, DBA.

This host is ready for DC-Ops to decommission

Change 529851 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] install_server: Remove db2043

https://gerrit.wikimedia.org/r/529851

Change 529851 merged by Marostegui:
[operations/puppet@production] install_server: Remove db2043

https://gerrit.wikimedia.org/r/529851

cookbooks.sre.hosts.decommission executed by marostegui@cumin1001 for hosts: db2043.codfw.wmnet

  • db2043.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 538157 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] db2043: Remove it from puppet

https://gerrit.wikimedia.org/r/538157

Change 538157 merged by Marostegui:
[operations/puppet@production] db2043: Remove it from puppet

https://gerrit.wikimedia.org/r/538157

Change 538158 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/dns@master] db2043: Decom dns production entries

https://gerrit.wikimedia.org/r/538158

Change 538158 merged by Marostegui:
[operations/dns@master] db2043: Decom dns production entries

https://gerrit.wikimedia.org/r/538158

Marostegui updated the task description.
Marostegui added subscribers: Muehlenhoff, Papaul.

I have tried the new cookbook to decom servers after having a chat with @Muehlenhoff - I think this is now ready for @Papaul to fully decom it!

papaul@asw-c-codfw# show | compare 
[edit interfaces interface-range vlan-private1-c-codfw]
-    member ge-6/0/12;
[edit interfaces interface-range disabled]
     member ge-6/0/8 { ... }
+    member ge-6/0/12;
[edit interfaces]
-   ge-6/0/12 {
-       description db2043;
-       enable;
-   }
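
The switch diff above moves ge-6/0/12 from the vlan-private1-c-codfw interface range to the disabled range and drops its per-interface stanza. A minimal sketch for summarizing such `show | compare` output by hierarchy follows; the function name `summarize_compare` and the parsing rules (section headers in `[edit ...]`, leading `+`/`-` markers) are assumptions based only on the output pasted here, not on a Junos specification.

```python
from collections import defaultdict
from typing import Dict, List

# Diff body copied from the comment above (prompt line omitted).
SAMPLE = """\
[edit interfaces interface-range vlan-private1-c-codfw]
-    member ge-6/0/12;
[edit interfaces interface-range disabled]
     member ge-6/0/8 { ... }
+    member ge-6/0/12;
[edit interfaces]
-   ge-6/0/12 {
-       description db2043;
-       enable;
-   }
"""


def summarize_compare(diff_text: str) -> Dict[str, Dict[str, List[str]]]:
    """Group added and removed lines under their '[edit ...]' hierarchy."""
    summary = defaultdict(lambda: {"added": [], "removed": []})
    section = None
    for raw in diff_text.splitlines():
        line = raw.strip()
        if line.startswith("[edit") and line.endswith("]"):
            section = line[1:-1]  # drop the surrounding brackets
            continue
        if section is None or not line:
            continue  # skip anything before the first section header
        if line[0] == "+":
            summary[section]["added"].append(line[1:].strip())
        elif line[0] == "-":
            summary[section]["removed"].append(line[1:].strip())
        # unmarked lines are context and are ignored
    return dict(summary)


if __name__ == "__main__":
    for section, changes in summarize_compare(SAMPLE).items():
        print(section, changes)
```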

Change 540230 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/dns@master] DNS: Remove mgmt DNS for db2041 db204[3-4] db204[6-7] and db2049

https://gerrit.wikimedia.org/r/540230

Change 540230 merged by Papaul:
[operations/dns@master] DNS: Remove mgmt DNS for db2041 db204[3-4] db204[6-7] and db2049

https://gerrit.wikimedia.org/r/540230
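
The commit message above uses bracket ranges such as db204[3-4] to name several hosts at once. A minimal sketch of how that shorthand expands is shown below; `expand_hosts` and `expand_all` are hypothetical helpers written for illustration, handle only a single numeric `[start-end]` range per name, and are not the syntax parser of any Wikimedia tool.

```python
import re
from typing import List

# One numeric range in brackets, with optional text before and after it.
_RANGE = re.compile(
    r"^(?P<prefix>[^\[\]]*)\[(?P<start>\d+)-(?P<end>\d+)\](?P<suffix>[^\[\]]*)$"
)


def expand_hosts(expr: str) -> List[str]:
    """Expand 'db204[3-4]' into ['db2043', 'db2044']; plain names pass through."""
    match = _RANGE.match(expr)
    if match is None:
        return [expr]
    start, end = int(match.group("start")), int(match.group("end"))
    if start > end:
        raise ValueError("descending range in %r" % expr)
    width = len(match.group("start"))  # preserve zero padding, if any
    return [
        "%s%0*d%s" % (match.group("prefix"), width, number, match.group("suffix"))
        for number in range(start, end + 1)
    ]


def expand_all(text: str) -> List[str]:
    """Expand every whitespace-separated name in text, keeping order."""
    hosts = []
    for token in text.split():
        hosts.extend(expand_hosts(token))
    return hosts


if __name__ == "__main__":
    print(expand_all("db2041 db204[3-4] db204[6-7] db2049"))
```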

Papaul updated the task description.

Complete

DannyS712 subscribed.

[batch] remove patch for review tag from resolved tasks