Decommission rhenium
Closed, Resolved (Public)

Description

This task will track the decommission of rhenium.wikimedia.org

The first 5 steps should be completed by the service owner who is returning the server to DC-ops (for reclaim to spare or decommissioning, depending on server configuration and age).

Steps for service owner:

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp entry, replace with role(spare::system)
  • - unassign service owner from this task, check off completed steps, and assign to @RobH for followup on below steps.

Steps for DC-Ops:

The following steps cannot be interrupted, as doing so would leave the system in an unfinished state.

Start non-interrupt steps:

End non-interrupt steps.

  • - system disks wiped (by onsite)
  • - IF DECOM: system unracked and decommissioned (by onsite), update netbox with result and change status of hardware to 'offline' when unracked.
  • - IF DECOM: switch port configuration removed from switch once system is unracked.
  • - IF DECOM: add system to decommission tracking google sheet
  • - IF DECOM: mgmt dns entries removed.
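
The checklist above can be modelled as data, which makes the handoff rule (service-owner steps first, then DC-Ops) and the "IF DECOM" gating explicit. This is only a minimal sketch: the `Step` class, the helper names, and the done/not-done states shown are hypothetical illustrations, not part of any Wikimedia tooling or the actual state of this task.

```python
from dataclasses import dataclass


@dataclass
class Step:
    description: str
    done: bool = False
    decom_only: bool = False  # mirrors the "IF DECOM:" prefix in the checklist


def pending(steps, is_decom):
    """Return the steps that still apply and are not yet checked off."""
    return [s for s in steps if not s.done and (is_decom or not s.decom_only)]


def ready_for_dcops(owner_steps):
    """Handoff rule: every service-owner step must be checked off first."""
    return all(s.done for s in owner_steps)


# Illustrative states only (assumption): owner steps done, disks wiped.
owner_steps = [
    Step("all system services confirmed offline from production use", done=True),
    Step("icinga checks set to maint mode/disabled", done=True),
    Step("removed from lvs/pybal active configuration", done=True),
    Step("service group puppet/hiera/dsh config removed", done=True),
    Step("site.pp entry replaced with role(spare::system)", done=True),
]

dcops_steps = [
    Step("system disks wiped", done=True),
    Step("system unracked, netbox status set to offline", decom_only=True),
    Step("switch port configuration removed", decom_only=True),
    Step("added to decommission tracking sheet", decom_only=True),
    Step("mgmt dns entries removed", decom_only=True),
]

if __name__ == "__main__":
    print("ready for DC-Ops:", ready_for_dcops(owner_steps))
    for step in pending(dcops_steps, is_decom=True):
        print("pending:", step.description)
```

For a reclaim to spare rather than a decommission, calling `pending(dcops_steps, is_decom=False)` skips every "IF DECOM" step, so only the disk wipe would apply.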

Event Timeline

Change 512327 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Decommission rhenium

https://gerrit.wikimedia.org/r/512327

Change 512327 merged by Muehlenhoff:
[operations/puppet@production] Decommission rhenium

https://gerrit.wikimedia.org/r/512327

MoritzMuehlenhoff updated the task description.

cookbooks.sre.hosts.decommission executed by robh@cumin1001 for hosts: rhenium.wikimedia.org

  • rhenium.wikimedia.org
    • Removed from Puppet master and PuppetDB
    • Downtimed host on Icinga
    • Downtimed management interface on Icinga
    • Removed from DebMonitor

Change 525635 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] decom rhenium

https://gerrit.wikimedia.org/r/525635

Change 525636 had a related patch set uploaded (by RobH; owner: RobH):
[operations/dns@master] decom rhenium prod dns

https://gerrit.wikimedia.org/r/525636

Change 525635 merged by RobH:
[operations/puppet@production] decom rhenium

https://gerrit.wikimedia.org/r/525635

Change 525636 merged by RobH:
[operations/dns@master] decom rhenium prod dns

https://gerrit.wikimedia.org/r/525636

RobH removed RobH as the assignee of this task. (Jul 25 2019, 7:48 PM)
RobH removed a project: Patch-For-Review.
RobH updated the task description.
Cmjohnson subscribed.

John, please wipe the servers, remove from the rack, update netbox and the tracking sheet. Assign back to me once you finish so I can kill the switch ports.

Jclark-ctr updated the task description.
Jclark-ctr subscribed.

Host wiped and removed from rack, netbox updated with status set to offline, and host added to hardware tracking sheet.

papaul@asw2-a-eqiad# show | compare 
[edit interfaces]
-   ge-4/0/17 {
-       description rhenium;
-   }

Change 541928 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/dns@master] DNS: Remove mgmt DNS for rhenium and lithium

https://gerrit.wikimedia.org/r/541928

Papaul added a subscriber: RobH.

Change 541928 merged by Papaul:
[operations/dns@master] DNS: Remove mgmt DNS for rhenium and lithium

https://gerrit.wikimedia.org/r/541928
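
Once the DNS changes above are merged and deployed, one way to sanity-check the removal is to confirm that the names no longer resolve. The following is a rough sketch under stated assumptions: the helper names are hypothetical, the resolver is injectable so the logic can be exercised without network access, and results depend on resolver caching and on whether the checking host can see the zone in question (management names are typically internal only).

```python
import socket


def still_resolves(hostname, resolver=socket.getaddrinfo):
    """Return True if hostname still resolves, False if the lookup fails."""
    try:
        resolver(hostname, None)
    except socket.gaierror:
        return False
    return True


def leftover_records(hostnames, resolver=socket.getaddrinfo):
    """Return the subset of hostnames that still have DNS records."""
    return [h for h in hostnames if still_resolves(h, resolver)]
```

For example, `leftover_records(["rhenium.wikimedia.org"])` would be expected to return an empty list after the production DNS entry is gone; any name still listed would need follow-up.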