
decommission stat1001
Closed, Resolved, Public

Description

original task request

thorium has fully replaced stat1001. I have disabled active checks and notifications for this node in icinga, and enabled role spare::system. It should be good to disappear. Thanks!

addressing reclaim or decom

The system warranty expired on 2014-04-29, so it is well out of warranty. Hosts of this age are typically decommissioned as they fall out of use.
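For illustration, the out-of-warranty age can be worked out with a short sketch. The reference date is an assumption (the day this task was triaged, Jan 17 2017), and the function name is hypothetical.

```python
from datetime import date

WARRANTY_EXPIRED = date(2014, 4, 29)  # from the task description
REFERENCE_DATE = date(2017, 1, 17)    # assumption: the day this task was triaged


def days_out_of_warranty(expired: date, today: date) -> int:
    """Days elapsed since warranty expiry; a negative value means still covered."""
    return (today - expired).days


days = days_out_of_warranty(WARRANTY_EXPIRED, REFERENCE_DATE)
print(f"{days} days out of warranty (about {days / 365.25:.1f} years)")
```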

steps for decommission

  • remove system from all puppet/hiera/active service use and add to role::spare (otto handled this)

Please note that the patchsets for all remaining steps cannot be merged until the system is offline! Don't merge changes that remove public DNS while the system can still come back online, since it would then hold an IP not tracked in our DNS config.

  • remaining steps must all be done/merged in short order and in sequence. This is non-interrupt based tasking.
  • disable system from all icinga monitoring
  • power down system
  • disable system switch port (THIS MUST HAPPEN BEFORE THE REST HAPPENS)
  • remove salt/puppet entries (puppet node clean, puppet node deactivate, salt-key -d)
  • remove puppet entries - https://gerrit.wikimedia.org/r/#/c/332470/
  • remove system production dns entries - https://gerrit.wikimedia.org/r/#/c/332472/
  • remove system site.pp entries (it was listed there as role spare)

END NON-INTERRUPT STEPS. Once the system switch port is disabled, the system can no longer call in and introduce issues caused by not being active in puppet or not having public dns entries. The remaining steps should be done in a timely fashion, but don't have to happen without interruption.
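The ordering constraint in the non-interrupt steps can be sketched as a small check. This is only an illustration, not tooling used for this task: the step identifiers paraphrase the checklist above and are hypothetical, as are the function names and the `certname` parameter. The three commands are the ones the checklist names, taking `puppet node deactive` to mean `puppet node deactivate`.

```python
# Step identifiers are hypothetical paraphrases of the checklist above.
NON_INTERRUPT_STEPS = [
    "disable_icinga_monitoring",
    "power_down",
    "disable_switch_port",
    "remove_salt_puppet_entries",
    "merge_puppet_removal",
    "merge_production_dns_removal",
    "remove_site_pp_entries",
]

# The step that must come before everything listed after it.
GATE = "disable_switch_port"


def cleanup_commands(certname):
    """Build the cleanup commands named in the checklist for one host.

    `certname` is a hypothetical parameter; the exact name format is an assumption.
    """
    return [
        ["puppet", "node", "clean", certname],
        ["puppet", "node", "deactivate", certname],
        ["salt-key", "-d", certname],
    ]


def violations(done_order):
    """Return steps performed before the switch port was disabled that belong after it."""
    gate_index = NON_INTERRUPT_STEPS.index(GATE)
    must_follow = set(NON_INTERRUPT_STEPS[gate_index + 1:])
    seen_gate = False
    bad = []
    for step in done_order:
        if step == GATE:
            seen_gate = True
        elif step in must_follow and not seen_gate:
            bad.append(step)
    return bad
```

Running `violations` over the order in which steps were actually performed returns an empty list when the switch port was disabled before any removal step, and otherwise names the steps that jumped ahead.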

  • system disks wiped (by on-site dc ops)
  • system unracked, racktables updated
  • remove system switch port configuration
  • system mgmt dns entries removed

After all steps have been taken and the system is fully decommissioned and removed from the rack, this task can be resolved.
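The resolution condition can be sketched in the same spirit. The remaining steps may be completed in any order, so a set is enough to track them. The step identifiers and function names are hypothetical.

```python
# Hypothetical identifiers for the steps that follow the non-interrupt block.
REMAINING_STEPS = {
    "disks_wiped",
    "unracked_racktables_updated",
    "switch_port_config_removed",
    "mgmt_dns_removed",
}


def outstanding(completed):
    """Remaining steps not yet completed, sorted for stable display."""
    return sorted(REMAINING_STEPS - set(completed))


def can_resolve(non_interrupt_done, completed):
    """True only when the non-interrupt block and every remaining step are done."""
    return bool(non_interrupt_done) and not outstanding(completed)
```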

Related Objects

Status    Subtype  Assigned   Task
Resolved           Ottomata
Resolved           Cmjohnson

Event Timeline

Change 332470 had a related patch set uploaded (by Dzahn):
install_server: remove stat1001

https://gerrit.wikimedia.org/r/332470

Change 332472 had a related patch set uploaded (by Dzahn):
remove stat1001, keep mgmt

https://gerrit.wikimedia.org/r/332472

Change 332470 merged by Dzahn:
install_server: remove stat1001

https://gerrit.wikimedia.org/r/332470

Dzahn triaged this task as Medium priority. Jan 17 2017, 3:00 PM
RobH renamed this task from Reclaim/Decommission (specify) stat1001 to decommission stat1001. Jan 17 2017, 5:27 PM
RobH updated the task description.

Change 332472 merged by RobH:
remove stat1001, keep mgmt

https://gerrit.wikimedia.org/r/332472

RobH updated the task description.
RobH moved this task from Backlog to Reclaim (Spares/Decommission) on the hardware-requests board.
Cmjohnson lowered the priority of this task from Medium to Low. Jan 25 2017, 6:26 PM

Change 336802 had a related patch set uploaded (by Filippo Giunchedi):
hieradata: remove stat1001

https://gerrit.wikimedia.org/r/336802

Change 336802 merged by Dzahn:
hieradata: remove stat1001

https://gerrit.wikimedia.org/r/336802