
Decommission cp400[1-4]
Closed, Resolved (Public)

Description

cp4001, cp4002, cp4003, cp4004 are ready for decom. They're switched to role::spare::system in puppet and have been freshly reinstalled in that role (no leftover services possible). These do need secure erase of drives to avoid leaking key material.

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role::spare if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS

  • - disable puppet on host
  • - remove all remaining puppet references (including role::spare) https://gerrit.wikimedia.org/r/366037
  • - power down host
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove production dns entries
  • - puppet node clean, puppet node deactivate, salt key removed

END NON-INTERRUPTIBLE STEPS
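
The puppet and salt cleanup in the last step can be prepared per host before running anything. The following is a minimal sketch, not the procedure actually used on this task: it only prints the commands, and it assumes the puppet certname and salt minion ID are both the host FQDN under ulsfo.wmnet. The helper names `expand_hosts` and `cleanup_commands` are hypothetical.

```python
import re


def expand_hosts(pattern):
    """Expand a single-digit range such as 'cp400[1-4]' into hostnames."""
    match = re.fullmatch(r"(.*)\[(\d)-(\d)\](.*)", pattern)
    if match is None:
        # No bracket range present: treat the pattern as one hostname.
        return [pattern]
    prefix, low, high, suffix = match.groups()
    return [f"{prefix}{n}{suffix}" for n in range(int(low), int(high) + 1)]


def cleanup_commands(pattern, domain="ulsfo.wmnet"):
    """Build (but do not run) the per-host puppet and salt cleanup commands.

    Assumption: certname and minion ID equal the FQDN.
    """
    commands = []
    for host in expand_hosts(pattern):
        fqdn = f"{host}.{domain}"
        commands.append(f"puppet node clean {fqdn}")
        commands.append(f"puppet node deactivate {fqdn}")
        commands.append(f"salt-key -d {fqdn}")
    return commands


if __name__ == "__main__":
    # Dry run: print the commands for review instead of executing them.
    for command in cleanup_commands("cp400[1-4]"):
        print(command)
```

Printing rather than executing keeps the list reviewable, which helps catch a host that was skipped.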

  • - system disks wiped (by onsite)
  • - swapped places with new cp systems, and now reside in rack with no cabling.

The remainder cannot happen until we are done with ALL the old CP systems to unrack them in a batch.

  • - system unracked and decommissioned (by onsite), update racktables with result
  • - switch port configuration removed from switch once system is unracked.
  • - mgmt dns entries removed. (systems are in rack, but with no power/network/mgmt connections, because there is no storage in ulsfo and the office has no storage for us during the relocation.)

Details

Related Gerrit Patches:
operations/dns : master : cp400[1-4] decom, mgmt dns removal
operations/puppet : production : decom of cp400[1-4]
operations/dns : master : decom cp400[1-4]

Event Timeline

BBlack created this task. Jun 27 2017, 11:00 PM
RobH claimed this task. Jul 18 2017, 3:54 PM
RobH moved this task from Backlog to Decommission on the ops-ulsfo board.

Mentioned in SAL (#wikimedia-operations) [2017-07-18T19:48:31Z] <robh> starting wipe on cp400[1-4] per T169020

RobH updated the task description. Jul 18 2017, 7:51 PM

Change 366033 had a related patch set uploaded (by RobH; owner: RobH):
[operations/dns@master] decom cp400[1-4]

https://gerrit.wikimedia.org/r/366033

Change 366033 merged by RobH:
[operations/dns@master] decom cp400[1-4]

https://gerrit.wikimedia.org/r/366033

RobH updated the task description. Jul 18 2017, 7:56 PM

robh@asw-ulsfo> show interfaces descriptions | grep cp4001
xe-2/0/0 down down cp4001.ulsfo.wmnet

{master:2}
robh@asw-ulsfo> show interfaces descriptions | grep cp4002
xe-2/0/2 down down cp4002.ulsfo.wmnet

{master:2}
robh@asw-ulsfo> show interfaces descriptions | grep cp4003
xe-1/0/1 down down cp4003.ulsfo.wmnet

{master:2}
robh@asw-ulsfo> show interfaces descriptions | grep cp4004
xe-1/0/2 down down cp4004.ulsfo.wmnet
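
To note the switch port assignments on the task for later removal, the pasted output above can be reduced to a host-to-port mapping. This is a minimal sketch with a hypothetical helper, `parse_port_assignments`; it assumes each result line has exactly four whitespace-separated fields (interface, admin state, link state, description), so descriptions containing spaces would be skipped.

```python
def parse_port_assignments(output):
    """Map each interface description (here, a host FQDN) to its switch port."""
    states = ("up", "down")
    ports = {}
    for line in output.splitlines():
        fields = line.split()
        # Keep only "<interface> <admin> <link> <description>" rows; prompt
        # lines and the "{master:2}" banner do not match this shape.
        if len(fields) == 4 and fields[1] in states and fields[2] in states:
            interface, _admin, _link, description = fields
            ports[description] = interface
    return ports


sample = """\
robh@asw-ulsfo> show interfaces descriptions | grep cp4001
xe-2/0/0 down down cp4001.ulsfo.wmnet

{master:2}
robh@asw-ulsfo> show interfaces descriptions | grep cp4002
xe-2/0/2 down down cp4002.ulsfo.wmnet
"""

if __name__ == "__main__":
    for host, port in sorted(parse_port_assignments(sample).items()):
        print(f"{host} {port}")
```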

Change 366037 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] decom of cp400[1-4]

https://gerrit.wikimedia.org/r/366037

Change 366037 merged by RobH:
[operations/puppet@production] decom of cp400[1-4]

https://gerrit.wikimedia.org/r/366037

RobH updated the task description. Jul 18 2017, 8:02 PM

Setting cp400[1-4] to wipe via usb boot. Since there are two 250GB disks, the wipe will take overnight. I'll come back down later this week or next to pull these and put cp402[1-4] in their places.

cp400[234] had not been through 'puppet node clean' or 'puppet node deactivate', btw; I've done that now.

All of these systems have now been wiped and moved around in the racks in ulsfo. racktables shows their current position, but since the wipe and the move within the rack, they no longer have power/network/mgmt connections.

RobH changed the task status from Open to Stalled. Aug 4 2017, 3:59 PM
RobH lowered the priority of this task from Medium to Low.
RobH removed a project: Patch-For-Review.
RobH updated the task description.
RobH removed subscribers: gerritbot, Stashbot.

Change 370224 had a related patch set uploaded (by RobH; owner: RobH):
[operations/dns@master] cp400[1-4] decom, mgmt dns removal

https://gerrit.wikimedia.org/r/370224

Change 370224 merged by RobH:
[operations/dns@master] cp400[1-4] decom, mgmt dns removal

https://gerrit.wikimedia.org/r/370224

RobH updated the task description. Aug 4 2017, 4:07 PM
RobH removed a project: Patch-For-Review.
RobH removed RobH as the assignee of this task. Dec 14 2017, 7:28 PM
RobH closed this task as Resolved. Jul 18 2018, 5:59 PM
RobH claimed this task.

I'm resolving this, as all the systems have been decommissioned and added to the decommissioned server tracking google sheet. They are still in the rack until we have a pickup of decom systems for resale later this fiscal.