
Decommission esams ms-fe / ms-be
Open, Normal · Public

Description

The esams machines used for Swift are old and no longer used for production purposes, so we should decommission them.


  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp entry (replace with role::spare if the system isn't shut down immediately during this process)

START NON-INTERRUPTIBLE STEPS

  • - disable puppet on host
  • - remove all remaining puppet references (including role::spare)
  • - power down host
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove production dns entries
  • - puppet node clean, puppet node deactivate

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite)
  • - IF DECOM: system unracked and decommissioned (by onsite), update racktables with result
  • - IF DECOM: switch port configuration removed from switch once system is unracked.
  • - IF DECOM: mgmt dns entries removed.
  • - IF RECLAIM: system added back to spares tracking (by onsite)
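
As a rough illustration of the non-interruptible block in the checklist above, here is a minimal Python sketch that turns those steps into an ordered dry-run plan for one host. It only builds and prints command lines and runs nothing. The function names `decom_plan` and `render`, the FQDN `ms-fe3001.esams.wmnet`, and the use of `poweroff` for "power down host" are assumptions made for the example, not taken from this task. Steps with no obvious single command (Gerrit changes, switch port work, DNS) are left as manual reminders.

```python
import shlex


def decom_plan(fqdn, reason):
    """Return (checklist step, commands) pairs in checklist order.

    An empty command list marks a step that is done by hand
    (for example via a Gerrit change or on the switch).
    """
    return [
        # Run on the host being decommissioned.
        ("disable puppet on host", [["puppet", "agent", "--disable", reason]]),
        ("remove all remaining puppet references", []),
        # Assumption: the task does not say how the host is powered down.
        ("power down host", [["poweroff"]]),
        ("disable switch port", []),
        ("switch port assignment noted on this task", []),
        ("remove production dns entries", []),
        # Run on the Puppet master side once the host is down.
        ("puppet node clean, puppet node deactivate", [
            ["puppet", "node", "clean", fqdn],
            ["puppet", "node", "deactivate", fqdn],
        ]),
    ]


def render(plan):
    """Flatten a plan into printable lines; manual steps become comments."""
    lines = []
    for step, commands in plan:
        if not commands:
            lines.append(f"# MANUAL: {step}")
        for cmd in commands:
            lines.append(shlex.join(cmd))
    return lines


if __name__ == "__main__":
    # Hypothetical FQDN; only the short host name appears on this task.
    for line in render(decom_plan("ms-fe3001.esams.wmnet", "decom T169518")):
        print(line)
```

Printing the plan rather than executing it keeps the sketch safe to run anywhere and makes the required ordering easy to review before anyone touches a host.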

Event Timeline

Restricted Application added a subscriber: Aklapper. · Jul 3 2017, 10:59 AM

Change 362965 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] Decom swift cluster in esams

https://gerrit.wikimedia.org/r/362965

Change 362965 merged by Filippo Giunchedi:
[operations/puppet@production] Decom swift cluster in esams

https://gerrit.wikimedia.org/r/362965

Change 362968 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/dns@master] Decom ms-fe.svc.esams.wmnet

https://gerrit.wikimedia.org/r/362968

Change 362968 merged by Filippo Giunchedi:
[operations/dns@master] Decom ms-fe.svc.esams.wmnet

https://gerrit.wikimedia.org/r/362968

fgiunchedi moved this task from Backlog to Radar on the User-fgiunchedi board.

I've reimaged all ms-be / ms-fe hosts in esams and wiped the data disks on the former. All that is left is to wipe the OS disks when the time comes for decom.

fgiunchedi triaged this task as Normal priority. · Jul 19 2017, 10:09 AM
faidon moved this task from Backlog to Decommission on the ops-esams board. · Aug 29 2017, 3:05 PM
Dzahn updated the task description. · Jan 9 2018, 6:03 PM
Dzahn updated the task description.
Dzahn updated the task description.

Change 403215 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] decom esams swift machines, rm from puppet/dhcp

https://gerrit.wikimedia.org/r/403215

Change 403216 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/dns@master] decom esams swift machines, keep mgmt

https://gerrit.wikimedia.org/r/403216

Dzahn updated the task description. · Jan 9 2018, 6:21 PM

Change 403215 merged by Dzahn:
[operations/puppet@production] decom esams swift machines, rm from puppet/dhcp

https://gerrit.wikimedia.org/r/403215

Mentioned in SAL (#wikimedia-operations) [2018-01-09T18:42:53Z] <mutante> ms-fe3002,ms-fe3001 - powering down, removing from puppet and icinga, ms-be* removing from puppet/icinga (T169518)

Change 403216 merged by Dzahn:
[operations/dns@master] decom esams swift machines, keep mgmt

https://gerrit.wikimedia.org/r/403216

Dzahn updated the task description. · Jan 9 2018, 6:57 PM
Dzahn assigned this task to mark. · Jan 9 2018, 7:10 PM
Dzahn added subscribers: mark, Dzahn.

@mark @fgiunchedi They are shut down and removed from Icinga and DNS now. The only part I could not do was "disable switch port", due to lack of access. I copied the checkboxes from the decom template on the wikitech server lifecycle page.

Thanks a lot @Dzahn for taking care of this!

fgiunchedi rescinded a token.
Dzahn removed a subscriber: Dzahn. · Jul 24 2018, 9:09 PM