Decommission ms-fe100[1-4]
Closed, Resolved · Public

Description

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp entry (replace with role::spare if the system isn't shut down immediately during this process)

START NON-INTERRUPTIBLE STEPS

  • - disable puppet on host
  • - remove all remaining puppet references (including role::spare)
  • - power down host
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove production dns entries
  • - puppet node clean, puppet node deactivate, salt key removed

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite)
  • - IF DECOM: system unracked and decommissioned (by onsite), update racktables with result
  • - IF DECOM: switch port configuration removed from switch once system is unracked.
  • - IF DECOM: mgmt dns entries removed.
  • - IF RECLAIM: system added back to spares tracking (by onsite)
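
As a rough illustration of the host-side part of the non-interruptible steps above, the following sketch expands the `ms-fe100[1-4]` range from the task title and prints, as a dry run, the commands named in the checklist and the SAL entries (puppet disable, shutdown, puppet node clean/deactivate, salt key removal). The helper names `expand_hosts` and `decom_commands` are hypothetical, the disable message is made up for the example, and whether the tools expect short hostnames or FQDNs is an assumption. The switch port and DNS steps are left out because they happen on the switch and in the operations/dns repository.

```python
import re
import shlex


def expand_hosts(pattern):
    """Expand a single bracketed digit range, e.g. 'ms-fe100[1-4]'."""
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", pattern)
    if match is None:
        return [pattern]  # no range present: treat as a literal hostname
    prefix, start, end, suffix = match.groups()
    width = len(start)  # preserve zero padding, if any
    return [
        f"{prefix}{n:0{width}d}{suffix}"
        for n in range(int(start), int(end) + 1)
    ]


def decom_commands(host):
    """Return the commands for one host as strings; nothing is executed."""
    name = shlex.quote(host)
    return [
        # Run on the host being decommissioned.
        f"puppet agent --disable {shlex.quote('decom T160986')}",
        "shutdown -h now",
        # Run on the puppet/salt master (assumption: short hostname suffices).
        f"puppet node clean {name}",
        f"puppet node deactivate {name}",
        f"salt-key -d {name}",
    ]


# Dry run: print the plan per host instead of executing anything.
for host in expand_hosts("ms-fe100[1-4]"):
    print(f"# {host}")
    for command in decom_commands(host):
        print(command)
```

Printing the plan rather than running it keeps the sketch safe to try; since these steps are not meant to be interrupted, reviewing the full list for all four hosts before starting seems the more cautious design.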

Event Timeline

Dzahn triaged this task as Medium priority. Mar 28 2017, 12:23 AM

Change 345075 had a related patch set uploaded (by Dzahn):
[operations/puppet@production] decom ms-fe100[1-4], remove from DHCP and puppet

https://gerrit.wikimedia.org/r/345075

Change 345076 had a related patch set uploaded (by Dzahn):
[operations/dns@master] remove production IPs for ms-fe100[1-4]

https://gerrit.wikimedia.org/r/345076

Change 345075 merged by Dzahn:
[operations/puppet@production] decom ms-fe100[1-4], remove from DHCP and puppet

https://gerrit.wikimedia.org/r/345075

Mentioned in SAL (#wikimedia-operations) [2017-03-28T20:06:56Z] <mutante> ms-fe100[1-4] - disable/stop puppet, stop salt minion, decom (T160986)

Mentioned in SAL (#wikimedia-operations) [2017-03-28T20:24:55Z] <mutante> ms-fe1001 thru msfe1004 - scheduled last downtime for host and services in icinga - shutdown -h now, turn them off, revoke puppet certs, salt-keys... (T160986)

Change 345076 merged by Dzahn:
[operations/dns@master] remove production IPs for ms-fe100[1-4]

https://gerrit.wikimedia.org/r/345076

Dzahn added subscribers: RobH, Dzahn.

@RobH all steps done up to switch ports, per checked boxes above, could you disable the ports? thanks

RobH added a subscriber: Cmjohnson.

So, in trying to find these ports, I see they do not have a description set on the switch. So @Cmjohnson will have to manually trace the ports, disable them on the switch, and note their port assignments on this task.

Then the rest of the checklist can continue.

@RobH ms-fe1001 and ms-fe1002 switch ports have been allocated to other servers. ms-fe1003 and ms-fe1004 were still labeled, but I have since removed them

xe-8/0/13 up down ms-fe1003
xe-8/0/14 up down ms-fe1004
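
For anyone who wants to record the port assignments above in a structured form, here is a minimal sketch that parses the two lines into records. It assumes the columns are interface, administrative state, link state, and description (the layout of a switch interface description listing); that column meaning is an assumption, and the names `PortRecord` and `parse_ports` are hypothetical.

```python
from typing import NamedTuple


class PortRecord(NamedTuple):
    interface: str
    admin: str        # assumed: administrative state of the port
    link: str         # assumed: physical link state
    description: str  # label on the port, here the old hostname


def parse_ports(text):
    """Parse whitespace-separated port lines; skip lines that do not fit."""
    records = []
    for line in text.splitlines():
        fields = line.strip().split(None, 3)
        if len(fields) != 4:
            continue
        records.append(PortRecord(*fields))
    return records


ports = parse_ports("""\
xe-8/0/13 up down ms-fe1003
xe-8/0/14 up down ms-fe1004
""")

# Map each decommissioned host to the switch port it used.
for port in ports:
    print(f"{port.description}: {port.interface}")
```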

Cmjohnson lowered the priority of this task from Medium to Low. Apr 27 2017, 7:41 PM
Cmjohnson updated the task description.