
decom spare server lawrencium/WMF3542
Closed, Resolved, Public

Description

Decom spare server lawrencium/WMF3542. It was already reclaimed to spares in T183343.

All that is left is to remove it from the rack, update racktables, and remove the switch port config.


This checklist can be copied and pasted into Phabricator hardware request tasks for reclaiming systems to spares or decommissioning them.

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role::spare::system if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS

  • - disable puppet on host
  • - remove all remaining puppet references (including role::spare)
  • - power down host
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove production dns entries
  • - puppet node clean, puppet node deactivate

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite)
  • - IF DECOM: system unracked and decommissioned (by onsite), update racktables with result
  • - IF DECOM: switch port configuration removed from switch once system is unracked.
  • - IF DECOM: mgmt dns entries removed.
  • - IF RECLAIM: system added back to spares tracking (by onsite)

Event Timeline

RobH triaged this task as Medium priority. Apr 3 2018, 9:42 PM
RobH created this task.
Cmjohnson moved this task from Backlog to Up next on the ops-eqiad board. Apr 10 2018, 2:00 PM

Change 427445 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] removing dns entries for lawrencium

https://gerrit.wikimedia.org/r/427445

Change 427445 merged by Cmjohnson:
[operations/dns@master] removing dns entries for lawrencium

https://gerrit.wikimedia.org/r/427445

Cmjohnson closed this task as Resolved. Apr 18 2018, 5:19 PM

removed from rack, tracking sheet updated.

ema reopened this task as Open. Apr 19 2018, 8:04 AM
ema added a subscriber: ema.

Re-opening: this morning we had two icinga criticals for lawrencium and lawrencium.mgmt being down. Some decom steps seem to have been skipped.

There are still DNS entries in git:

jmm@korn:~/git/dns$ rgrep lawrenc *
templates/10.in-addr.arpa:94 1H IN PTR lawrencium.eqiad.wmnet.
templates/wmnet:lawrencium 1H IN A 10.64.48.94
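
A check like the rgrep above can also be scripted, so that leftover records are caught before a decom task is closed. The following is only a minimal sketch in plain Python, not part of any Wikimedia tooling: the function name `find_host_references` and the idea of pointing it at a local checkout of the dns repository are assumptions made for illustration.

```python
from pathlib import Path


def find_host_references(repo_root, hostname):
    """Return (relative_path, line_number, line) for each line mentioning hostname.

    Hypothetical helper: a pure-Python stand-in for `rgrep <hostname> *`
    run inside a local checkout. It does a plain substring match.
    """
    root = Path(repo_root)
    hits = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything under the .git metadata directory.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file: ignore it rather than abort the scan
        for number, line in enumerate(text.splitlines(), start=1):
            if hostname in line:
                rel = path.relative_to(root).as_posix()
                hits.append((rel, number, line.strip()))
    return hits
```

Calling `find_host_references("path/to/dns/checkout", "lawrenc")` (the path is a placeholder) returns an empty list once every record for the host is gone, so a non-empty result means the "remove production dns entries" step is not finished.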

It's also still in puppet, BTW:

jmm@sarin:~$ sudo cumin lawr*
1 hosts will be targeted:
lawrencium.eqiad.wmnet
DRY-RUN mode enabled, aborting

Dzahn added a subscriber: Dzahn. Jun 7 2018, 1:37 PM

Ticket needs the templated check boxes from https://wikitech.wikimedia.org/wiki/Server_Lifecycle/reclaim_checklist

copying that into ticket description

Dzahn updated the task description. Jun 7 2018, 1:37 PM

Mentioned in SAL (#wikimedia-operations) [2018-07-18T10:30:24Z] <moritzm> ran puppet clean/deactivate for lawrencium, long-standing source of errors in package deployments (T191360)

RobH removed Cmjohnson as the assignee of this task. Jul 18 2018, 6:09 PM
RobH added a subscriber: Cmjohnson.
Cmjohnson moved this task from Up next to Decommission on the ops-eqiad board. Aug 1 2018, 2:31 PM

Change 452397 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/puppet@production] Removing puppet entries from decom host lawrencium

https://gerrit.wikimedia.org/r/452397

Change 452397 abandoned by Cmjohnson:
Removing puppet entries from decom host lawrencium

https://gerrit.wikimedia.org/r/452397

Change 452401 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/puppet@production] Removing puppet entries decom host lawrencium

https://gerrit.wikimedia.org/r/452401

Change 452401 merged by Cmjohnson:
[operations/puppet@production] Removing puppet entries decom host lawrencium

https://gerrit.wikimedia.org/r/452401

Change 452403 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Removing dns for decom host lawrencium

https://gerrit.wikimedia.org/r/452403

Change 452403 merged by Cmjohnson:
[operations/dns@master] Removing dns for decom host lawrencium

https://gerrit.wikimedia.org/r/452403

Cmjohnson updated the task description. Aug 13 2018, 3:45 PM
Cmjohnson moved this task from Decommission to UnRacking Tasks on the ops-eqiad board.
Cmjohnson closed this task as Resolved. Aug 13 2018, 3:52 PM
Cmjohnson claimed this task.
Cmjohnson updated the task description.