Decom graphite2002 & return server to spares pool
Closed, Resolved · Public

Description

This task tracks the decommission of graphite2002 and the return of the host to the spares pool (as it's less than 4 years old).

This checklist can be copied and pasted into Phabricator hardware request tasks for reclaiming systems to spares or for decom.

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (handled by wmf-decommission-host)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS

  • - disable puppet on host
  • - power down host
  • - update status in netbox (planned for spare)
  • - disable switch port & update to asset tag name on switch port description
  • - remove all remaining puppet references (include role::spare)
  • - remove production dns entries
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - remove mgmt dns entries for the hostname, leave the asset tag as this is going to spares
  • - system disks wiped (by onsite)
  • - remove hostname label of 'graphite2002' as this is returning to spares pool
  • - leave host in rack and cabled after disk wipe, as it's now a spare pool system.
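
The debmonitor step in the checklist above is an HTTP DELETE authenticated with a client certificate. The following is a minimal Python sketch of the same request, not the actual wmf-decommission-host implementation. The URL and certificate paths are copied from the curl command; the function names are hypothetical, and the sketch assumes the system trust store can verify the debmonitor server certificate.

```python
import ssl
import urllib.request

# Values copied from the curl command in the checklist.
DEBMONITOR_BASE = "https://debmonitor.discovery.wmnet/hosts/"
CERT_PATH = "/etc/debmonitor/ssl/cert.pem"
KEY_PATH = "/etc/debmonitor/ssl/server.key"


def build_delete_request(host_fqdn: str) -> urllib.request.Request:
    """Build (but do not send) the DELETE request for one host.

    Hypothetical helper: it only mirrors `curl -X DELETE <base>/<fqdn>`.
    """
    # Reject empty names and anything that would change the URL path.
    if not host_fqdn or "/" in host_fqdn:
        raise ValueError(f"not a plausible FQDN: {host_fqdn!r}")
    return urllib.request.Request(DEBMONITOR_BASE + host_fqdn, method="DELETE")


def remove_from_debmonitor(host_fqdn: str) -> int:
    """Send the DELETE with the client certificate and return the HTTP status.

    Assumption: the default trust store validates the server certificate.
    urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    """
    context = ssl.create_default_context()
    # Equivalent of curl's --cert and --key options.
    context.load_cert_chain(certfile=CERT_PATH, keyfile=KEY_PATH)
    request = build_delete_request(host_fqdn)
    with urllib.request.urlopen(request, context=context) as response:
        return response.status
```

Calling `remove_from_debmonitor("graphite2002.codfw.wmnet")` would only work on a host that holds the certificate and key, run with enough privileges to read them (the curl command uses sudo). Building the request separately from sending it lets the URL and method be checked without network access.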

Event Timeline

Restricted Application added a subscriber: Aklapper. · Jul 23 2018, 3:43 PM
fgiunchedi moved this task from Backlog to Up next on the observability board. · Oct 15 2018, 2:38 PM
RobH triaged this task as Normal priority. · Dec 12 2018, 4:36 PM
RobH updated the task description.

Please don't proceed with decom for now; I'm using graphite2002 for some buster tests.

Dzahn moved this task from Backlog to Decommission on the ops-codfw board. · Apr 12 2019, 12:09 AM
MoritzMuehlenhoff updated the task description.

This is no longer needed for buster install tests and is now good to decommission.

RobH renamed this task from Decom graphite2002 to Decom graphite2002 & return server to spares pool. · Fri, Aug 23, 5:36 PM
RobH updated the task description.

cookbooks.sre.hosts.decommission executed by robh@cumin1001 for hosts: graphite2002.codfw.wmnet

  • graphite2002.codfw.wmnet
    • Removed from Puppet master and PuppetDB
    • Downtimed host on Icinga
    • Downtimed management interface on Icinga
    • Removed from DebMonitor
RobH updated the task description. · Fri, Aug 23, 5:40 PM

Change 531957 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] decom use of graphite2002 hostname

https://gerrit.wikimedia.org/r/531957

Change 531958 had a related patch set uploaded (by RobH; owner: RobH):
[operations/dns@master] decom graphite2002 dns use

https://gerrit.wikimedia.org/r/531958

Change 531957 merged by RobH:
[operations/puppet@production] decom use of graphite2002 hostname

https://gerrit.wikimedia.org/r/531957

Change 531958 merged by RobH:
[operations/dns@master] decom graphite2002 dns use

https://gerrit.wikimedia.org/r/531958

RobH reassigned this task from RobH to Papaul. · Fri, Aug 23, 5:47 PM
RobH removed a project: Patch-For-Review.
RobH updated the task description.
RobH updated the task description.
RobH added subscribers: Papaul, RobH.

@Papaul,

This is ready for disk wipe, hostname label removal, and then to be returned to the spares pool. Please note I've modified the above checklist since this isn't a full decommission and disposal. Once the disks are wiped and the hostname label removed, this can be resolved.

Thanks in advance!

Papaul updated the task description. · Mon, Sep 9, 2:58 PM

Change 535211 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/dns@master] DNS: Remove DNS mgmt asset tag WMF6403

https://gerrit.wikimedia.org/r/535211

Change 535211 merged by Dzahn:
[operations/dns@master] DNS: Remove DNS mgmt asset tag WMF6403

https://gerrit.wikimedia.org/r/535211

Papaul closed this task as Resolved. · Wed, Sep 11, 4:30 PM
Papaul updated the task description.

The server was used to replace the old mw2232; see T232126.