Decommission baham
Closed, Resolved (Public)

Description

baham has been replaced by authdns2001 (T196664). It's time to decommission it.

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp entry (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS

  • - disable puppet on host
  • - power down host
  • - disable switch port
  • - set to inventory status in netbox
  • - switch port assignment noted on this task (for later removal) asw-a-codfw:ge-5/0/14
  • - remove all remaining puppet references (including role::spare)
  • - remove production dns entries
  • - puppet node clean, puppet node deactivate
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key

END NON-INTERRUPTIBLE STEPS
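
The last two steps of the non-interruptible block (puppet node clean/deactivate and the DebMonitor DELETE) are easy to mistype when the FQDN is substituted by hand. A minimal sketch, assuming Python 3.8+, that only builds and prints the command lines for a given host without running anything. The function names `cleanup_commands` and `render` are hypothetical helpers, not part of any existing tooling; the URL and certificate paths are copied from the checklist above. The checklist does not say which host or privileges the puppet commands need, so they are emitted without sudo.

```python
import shlex
from typing import List

# Values copied from the checklist above.
DEBMONITOR_URL = "https://debmonitor.discovery.wmnet/hosts/{fqdn}"
CERT = "/etc/debmonitor/ssl/cert.pem"
KEY = "/etc/debmonitor/ssl/server.key"


def cleanup_commands(host_fqdn: str) -> List[List[str]]:
    """Return argv lists for the Puppet and DebMonitor cleanup steps.

    Hypothetical helper: it builds the commands only and never runs them.
    """
    # Reject values that would silently produce a wrong URL or command.
    if not host_fqdn or "/" in host_fqdn or any(c.isspace() for c in host_fqdn):
        raise ValueError(f"suspicious host name: {host_fqdn!r}")
    return [
        ["puppet", "node", "clean", host_fqdn],
        ["puppet", "node", "deactivate", host_fqdn],
        ["sudo", "curl", "-X", "DELETE",
         DEBMONITOR_URL.format(fqdn=host_fqdn),
         "--cert", CERT, "--key", KEY],
    ]


def render(host_fqdn: str) -> str:
    """Render the commands as shell-quoted lines for copy and paste."""
    return "\n".join(shlex.join(cmd) for cmd in cleanup_commands(host_fqdn))


if __name__ == "__main__":
    print(render("baham.wikimedia.org"))
```

Printing rather than executing keeps the operator in control of where each command runs, which matters here because the puppet and DebMonitor steps are run from different hosts.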

  • - system disks wiped (by onsite)
  • - IF DECOM: system unracked and decommissioned (by onsite), update netbox with status of 'offline'
  • - IF DECOM: switch port configuration removed from switch once system is unracked.
  • - IF DECOM: add system to decommission tracking google sheet
  • - IF DECOM: mgmt dns entries removed.

Event Timeline

Vgutierrez triaged this task as Medium priority. Jul 10 2018, 4:34 PM
Vgutierrez moved this task from Backlog to Hardware on the Traffic board.

Change 445109 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] site: reimage baham as spare server

https://gerrit.wikimedia.org/r/445109

Change 445109 merged by Vgutierrez:
[operations/puppet@production] site: reimage baham as spare server

https://gerrit.wikimedia.org/r/445109

Mentioned in SAL (#wikimedia-operations) [2018-07-11T10:08:08Z] <vgutierrez> reimage baham as spare system - T199247

Script wmf-auto-reimage was launched by vgutierrez on neodymium.eqiad.wmnet for hosts:

baham.wikimedia.org

The log can be found in /var/log/wmf-auto-reimage/201807111008_vgutierrez_12225_baham_wikimedia_org.log.

Completed auto-reimage of hosts:

['baham.wikimedia.org']

and were ALL successful.

wmf-decommission-host was executed by robh for baham.wikimedia.org and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

Change 486703 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] decom baham

https://gerrit.wikimedia.org/r/486703

Change 486704 had a related patch set uploaded (by RobH; owner: RobH):
[operations/dns@master] decom baham prod dns entries

https://gerrit.wikimedia.org/r/486704

Change 486703 merged by RobH:
[operations/puppet@production] decom baham

https://gerrit.wikimedia.org/r/486703

Change 486704 merged by RobH:
[operations/dns@master] decom baham prod dns entries

https://gerrit.wikimedia.org/r/486704

RobH removed projects: Patch-For-Review, Traffic.
RobH updated the task description.
RobH moved this task from Backlog to pending onsite steps (codfw) on the decommission-hardware board.
RobH removed subscribers: ops-monitoring-bot, Stashbot.
RobH subscribed.

Ready for onsite wipe and decom steps.

RobH updated the task description.

Change 490102 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/dns@master] DNS: Remove mgmt DNS for baham

https://gerrit.wikimedia.org/r/490102

Change 490102 merged by Dzahn:
[operations/dns@master] DNS: Remove mgmt DNS for baham

https://gerrit.wikimedia.org/r/490102