decom radium
Closed, Resolved · Public

Description

radium has been replaced by torrelay1001 in the parent task T196701.

This task is to decom radium.

Waiting a few days before starting on this. The data is currently still available in /var/lib/to in case we want to revert.


checklist copied from https://phabricator.wikimedia.org/P7432

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS

  • - disable puppet on host
  • - power down host
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare)
  • - remove production dns entries
  • - puppet node clean, puppet node deactivate
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite)
  • - IF DECOM: system unracked and decommissioned (by onsite), update racktables with result
  • - IF DECOM: switch port configuration removed from switch once system is unracked.
  • - IF DECOM: add system to decommission tracking google sheet
  • - IF DECOM: mgmt dns entries removed.
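
A few of the non-interruptible steps in the checklist above map directly to commands. The following is only a dry-run sketch: it prints the commands for review and runs nothing. The function name `decom_plan` is hypothetical and not part of any existing tooling, and where each command has to run (the host itself, a puppetmaster, neodymium/sarin) is an assumption to check against the checklist. The debmonitor URL and certificate paths are copied from the checklist item.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the commands, executes none of them.
decom_plan() {
    host_fqdn="$1"

    # Reject empty, malformed, or non-qualified names before printing anything.
    case "$host_fqdn" in
        ''|*[!a-z0-9.-]*|.*|*.|*..*)
            echo "invalid FQDN: $host_fqdn" >&2
            return 1
            ;;
        *.*) ;;  # at least one dot: treat as fully qualified
        *)
            echo "not a fully qualified name: $host_fqdn" >&2
            return 1
            ;;
    esac

    # Assumption: run on the host being decommissioned.
    printf 'sudo puppet agent --disable "decom %s"\n' "$host_fqdn"

    # Assumption: run on a puppetmaster.
    printf 'sudo puppet node clean %s\n' "$host_fqdn"
    printf 'sudo puppet node deactivate %s\n' "$host_fqdn"

    # From the checklist: run on neodymium/sarin.
    printf 'sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/%s --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key\n' "$host_fqdn"
}
```

Usage would be `decom_plan radium.wikimedia.org`, then copying each printed line to the appropriate machine. Printing instead of executing keeps a mistyped hostname from doing damage, which matters for steps that should not be interrupted halfway.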

Details

Related Gerrit Patches:
[operations/dns@master] DNS: Remove mgmt DNS for radium,db1069,db1072 and db1073
[operations/puppet@production] decom radium puppet repo entries
[operations/dns@master] decom radium prod dns
[operations/puppet@production] decom radon
[operations/puppet@production] remove hosts/radium.yaml from Hiera
[operations/puppet@production] site: turn radium into a spare system

Event Timeline

Dzahn triaged this task as Medium priority. · Sep 8 2018, 12:57 AM
Dzahn created this task.
Restricted Application removed a project: Patch-For-Review. · Sep 8 2018, 12:57 AM
Restricted Application added a subscriber: Aklapper.
Dzahn updated the task description. · Sep 8 2018, 12:57 AM

Change 458946 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] site: turn radium into a spare system

https://gerrit.wikimedia.org/r/458946

Dzahn added a comment. · Sep 8 2018, 1:00 AM

turning it into a spare::system already to remove unused Icinga monitoring, stop the rsync service via puppet etc

Dzahn changed the task status from Open to Stalled. · Sep 8 2018, 1:00 AM

Change 458946 merged by Dzahn:
[operations/puppet@production] site: turn radium into a spare system

https://gerrit.wikimedia.org/r/458946

Change 459878 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] remove hosts/radium.yaml from Hiera

https://gerrit.wikimedia.org/r/459878

Change 459878 merged by Dzahn:
[operations/puppet@production] remove hosts/radium.yaml from Hiera

https://gerrit.wikimedia.org/r/459878

Dzahn changed the task status from Stalled to Open. · Sep 13 2018, 12:59 AM
Dzahn removed projects: Patch-For-Review, Tor.
Dzahn changed Risk Rating from N/A to default.
Dzahn updated the task description. · Sep 13 2018, 8:11 PM
Dzahn added a project: decommission.
Dzahn updated the task description. · Sep 13 2018, 8:24 PM
Dzahn removed Dzahn as the assignee of this task. · Sep 13 2018, 8:32 PM

wmf-decommission-host was executed by robh for radon.wikimedia.org and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

RobH added a comment. · Sep 18 2018, 9:33 PM

radon network port asw2-c-eqiad:ge-4/0/25

Change 461226 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] decom radon

https://gerrit.wikimedia.org/r/461226

Change 461226 merged by RobH:
[operations/puppet@production] decom radon

https://gerrit.wikimedia.org/r/461226

RobH assigned this task to Cmjohnson. · Sep 18 2018, 9:38 PM
RobH edited projects, added ops-eqiad; removed Patch-For-Review.
RobH updated the task description.
Dzahn added a comment. · Edited · Sep 18 2018, 9:38 PM

@RobH This ticket is about radium but there is also a decom ticket for radon at the same time at T202040. I think they got mixed up above. Just wanted to let you know.

RobH moved this task from Backlog to Decommission on the ops-eqiad board.
RobH claimed this task. · Sep 18 2018, 9:41 PM
RobH added a subscriber: Cmjohnson.

Please note I did indeed swap the references around; all the entries for radon should have gone to T202040, so I am stealing this task back for the radium decom.

RobH updated the task description. · Sep 18 2018, 9:41 PM

wmf-decommission-host was executed by robh for radium.wikimedia.org and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

RobH added a comment. · Sep 18 2018, 9:54 PM

radium network port is asw-a-eqiad:ge-3/0/0

RobH updated the task description. · Sep 18 2018, 9:54 PM

Change 461234 had a related patch set uploaded (by RobH; owner: RobH):
[operations/dns@master] decom radium prod dns

https://gerrit.wikimedia.org/r/461234

Change 461235 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] decom radium puppet repo entries

https://gerrit.wikimedia.org/r/461235

Change 461234 merged by RobH:
[operations/dns@master] decom radium prod dns

https://gerrit.wikimedia.org/r/461234

Change 461235 merged by RobH:
[operations/puppet@production] decom radium puppet repo entries

https://gerrit.wikimedia.org/r/461235

RobH reassigned this task from RobH to Cmjohnson. · Sep 18 2018, 9:59 PM

OK, this is now all set; radium is ready for the onsite decom steps.

Jclark-ctr updated the task description. · Nov 28 2019, 12:05 AM
Papaul updated the task description. · Dec 6 2019, 4:30 PM

Change 555542 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/dns@master] DNS: Remove mgmt DNS for radium,db1069,db1072 and db1073

https://gerrit.wikimedia.org/r/555542

Change 555542 merged by Papaul:
[operations/dns@master] DNS: Remove mgmt DNS for radium,db1069,db1072 and db1073

https://gerrit.wikimedia.org/r/555542

Papaul closed this task as Resolved. · Dec 6 2019, 4:53 PM
Papaul updated the task description.
Papaul added a subscriber: Papaul.

complete