These two machines are now unused, having been replaced by graphite1004. They can be returned to the spares pool at the beginning of December, which leaves some time to come back to them if graphite1004 doesn't work for some reason.
This task tracks the #decommission of servers graphite1001 and graphite1003. Both systems are out of warranty, so their return to spares or disposal needs approval. Their warranties expired in January and February of 2018, so they were purchased in 2015 and are now both over 4 years old. @robh coordinated with @faidon via IRC on 2019-02-14 and confirmed that we should decommission them.
The steps listed under "Steps for service owner" should be completed by the service owner who is returning the server to DC-ops (for reclaim to spares or decommissioning, depending on server configuration and age).
graphite1001:
Steps for service owner:
[x] - all system services confirmed offline from production use
[x] - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
[x] - remove system from all lvs/pybal active configuration
[x] - any service group puppet/hiera/dsh config removed
[x] - remove site.pp (replace with role::spare::system if system isn't shut down immediately during this process.)
[x] - unassign service owner from this task, check off completed steps, and assign to @robh for followup on below steps.
Steps for DC-Ops:
The following steps must not be interrupted, as doing so will leave the system in an unfinished state.
**Start non-interrupt steps:**
[x] - disable puppet on host
[x] - power down host
[x] - update netbox status to Inventory (if decom) or Planned (if spare)
[x] - disable switch port
[x] - switch port assignment noted on this task (for later removal) asw2-c-eqiad:ge-4/0/6
[] - remove all remaining puppet references (including role::spare)
[] - remove production dns entries
[] - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
[] - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)
**End non-interrupt steps.**
[] - system disks wiped (by onsite)
[] - IF DECOM: system unracked and decommissioned (by onsite), update racktables with result
[] - IF DECOM: switch port configuration removed from switch once system is unracked.
[] - IF DECOM: add system to decommission tracking google sheet
[] - IF DECOM: mgmt dns entries removed.
graphite1003:
Steps for service owner:
[x] - all system services confirmed offline from production use
[x] - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
[x] - remove system from all lvs/pybal active configuration
[x] - any service group puppet/hiera/dsh config removed
[x] - remove site.pp (replace with role::spare::system if system isn't shut down immediately during this process.)
[x] - unassign service owner from this task, check off completed steps, and assign to @robh for followup on below steps.
Steps for DC-Ops:
The following steps must not be interrupted, as doing so will leave the system in an unfinished state.
**Start non-interrupt steps:**
[x] - disable puppet on host
[x] - power down host
[x] - update netbox status to Inventory (if decom) or Planned (if spare)
[x] - disable switch port
[x] - switch port assignment noted on this task (for later removal) asw-a-eqiad:ge-3/0/15
[] - remove all remaining puppet references (including role::spare)
[] - remove production dns entries
[] - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
[] - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)
**End non-interrupt steps.**
[] - system disks wiped (by onsite)
[] - IF DECOM: system unracked and decommissioned (by onsite), update racktables with result
[] - IF DECOM: switch port configuration removed from switch once system is unracked.
[] - IF DECOM: add system to decommission tracking google sheet
[] - IF DECOM: mgmt dns entries removed.
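Note on the debmonitor step: the removal command in both checklists takes the host's FQDN via ${HOST_FQDN}. As a rough sketch only, the snippet below prints (without running) that command for each of the two hosts so it can be reviewed first. The helper name debmonitor_delete_cmd and the eqiad.wmnet domain suffix are assumptions made for illustration and are not stated in this task; wmf-decommission-host is expected to handle this step in any case.

```shell
#!/bin/sh
# Hypothetical helper (not an existing tool): print the debmonitor removal
# command from the checklist for a given FQDN. Nothing is executed.
debmonitor_delete_cmd() {
    printf 'sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/%s --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key\n' "$1"
}

# ASSUMPTION: the hosts live under eqiad.wmnet; adjust if the real FQDNs differ.
for host in graphite1001 graphite1003; do
    debmonitor_delete_cmd "${host}.eqiad.wmnet"
done
```

Printing rather than running keeps this a dry run; the printed lines can be pasted on neodymium/sarin once the FQDNs are confirmed.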