This task will track the decommissioning of lvs400[1-4].ulsfo.wmnet. All four of these hosts are well out of warranty, and lvs400[567] have been purchased to replace them.
lvs400[567]'s setup is tracked via T178436, and they are ready to be placed into service.
This task is currently assigned to @bblack, since I've (@robh) been working directly with him on the decom/replacements in ulsfo. This may be retasked to others in #traffic as needed.
lvs4001:
[x] - all system services confirmed offline from production use
[x] - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
[x] - remove system from all lvs/pybal active configuration
[x] - any service group puppet/hiera/dsh config removed
[x] - remove site.pp (replace with role::spare if system isn't shut down immediately during this process.)
START NON-INTERRUPTIBLE STEPS
[x] - disable puppet on host (hosts were powered down and unracked before this step)
[x] - remove all remaining puppet references (including role::spare) https://gerrit.wikimedia.org/r/366037
[x] - power down host (host is not cabled up, so it cannot power up)
[x] - disable switch port (port was never set back up in new racks, so it's disabled)
[x] - switch port assignment noted on this task (for later removal)
[x] - remove production dns entries
[x] - puppet node clean, puppet node deactivate, salt key removed
END NON-INTERRUPTIBLE STEPS
[] - system disks wiped (by onsite)
[x] - swapped places with new systems as needed, and now resides in rack with no cabling.
[] - mgmt dns entries removed. (systems are in rack, but with no power/network/mgmt connections, because there is no storage in ulsfo and the office has no storage for us during the relocation.)
The remainder cannot happen until we are done with ALL the old CP/LVS systems, so that they can be unracked in a batch.
[] - system unracked and decommissioned (by onsite), update racktables with result
[] - switch port configuration removed from switch once system is unracked.
lvs4002:
[x] - all system services confirmed offline from production use
[x] - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
[x] - remove system from all lvs/pybal active configuration
[x] - any service group puppet/hiera/dsh config removed
[x] - remove site.pp (replace with role::spare if system isn't shut down immediately during this process.)
START NON-INTERRUPTIBLE STEPS
[x] - disable puppet on host (hosts were powered down and unracked before this step)
[x] - remove all remaining puppet references (including role::spare) https://gerrit.wikimedia.org/r/366037
[x] - power down host (host is not cabled up, so it cannot power up)
[x] - disable switch port (port was never set back up in new racks, so it's disabled)
[x] - switch port assignment noted on this task (for later removal)
[x] - remove production dns entries
[x] - puppet node clean, puppet node deactivate, salt key removed
END NON-INTERRUPTIBLE STEPS
[] - system disks wiped (by onsite)
[x] - swapped places with new systems as needed, and now resides in rack with no cabling.
[] - mgmt dns entries removed. (systems are in rack, but with no power/network/mgmt connections, because there is no storage in ulsfo and the office has no storage for us during the relocation.)
The remainder cannot happen until we are done with ALL the old CP/LVS systems, so that they can be unracked in a batch.
[] - system unracked and decommissioned (by onsite), update racktables with result
[] - switch port configuration removed from switch once system is unracked.
lvs4003:
[x] - all system services confirmed offline from production use
[x] - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
[x] - remove system from all lvs/pybal active configuration
[x] - any service group puppet/hiera/dsh config removed
[x] - remove site.pp (replace with role::spare if system isn't shut down immediately during this process.)
START NON-INTERRUPTIBLE STEPS
[x] - disable puppet on host (hosts were powered down and unracked before this step)
[x] - remove all remaining puppet references (including role::spare) https://gerrit.wikimedia.org/r/366037
[x] - power down host (host is not cabled up, so it cannot power up)
[x] - disable switch port (port was never set back up in new racks, so it's disabled)
[x] - switch port assignment noted on this task (for later removal)
[x] - remove production dns entries
[x] - puppet node clean, puppet node deactivate, salt key removed
END NON-INTERRUPTIBLE STEPS
[] - system disks wiped (by onsite)
[x] - swapped places with new systems as needed, and now resides in rack with no cabling.
[] - mgmt dns entries removed. (systems are in rack, but with no power/network/mgmt connections, because there is no storage in ulsfo and the office has no storage for us during the relocation.)
The remainder cannot happen until we are done with ALL the old CP/LVS systems, so that they can be unracked in a batch.
[] - system unracked and decommissioned (by onsite), update racktables with result
[] - switch port configuration removed from switch once system is unracked.
lvs4004:
[x] - all system services confirmed offline from production use
[x] - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
[x] - remove system from all lvs/pybal active configuration
[x] - any service group puppet/hiera/dsh config removed
[x] - remove site.pp (replace with role::spare if system isn't shut down immediately during this process.)
START NON-INTERRUPTIBLE STEPS
[x] - disable puppet on host (hosts were powered down and unracked before this step)
[x] - remove all remaining puppet references (including role::spare) https://gerrit.wikimedia.org/r/366037
[x] - power down host (host is not cabled up, so it cannot power up)
[x] - disable switch port (port was never set back up in new racks, so it's disabled)
[x] - switch port assignment noted on this task (for later removal)
[x] - remove production dns entries
[x] - puppet node clean, puppet node deactivate, salt key removed
END NON-INTERRUPTIBLE STEPS
[] - system disks wiped (by onsite)
[x] - swapped places with new systems as needed, and now resides in rack with no cabling.
[] - mgmt dns entries removed. (systems are in rack, but with no power/network/mgmt connections, because there is no storage in ulsfo and the office has no storage for us during the relocation.)
The remainder cannot happen until we are done with ALL the old CP/LVS systems, so that they can be unracked in a batch.
[] - system unracked and decommissioned (by onsite), update racktables with result
[] - switch port configuration removed from switch once system is unracked.
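
For reference, the "puppet node clean, puppet node deactivate, salt key removed" step was the same for all four hosts. A rough sketch of generating those commands from the lvs400[1-4] host range is below. The helper names (expand_hosts, cleanup_commands) are hypothetical, and the exact invocations used on the puppetmaster/salt master are an assumption; the script only prints the commands and runs nothing.

```python
import re


def expand_hosts(pattern):
    """Expand one single-digit bracket range, e.g. 'lvs400[1-4].ulsfo.wmnet'."""
    match = re.fullmatch(r"(.*)\[(\d)-(\d)\](.*)", pattern)
    if match is None:
        # No bracket range present: treat the pattern as a single host name.
        return [pattern]
    prefix, start, end, suffix = match.groups()
    return [f"{prefix}{n}{suffix}" for n in range(int(start), int(end) + 1)]


def cleanup_commands(host):
    """Return the per-host cleanup commands (assumed invocations, not verified)."""
    return [
        f"puppet node clean {host}",
        f"puppet node deactivate {host}",
        f"salt-key -d {host}",
    ]


if __name__ == "__main__":
    # Print the commands for review; nothing is executed.
    for host in expand_hosts("lvs400[1-4].ulsfo.wmnet"):
        for command in cleanup_commands(host):
            print(command)
```

Printing rather than executing keeps this safe to run anywhere; the output can be reviewed and pasted on the appropriate master by whoever owns the step.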