Q2:rack/setup/install/decom eqsin: unified decommission task
Closed, Resolved (Public)

Description

This task will track the decommission-hardware of all old servers at eqsin.

cp

cp5001

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5002

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5003

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5004

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5005

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5006

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5007

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5008

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5009

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5010

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5011

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5012

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5013

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5014

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5015

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5016

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

ganeti

ganeti5001

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

ganeti5002

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

ganeti5003

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

lvs

lvs5001

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

lvs5002

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

lvs5003

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

dns

dns5001

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

dns5002

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp, replace with role(spare::system) recommended to ensure services offline but not 100% required as long as the decom script is IMMEDIATELY run below.
  • - login to cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and run homer.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.
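
The cookbook step in each checklist above takes a host FQDN and a task ID. As a rough illustration only, the sketch below builds that command line for a batch of hosts without executing anything. The helper name `decommission_command` is hypothetical. The example FQDNs and the task ID T323830 are taken from the event timeline of this task, and the command form is the one quoted in the checklist.

```python
import shlex
from typing import List


def decommission_command(fqdn: str, task: str) -> str:
    """Return the checklist's cookbook invocation as a shell-safe string.

    Hypothetical helper: it only formats the command, it never runs it.
    """
    args: List[str] = ["cookbook", "sre.hosts.decommission", fqdn, "-t", task]
    return shlex.join(args)


if __name__ == "__main__":
    # Example hosts and task ID as they appear in this task's timeline.
    for host in ("cp5002.eqsin.wmnet", "cp5007.eqsin.wmnet"):
        print(decommission_command(host, "T323830"))
```

Printing the commands first makes it easy to review the host list before anything is run on the cumin host. Note that the dns hosts in the timeline use the `wikimedia.org` domain rather than `eqsin.wmnet`, so the FQDN should be checked per host.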

Details

Project                  Branch      Lines +/-  Subject
operations/puppet        production  +0 -1
operations/homer/public  master      +0 -1
operations/puppet        production  +3 -8
operations/homer/public  master      +0 -1
operations/puppet        production  +2 -0
operations/puppet        production  +2 -7
operations/puppet        production  +1 -0
operations/puppet        production  +0 -12
operations/homer/public  master      +0 -1
operations/dns           master      +1 -1
operations/puppet        production  +1 -5
operations/puppet        production  +1 -5
operations/puppet        production  +2 -10
operations/puppet        production  +2 -10
operations/homer/public  master      +0 -1
operations/puppet        production  +2 -7
operations/puppet        production  +1 -0
operations/puppet        production  +0 -5
operations/homer/public  master      +0 -1
operations/dns           master      +1 -1
operations/puppet        production  +1 -5
operations/puppet        production  +2 -10
operations/puppet        production  +2 -10
operations/puppet        production  +2 -10
operations/puppet        production  +2 -11

Related Objects

Status    Subtype  Assigned  Task
Resolved           ssingh
Resolved           RobH

Event Timeline

ssingh triaged this task as Medium priority. (Nov 25 2022, 3:26 PM)
ssingh updated the task description.
ssingh updated the task description.

Change 861439 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5002, cp5007: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/861439

Change 861440 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5003, cp5008: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/861440

Change 861441 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5004, cp5009: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/861441

Change 861442 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5005, cp5010: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/861442

Change 861443 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5006: decommission host (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/861443

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: cp[5002,5007].eqsin.wmnet

  • cp5002.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • cp5007.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
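
The cookbook log above names its targets in bracket notation, `cp[5002,5007].eqsin.wmnet`. For readers unfamiliar with it, here is a simplified sketch of how such an expression expands into individual FQDNs. It is not the parser the cookbook actually uses: the function name `expand_hosts` is hypothetical, and it handles only a single bracket group with comma-separated numbers and, as an assumption, dash ranges.

```python
import re
from typing import List


def expand_hosts(expr: str) -> List[str]:
    """Expand one bracket group, e.g. 'cp[5002,5007].eqsin.wmnet'.

    Simplified illustration only: one bracket group, comma lists,
    and numeric dash ranges (the range syntax is an assumption).
    """
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", expr)
    if match is None:
        # No bracket group: the expression is already a single host.
        return [expr]
    prefix, body, suffix = match.groups()
    hosts: List[str] = []
    for part in body.split(","):
        start, _, end = part.strip().partition("-")
        end = end or start
        width = len(start)  # keep zero padding if the input had any
        for number in range(int(start), int(end) + 1):
            hosts.append(f"{prefix}{number:0{width}d}{suffix}")
    return hosts


if __name__ == "__main__":
    print(expand_hosts("cp[5002,5007].eqsin.wmnet"))
```

Expanding the expression by hand like this is a quick way to check a batch against the per-host checklists in the description before ticking items off.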

Change 861439 merged by Ssingh:

[operations/puppet@production] cp5002, cp5007: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/861439

Change 861440 merged by Ssingh:

[operations/puppet@production] cp5003, cp5008: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/861440

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: cp[5003,5008].eqsin.wmnet

  • cp5003.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • cp5008.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 861441 merged by Ssingh:

[operations/puppet@production] cp5004, cp5009: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/861441

Change 862316 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] hiera: decommission dns5001

https://gerrit.wikimedia.org/r/862316

Change 862318 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/dns@master] ntp/eqsin: move to dns5002

https://gerrit.wikimedia.org/r/862318

Change 862321 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/homer/public@master] sites.yaml: remove dns5001 from anycast_neighbors

https://gerrit.wikimedia.org/r/862321

Change 862318 merged by Ssingh:

[operations/dns@master] ntp/eqsin: move to dns5002

https://gerrit.wikimedia.org/r/862318

Mentioned in SAL (#wikimedia-operations) [2022-11-30T19:37:58Z] <sukhe> running authdns-update for Gerrit: 862318 (T323830)

Change 862946 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] lvs5001: set profile::pybal::bgp to no

https://gerrit.wikimedia.org/r/862946

Change 862321 merged by jenkins-bot:

[operations/homer/public@master] sites.yaml: remove dns5001 from anycast_neighbors

https://gerrit.wikimedia.org/r/862321

Change 862316 merged by Ssingh:

[operations/puppet@production] hiera: decommission dns5001

https://gerrit.wikimedia.org/r/862316

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: dns5001.wikimedia.org

  • dns5001.wikimedia.org (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 862946 merged by Ssingh:

[operations/puppet@production] lvs5001: set profile::pybal::bgp to no

https://gerrit.wikimedia.org/r/862946

Change 863382 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] lvs5004: set as high-traffic1 primary LVS and remove lvs5001 (decomm)

https://gerrit.wikimedia.org/r/863382

Change 863383 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/homer/public@master] sites.yaml: remove decommissioned host lvs5001

https://gerrit.wikimedia.org/r/863383

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: lvs5001.eqsin.wmnet

  • lvs5001.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 863383 merged by Ssingh:

[operations/homer/public@master] sites.yaml: remove decommissioned host lvs5001

https://gerrit.wikimedia.org/r/863383

Change 863382 merged by Ssingh:

[operations/puppet@production] lvs5004: set as high-traffic1 primary LVS and remove lvs5001 (decomm)

https://gerrit.wikimedia.org/r/863382

Change 864771 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5011, cp5013: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/864771

Change 864771 merged by Ssingh:

[operations/puppet@production] cp5011, cp5013: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/864771

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: cp[5011,5013].eqsin.wmnet

  • cp5011.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • cp5013.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
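
The cookbook run above targets `cp[5011,5013].eqsin.wmnet`, a bracketed host expression that stands for two FQDNs. As a rough illustration of how such an expression expands, here is a minimal Python sketch. The function name `expand_hosts` is hypothetical, it handles only a single bracket group, and the `start-end` range form is an assumption that does not appear in this task; it is not the parser the cookbook tooling actually uses.

```python
import re


def expand_hosts(expr: str) -> list[str]:
    """Expand one bracket group, e.g. 'cp[5011,5013].eqsin.wmnet'."""
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", expr)
    if match is None:
        # No bracket group: treat the expression as a single host.
        return [expr]
    prefix, body, suffix = match.groups()
    hosts = []
    for item in body.split(","):
        if "-" in item:
            # Assumed range form 'start-end', inclusive on both ends.
            start, end = item.split("-", 1)
            width = len(start)
            for number in range(int(start), int(end) + 1):
                hosts.append(f"{prefix}{number:0{width}d}{suffix}")
        else:
            hosts.append(f"{prefix}{item}{suffix}")
    return hosts


print(expand_hosts("cp[5011,5013].eqsin.wmnet"))
```

Expanding the expression before a run is a cheap way to double-check that only the intended hosts are about to be wiped and powered off.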

Change 864775 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5012, cp5014: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/864775

Change 864775 merged by Ssingh:

[operations/puppet@production] cp5012, cp5014: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/864775

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: cp[5012,5014].eqsin.wmnet

  • cp5012.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • cp5014.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 864785 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5015: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/864785

Change 864785 merged by Ssingh:

[operations/puppet@production] cp5015: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/864785

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: cp5015.eqsin.wmnet

  • cp5015.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 864789 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5016: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/864789

Change 864789 merged by Ssingh:

[operations/puppet@production] cp5016: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/864789

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: cp5016.eqsin.wmnet

  • cp5016.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
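
Each cookbook summary in this task follows the same layout: a host line with a status in parentheses, followed by indented step bullets. For anyone who wants to review many of these summaries at once, the sketch below parses that layout into a dictionary. The names `parse_summary`, `HOST_LINE`, and `STEP_LINE` are hypothetical, and the sample text is a shortened excerpt rather than a full run; this is not part of the cookbook itself.

```python
import re

# Shortened sample in the same layout as the summaries posted on this task.
SAMPLE = """\
  • cp5012.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Powered off
  • cp5014.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Removed from DebMonitor
"""

# A host line is a bullet, one token (the FQDN), and a status in parentheses.
HOST_LINE = re.compile(r"^\s*•\s+(\S+)\s+\((\w+)\)\s*$")
# Any other bullet is treated as a step belonging to the latest host.
STEP_LINE = re.compile(r"^\s*•\s+(.+?)\s*$")


def parse_summary(text: str) -> dict:
    """Map each host to its reported status and list of completed steps."""
    results = {}
    current = None
    for line in text.splitlines():
        host = HOST_LINE.match(line)
        if host:
            current = host.group(1)
            results[current] = {"status": host.group(2), "steps": []}
        elif current is not None:
            step = STEP_LINE.match(line)
            if step:
                results[current]["steps"].append(step.group(1))
    return results


for fqdn, info in parse_summary(SAMPLE).items():
    print(fqdn, info["status"], len(info["steps"]))
```

With the parsed form it becomes straightforward to compare hosts, for example to spot one whose run skipped the "Powered off" step.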

Change 865605 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/dns@master] ntp/eqsin: move to dns5004

https://gerrit.wikimedia.org/r/865605

Change 865605 merged by Ssingh:

[operations/dns@master] ntp/eqsin: move to dns5004

https://gerrit.wikimedia.org/r/865605

Change 865610 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] hiera: decommission dns5002

https://gerrit.wikimedia.org/r/865610

Change 865611 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/homer/public@master] sites.yaml: remove dns5002 from anycast_neighbors

https://gerrit.wikimedia.org/r/865611

Change 865611 merged by jenkins-bot:

[operations/homer/public@master] sites.yaml: remove dns5002 from anycast_neighbors

https://gerrit.wikimedia.org/r/865611

Change 865610 merged by Ssingh:

[operations/puppet@production] hiera: decommission dns5002

https://gerrit.wikimedia.org/r/865610

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: dns5002.wikimedia.org

  • dns5002.wikimedia.org (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 865687 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] lvs5002: set profile::pybal::bgp to no

https://gerrit.wikimedia.org/r/865687

Change 865687 merged by Ssingh:

[operations/puppet@production] lvs5002: set profile::pybal::bgp to no

https://gerrit.wikimedia.org/r/865687

Change 865701 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] lvs5005: set as high-traffic2 primary LVS and remove lvs5002 (decomm)

https://gerrit.wikimedia.org/r/865701

Change 865712 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/homer/public@master] sites.yaml: remove decommissioned host lvs5002

https://gerrit.wikimedia.org/r/865712

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: lvs5002.eqsin.wmnet

  • lvs5002.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 865712 merged by Ssingh:

[operations/homer/public@master] sites.yaml: remove decommissioned host lvs5002

https://gerrit.wikimedia.org/r/865712

Change 865701 merged by Ssingh:

[operations/puppet@production] lvs5005: set as high-traffic2 primary LVS and remove lvs5002 (decomm)

https://gerrit.wikimedia.org/r/865701

Change 865720 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] hiera: lvs5003: bump bgp_med to 150

https://gerrit.wikimedia.org/r/865720

Change 865720 merged by Ssingh:

[operations/puppet@production] hiera: lvs5003: bump bgp_med to 150

https://gerrit.wikimedia.org/r/865720

Change 865742 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] lvs5006: set as secondary LVS and remove lvs5003 (decomm)

https://gerrit.wikimedia.org/r/865742

Change 865773 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/homer/public@master] sites.yaml: remove decommissioned host lvs5003

https://gerrit.wikimedia.org/r/865773

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: lvs5003.eqsin.wmnet

  • lvs5003.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 865742 merged by Ssingh:

[operations/puppet@production] lvs5006: set as secondary LVS and remove lvs5003 (decomm)

https://gerrit.wikimedia.org/r/865742

Change 865773 merged by Ssingh:

[operations/homer/public@master] sites.yaml: remove decommissioned host lvs5003

https://gerrit.wikimedia.org/r/865773

Change 866440 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] install_server: remove obsolete cp hosts partman config

https://gerrit.wikimedia.org/r/866440

Change 866440 merged by Ssingh:

[operations/puppet@production] install_server: remove obsolete cp hosts partman config

https://gerrit.wikimedia.org/r/866440
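
Before closing a task like this one, it can help to confirm that every Gerrit change announced as uploaded also has a matching merge notice. The sketch below does that over the notice lines in the format used on this task. The function name `unmerged_changes` is hypothetical, and change 999999 in the sample is an invented entry included only to show an unmerged result; it is not a real change on this task.

```python
import re

# Sample notices; 999999 is a made-up change used only for illustration.
LOG = """\
Change 865701 had a related patch set uploaded (by Ssingh; author: Ssingh):
Change 865712 had a related patch set uploaded (by Ssingh; author: Ssingh):
Change 865712 merged by Ssingh:
Change 865701 merged by Ssingh:
Change 999999 had a related patch set uploaded (by Example; author: Example):
"""

EVENT = re.compile(r"^Change (\d+) (had a related patch set uploaded|merged)\b")


def unmerged_changes(log: str) -> list[str]:
    """Return change numbers that were uploaded but never reported merged."""
    uploaded = []
    merged = set()
    for line in log.splitlines():
        event = EVENT.match(line)
        if event is None:
            continue
        number, kind = event.groups()
        if kind == "merged":
            merged.add(number)
        elif number not in uploaded:
            uploaded.append(number)
    # Preserve upload order so the result reads like the task timeline.
    return [number for number in uploaded if number not in merged]


print(unmerged_changes(LOG))
```

Merge order does not matter to the check, which suits this task, where the homer and puppet changes were sometimes merged in the reverse of their upload order.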

RobH claimed this task.
RobH added a subscriber: RobH.

Confirmed that all servers on this task are decommissioned in Netbox, removed the cable assignments, and updated DNS to remove the mgmt interface entries.