
Q2:rack/setup/install/decom eqsin: unified decommission task
Closed, Resolved | Public

Description

This task will track the decommission-hardware of all old servers at eqsin.

cp

cp5001

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5002

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5003

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5004

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5005

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5006

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5007

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5008

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5009

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5010

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5011

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5012

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5013

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5014

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5015

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

cp5016

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

ganeti

ganeti5001

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

ganeti5002

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

ganeti5003

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

lvs

lvs5001

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

lvs5002

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

lvs5003

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

dns

dns5001

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.

dns5002

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place. (likely done by script)
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove the site.pp entry and replace it with role(spare::system). This is recommended to ensure services are offline, but not 100% required as long as the decom cookbook below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This performs: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - mgmt dns entries removed.
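
The cookbook step in the checklists above takes a host FQDN and a Phabricator task ID. As a minimal sketch, the helper below only assembles the command lines for a batch of hosts so they can be reviewed before anyone runs them on a cumin host; it executes nothing. The name `decommission_commands` is hypothetical, the task ID T323830 is taken from the SAL entry in this task's timeline, and the FQDNs are the ones the cookbook reports for these hosts.

```python
import shlex


def decommission_commands(fqdns, task_id):
    """Return one `cookbook sre.hosts.decommission` command line per host.

    Hypothetical helper: it only builds strings in the form given by the
    checklist (`cookbook sre.hosts.decommission <host fqdn> -t <phab task>`).
    """
    commands = []
    for fqdn in fqdns:
        argv = ["cookbook", "sre.hosts.decommission", fqdn, "-t", task_id]
        # shlex.join quotes any argument that would need it in a shell.
        commands.append(shlex.join(argv))
    return commands


if __name__ == "__main__":
    for line in decommission_commands(
        ["cp5002.eqsin.wmnet", "cp5007.eqsin.wmnet"], "T323830"
    ):
        print(line)
```

Printing the commands first keeps a destructive step (disk wipe and power-off) behind a manual review.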

Details

Repo | Branch | Lines +/-
operations/puppet | production | +0 -1
operations/homer/public | master | +0 -1
operations/puppet | production | +3 -8
operations/homer/public | master | +0 -1
operations/puppet | production | +2 -0
operations/puppet | production | +2 -7
operations/puppet | production | +1 -0
operations/puppet | production | +0 -12
operations/homer/public | master | +0 -1
operations/dns | master | +1 -1
operations/puppet | production | +1 -5
operations/puppet | production | +1 -5
operations/puppet | production | +2 -10
operations/puppet | production | +2 -10
operations/homer/public | master | +0 -1
operations/puppet | production | +2 -7
operations/puppet | production | +1 -0
operations/puppet | production | +0 -5
operations/homer/public | master | +0 -1
operations/dns | master | +1 -1
operations/puppet | production | +1 -5
operations/puppet | production | +2 -10
operations/puppet | production | +2 -10
operations/puppet | production | +2 -10
operations/puppet | production | +2 -11

Related Objects

Status | Assigned
Resolved | ssingh
Resolved | RobH

Event Timeline

ssingh triaged this task as Medium priority. Nov 25 2022, 3:26 PM
ssingh updated the task description.
ssingh updated the task description.

Change 861439 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5002, cp5007: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/861439

Change 861440 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5003, cp5008: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/861440

Change 861441 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5004, cp5009: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/861441

Change 861442 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5005, cp5010: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/861442

Change 861443 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5006: decommission host (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/861443

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: cp[5002,5007].eqsin.wmnet

  • cp5002.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • cp5007.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
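
The cookbook reports its targets in a compact bracket form (`cp[5002,5007].eqsin.wmnet`). As an illustration only, the sketch below expands the simple comma-list and `start-end` range forms into individual FQDNs, which can help when ticking off the per-host checklists in the description. `expand_hosts` is a hypothetical helper, not part of Cumin or the cookbook, and it handles a single bracket group only.

```python
import re


def expand_hosts(expr):
    """Expand 'cp[5002,5007].eqsin.wmnet' into a list of FQDNs.

    Hypothetical helper: supports one bracket group containing
    comma-separated numbers and simple 'start-end' ranges.
    """
    match = re.fullmatch(r"([^\[\]]*)\[([0-9,\-]+)\]([^\[\]]*)", expr)
    if match is None:
        return [expr]  # no bracket group: treat as a single host
    prefix, body, suffix = match.groups()
    hosts = []
    for part in body.split(","):
        if "-" in part:
            start, end = part.split("-")
            width = len(start)  # preserve any zero padding
            numbers = [str(n).zfill(width) for n in range(int(start), int(end) + 1)]
        else:
            numbers = [part]
        hosts.extend(f"{prefix}{number}{suffix}" for number in numbers)
    return hosts


if __name__ == "__main__":
    print(expand_hosts("cp[5002,5007].eqsin.wmnet"))
```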

Change 861439 merged by Ssingh:

[operations/puppet@production] cp5002, cp5007: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/861439

Change 861440 merged by Ssingh:

[operations/puppet@production] cp5003, cp5008: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/861440

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: cp[5003,5008].eqsin.wmnet

  • cp5003.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • cp5008.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
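
Every host in these cookbook reports is marked WARN. As a rough sketch for spotting which reported step reads as not completed, the code below parses a pasted report into per-host step lists and flags steps by keyword. The function names are hypothetical, and the keyword match ("unable to", "not found") is a heuristic assumption, not the cookbook's own logic for setting the status.

```python
import re

# A host line looks like "  • cp5003.eqsin.wmnet (WARN)".
HOST_LINE = re.compile(r"^\s*•\s*(\S+)\s+\((\w+)\)\s*$")


def parse_report(text):
    """Map each host to (status, [steps]) from a pasted cookbook report."""
    report = {}
    current = None
    for raw in text.splitlines():
        host = HOST_LINE.match(raw)
        if host:
            current = host.group(1)
            report[current] = (host.group(2), [])
        elif current and raw.strip().startswith("•"):
            # Any other bullet belongs to the most recent host line.
            report[current][1].append(raw.strip().lstrip("•").strip())
    return report


def steps_needing_attention(report):
    """Return {host: [steps]} for steps whose wording suggests a skip."""
    flagged = {}
    for host, (_status, steps) in report.items():
        hits = [s for s in steps if "unable to" in s or "not found" in s]
        if hits:
            flagged[host] = hits
    return flagged
```

Run against the reports in this timeline, the only step such a filter would pick up is the management-interface downtime line.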

Change 861441 merged by Ssingh:

[operations/puppet@production] cp5004, cp5009: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/861441

Change 862316 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] hiera: decommission dns5001

https://gerrit.wikimedia.org/r/862316

Change 862318 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/dns@master] ntp/eqsin: move to dns5002

https://gerrit.wikimedia.org/r/862318

Change 862321 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/homer/public@master] sites.yaml: remove dns5001 from anycast_neighbors

https://gerrit.wikimedia.org/r/862321

Change 862318 merged by Ssingh:

[operations/dns@master] ntp/eqsin: move to dns5002

https://gerrit.wikimedia.org/r/862318

Mentioned in SAL (#wikimedia-operations) [2022-11-30T19:37:58Z] <sukhe> running authdns-update for Gerrit: 862318 (T323830)

Change 862946 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] lvs5001: set profile::pybal::bgp to no

https://gerrit.wikimedia.org/r/862946

Change 862321 merged by jenkins-bot:

[operations/homer/public@master] sites.yaml: remove dns5001 from anycast_neighbors

https://gerrit.wikimedia.org/r/862321

Change 862316 merged by Ssingh:

[operations/puppet@production] hiera: decommission dns5001

https://gerrit.wikimedia.org/r/862316

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: dns5001.wikimedia.org

  • dns5001.wikimedia.org (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 862946 merged by Ssingh:

[operations/puppet@production] lvs5001: set profile::pybal::bgp to no

https://gerrit.wikimedia.org/r/862946

Change 863382 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] lvs5004: set as high-traffic1 primary LVS and remove lvs4006 (decomm)

https://gerrit.wikimedia.org/r/863382

Change 863383 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/homer/public@master] sites.yaml: remove decommissioned host lvs5001

https://gerrit.wikimedia.org/r/863383

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: lvs5001.eqsin.wmnet

  • lvs5001.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 863383 merged by Ssingh:

[operations/homer/public@master] sites.yaml: remove decommissioned host lvs5001

https://gerrit.wikimedia.org/r/863383

Change 863382 merged by Ssingh:

[operations/puppet@production] lvs5004: set as high-traffic1 primary LVS and remove lvs5001 (decomm)

https://gerrit.wikimedia.org/r/863382

Change 864771 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5011, cp5013: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/864771

Change 864771 merged by Ssingh:

[operations/puppet@production] cp5011, cp5013: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/864771

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: cp[5011,5013].eqsin.wmnet

  • cp5011.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • cp5013.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
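
The host expression `cp[5011,5013].eqsin.wmnet` in the cookbook line above names two hosts in compact bracket form. The sketch below shows how such an expression expands; it is an illustration only, assuming a single bracket group with comma-separated items or simple numeric ranges. `expand_hosts` is a hypothetical helper, not the parser the cookbook actually uses.

```python
import re


def expand_hosts(expr: str) -> list[str]:
    """Expand one bracket group such as 'cp[5011,5013].eqsin.wmnet'.

    Handles comma-separated items and simple numeric ranges ('5011-5013').
    An expression without brackets is returned as a single host.
    """
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", expr)
    if match is None:
        return [expr]
    prefix, body, suffix = match.groups()
    hosts = []
    for item in body.split(","):
        if "-" in item:
            start, end = item.split("-", 1)
            width = len(start)  # keep zero padding, if any
            for n in range(int(start), int(end) + 1):
                hosts.append(f"{prefix}{n:0{width}d}{suffix}")
        else:
            hosts.append(f"{prefix}{item}{suffix}")
    return hosts


print(expand_hosts("cp[5011,5013].eqsin.wmnet"))
```

This matches the report, which lists cp5011.eqsin.wmnet and cp5013.eqsin.wmnet as separate entries.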

Change 864775 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5012, cp5014: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/864775

Change 864775 merged by Ssingh:

[operations/puppet@production] cp5012, cp5014: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/864775

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: cp[5012,5014].eqsin.wmnet

  • cp5012.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • cp5014.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 864785 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5015: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/864785

Change 864785 merged by Ssingh:

[operations/puppet@production] cp5015: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/864785

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: cp5015.eqsin.wmnet

  • cp5015.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 864789 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] cp5016: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/864789

Change 864789 merged by Ssingh:

[operations/puppet@production] cp5016: decommission hosts (eqsin hardware refresh)

https://gerrit.wikimedia.org/r/864789

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: cp5016.eqsin.wmnet

  • cp5016.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 865605 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/dns@master] ntp/eqsin: move to dns5004

https://gerrit.wikimedia.org/r/865605

Change 865605 merged by Ssingh:

[operations/dns@master] ntp/eqsin: move to dns5004

https://gerrit.wikimedia.org/r/865605

Change 865610 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] hiera: decommission dns5002

https://gerrit.wikimedia.org/r/865610

Change 865611 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/homer/public@master] sites.yaml: remove dns5002 from anycast_neighbors

https://gerrit.wikimedia.org/r/865611

Change 865611 merged by jenkins-bot:

[operations/homer/public@master] sites.yaml: remove dns5002 from anycast_neighbors

https://gerrit.wikimedia.org/r/865611

Change 865610 merged by Ssingh:

[operations/puppet@production] hiera: decommission dns5002

https://gerrit.wikimedia.org/r/865610

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: dns5002.wikimedia.org

  • dns5002.wikimedia.org (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 865687 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] lvs5002: set profile::pybal::bgp to no

https://gerrit.wikimedia.org/r/865687

Change 865687 merged by Ssingh:

[operations/puppet@production] lvs5002: set profile::pybal::bgp to no

https://gerrit.wikimedia.org/r/865687

Change 865701 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] lvs5005: set as high-traffic2 primary LVS and remove lvs5002 (decomm)

https://gerrit.wikimedia.org/r/865701

Change 865712 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/homer/public@master] sites.yaml: remove decommissioned host lvs5002

https://gerrit.wikimedia.org/r/865712

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: lvs5002.eqsin.wmnet

  • lvs5002.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 865712 merged by Ssingh:

[operations/homer/public@master] sites.yaml: remove decommissioned host lvs5002

https://gerrit.wikimedia.org/r/865712

Change 865701 merged by Ssingh:

[operations/puppet@production] lvs5005: set as high-traffic2 primary LVS and remove lvs5002 (decomm)

https://gerrit.wikimedia.org/r/865701

Change 865720 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] hiera: lvs5003: bump bgp_med to 150

https://gerrit.wikimedia.org/r/865720

Change 865720 merged by Ssingh:

[operations/puppet@production] hiera: lvs5003: bump bgp_med to 150

https://gerrit.wikimedia.org/r/865720

Change 865742 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] lvs5006: set as secondary LVS and remove lvs5003 (decomm)

https://gerrit.wikimedia.org/r/865742

Change 865773 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/homer/public@master] sites.yaml: remove decommissioned host lvs5003

https://gerrit.wikimedia.org/r/865773

cookbooks.sre.hosts.decommission executed by sukhe@cumin2002 for hosts: lvs5003.eqsin.wmnet

  • lvs5003.eqsin.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 865742 merged by Ssingh:

[operations/puppet@production] lvs5006: set as secondary LVS and remove lvs5003 (decomm)

https://gerrit.wikimedia.org/r/865742

Change 865773 merged by Ssingh:

[operations/homer/public@master] sites.yaml: remove decommissioned host lvs5003

https://gerrit.wikimedia.org/r/865773

Change 866440 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] install_server: remove obsolete cp hosts partman config

https://gerrit.wikimedia.org/r/866440

Change 866440 merged by Ssingh:

[operations/puppet@production] install_server: remove obsolete cp hosts partman config

https://gerrit.wikimedia.org/r/866440

RobH claimed this task.
RobH subscribed.

Confirmed that all servers on this task are decommissioned in Netbox. Removed the cable assignments and updated DNS to remove the mgmt interface entries.