
Decommission elastic2001-2024
Closed, Resolved | Public

Description

Please note these servers are due back for lease return in December 2018, so decommissioning them is high priority.

Once elastic2037-2054 (T210450) are configured, we can start removing those old servers. See Server Lifecycle for details.

Steps:

  • ban the servers from the cluster
  • wait for all shards to relocate
  • follow the steps in Server Lifecycle (including adding a checklist for each server in this task).
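
The first two steps above can be sketched roughly as follows. This is only a minimal sketch, assuming the ban is applied through Elasticsearch's `cluster.routing.allocation.exclude._name` cluster setting and that node names begin with the host name. The helper names (`old_hosts`, `ban_settings`, `shards_remaining`) are hypothetical, and the snippet only builds the request body and inspects already-fetched `_cat/shards` rows; it does not contact a cluster.

```python
import json


def old_hosts(first=2001, last=2024):
    """Return the host names elastic2001 .. elastic2024 (range from the task title)."""
    return [f"elastic{n}" for n in range(first, last + 1)]


def ban_settings(hosts):
    """Build a body for PUT /_cluster/settings that excludes the given nodes.

    Assumption: node names start with the host name, hence the trailing wildcard.
    """
    pattern = ",".join(f"{h}*" for h in hosts)
    return {"transient": {"cluster.routing.allocation.exclude._name": pattern}}


def shards_remaining(cat_shards_rows, hosts):
    """Count shards still located on banned hosts.

    cat_shards_rows: parsed output of GET /_cat/shards?format=json, where each
    row carries a "node" field (None for unassigned shards).
    """
    banned = tuple(hosts)
    return sum(
        1 for row in cat_shards_rows if (row.get("node") or "").startswith(banned)
    )


if __name__ == "__main__":
    print(json.dumps(ban_settings(old_hosts()), indent=2))
```

Once `shards_remaining` reports zero for the banned hosts, relocation can be considered complete and the per-server checklist can begin.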

elastic2001

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS
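
For reference, the debmonitor removal command from the checklist can be generated per host as sketched below. The URL, certificate path and key path are taken from the checklist item; the `codfw.wmnet` domain suffix and the function name are assumptions made for illustration. The snippet only prints the commands. It does not run them, and wmf-decommission-host is noted above as already handling this step.

```python
import shlex

DEBMONITOR_HOSTS_URL = "https://debmonitor.discovery.wmnet/hosts/"
CERT = "/etc/debmonitor/ssl/cert.pem"
KEY = "/etc/debmonitor/ssl/server.key"


def debmonitor_delete_cmd(fqdn):
    """Return the curl invocation from the checklist as an argument list."""
    return [
        "sudo", "curl", "-X", "DELETE", DEBMONITOR_HOSTS_URL + fqdn,
        "--cert", CERT, "--key", KEY,
    ]


if __name__ == "__main__":
    for n in range(2001, 2025):
        # Assumed FQDN pattern; substitute the real domain for these hosts.
        fqdn = f"elastic{n}.codfw.wmnet"
        print(shlex.join(debmonitor_delete_cmd(fqdn)))
```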

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.
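
The checklist only says the SSDs must be wiped with hdparm; it does not spell out the invocation. One common approach is the ATA Secure Erase sequence, sketched here under that assumption. The function name, the temporary password and the device path are hypothetical, and the snippet only builds the command lists so they can be reviewed before anyone runs them onsite.

```python
def secure_erase_cmds(device, password="p"):
    """Return hdparm commands for an ATA Secure Erase of one drive.

    Assumption: the drive supports the ATA security feature set and is not
    reported as "frozen" by the first command.
    """
    return [
        # Inspect drive identity and security state ("not frozen" is required).
        ["hdparm", "-I", device],
        # Set a temporary user password, which enables the security feature set.
        ["hdparm", "--user-master", "u", "--security-set-pass", password, device],
        # Issue the secure erase using that password.
        ["hdparm", "--user-master", "u", "--security-erase", password, device],
    ]


if __name__ == "__main__":
    for cmd in secure_erase_cmds("/dev/sdX"):
        print(" ".join(cmd))
```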

elastic2002

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2003

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2004

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2005

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2006

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2007

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2008

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2009

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2010

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2011

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2012

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2013

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2014

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2015

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2016

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2017

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2018

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2019

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interruptible steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2020

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interrupt steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2021

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interrupt steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2022

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interrupt steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2023

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interrupt steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

elastic2024

Decommission Checklist

  • - all system services confirmed offline from production use
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role(spare::system) if system isn't shut down immediately during this process.)

START NON-INTERRUPTIBLE STEPS - please assign to @RobH for the non-interrupt steps

  • - disable puppet on host
  • - power down host
  • - update netbox status to Inventory (if decom) or Planned (if spare)
  • - disable switch port
  • - switch port assignment noted on this task (for later removal)
  • - remove all remaining puppet references (include role::spare) https://gerrit.wikimedia.org/r/478105
  • - remove production dns entries https://gerrit.wikimedia.org/r/478106
  • - puppet node clean, puppet node deactivate (handled by wmf-decommission-host)
  • - remove debmonitor entries on neodymium/sarin: sudo curl -X DELETE https://debmonitor.discovery.wmnet/hosts/${HOST_FQDN} --cert /etc/debmonitor/ssl/cert.pem --key /etc/debmonitor/ssl/server.key (handled by wmf-decommission-host)

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite) - Please note these are SSD systems, and must be wiped using the hdparm utility.
  • - system unracked and decommissioned (by onsite), update netbox with result
  • - switch port configuration removed from switch once system is unracked.
  • - add system to decommission tracking google sheet
  • - mgmt dns entries removed.
  • - update @RobH when all elastic are done so we can move forward with lease return.

Event Timeline

jijiki triaged this task as Normal priority.Dec 4 2018, 10:52 PM

Change 477761 had a related patch set uploaded (by Mathew.onipe; owner: Mathew.onipe):
[operations/puppet@production] elasticsearch: Remove elastic2001-elastic2024 from codfw cluster

https://gerrit.wikimedia.org/r/477761

Mathew.onipe updated the task description. (Show Details)Dec 6 2018, 10:02 AM
Mathew.onipe added a subscriber: RobH.
Mathew.onipe updated the task description. (Show Details)Dec 6 2018, 10:10 AM
Mathew.onipe updated the task description. (Show Details)
Mathew.onipe updated the task description. (Show Details)Dec 6 2018, 10:14 AM

Mentioned in SAL (#wikimedia-operations) [2018-12-06T12:56:36Z] <gehel> depooling and shutting down elasticsearch on elastic2001-2024 - T211023

Change 477997 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] elasticsearch: move master eligible nodes to new servers

https://gerrit.wikimedia.org/r/477997

Change 477997 merged by Gehel:
[operations/puppet@production] elasticsearch: move master eligible nodes to new servers

https://gerrit.wikimedia.org/r/477997

Mentioned in SAL (#wikimedia-operations) [2018-12-06T15:08:50Z] <gehel> restartign new elasticsearch masters on codfw - T211023

Mentioned in SAL (#wikimedia-operations) [2018-12-06T16:17:27Z] <gehel> shutting down elasticsearch on elastic2001-2024 (second try) - T211023

Change 478029 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] elasticsearch: purge main elasticsearch configuration directory

https://gerrit.wikimedia.org/r/478029

Gehel added a comment.Dec 6 2018, 5:02 PM

We need to reassign some nodes between the psi and omega cluster, as removing old nodes would leave the clusters unbalanced between rows.

This will require some cleanup:

  • configuration should be cleaned up by puppet automatically
  • service elasticsearch_5@production-search-{psi|omega}-codfw needs to be removed
  • cron elasticsearch-production-search-{psi|omega}-codfw-gc-log-cleanup needs to be removed
  • data directory /srv/elasticsearch/production-search-{psi|omega}-codfw/ needs to be removed
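
The {psi|omega} placeholders above expand to one set of leftovers per cluster. A minimal sketch that only prints the concrete names, assuming the naming pattern in the list holds for both clusters; the function name cleanup_targets is hypothetical and nothing is removed:

```shell
#!/bin/sh
# Hypothetical helper: print the leftovers named in the list above for one
# cluster ("psi" or "omega"). It only prints names; it removes nothing.
cleanup_targets() {
    cluster="$1"
    name="production-search-${cluster}-codfw"
    echo "service: elasticsearch_5@${name}"
    echo "cron: elasticsearch-${name}-gc-log-cleanup"
    echo "datadir: /srv/elasticsearch/${name}/"
}

# List the targets for both clusters.
for c in psi omega; do
    cleanup_targets "$c"
done
```

Which of the two clusters to clean on a given host depends on the reassignment, so the printed list is meant to be reviewed by hand before anything is deleted.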

Change 478029 merged by Gehel:
[operations/puppet@production] elasticsearch: purge main elasticsearch configuration directory

https://gerrit.wikimedia.org/r/478029

Mentioned in SAL (#wikimedia-operations) [2018-12-06T19:32:53Z] <gehel> shutting down elasticsearch on elastic2001-2024 (third time is a charm) - T211023

Change 477761 merged by Gehel:
[operations/puppet@production] elasticsearch: Remove elastic2001-elastic2024 from codfw cluster

https://gerrit.wikimedia.org/r/477761

Gehel updated the task description. (Show Details)Dec 6 2018, 8:29 PM
Gehel reassigned this task from Gehel to RobH.
Gehel added a subscriber: Papaul.

elastic2001-2024 are ready for decommission. They have been taken out of the cluster and can be shut down whenever you want (cc @Papaul)

wmf-decommission-host was executed by robh for elastic2001.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2002.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor
RobH added a comment.Dec 6 2018, 9:29 PM
This comment was removed by RobH.

wmf-decommission-host was executed by robh for elastic2003.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2004.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2005.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2006.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2007.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2008.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2009.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2010.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2011.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2012.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2013.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2014.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2015.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2016.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2017.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2018.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2019.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2020.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2021.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2022.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2023.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor

wmf-decommission-host was executed by robh for elastic2024.codfw.wmnet and performed the following actions:

  • Revoked Puppet certificate
  • Removed from PuppetDB
  • Downtimed host on Icinga
  • Downtimed mgmt interface on Icinga
  • Removed from DebMonitor
RobH added a comment.EditedDec 6 2018, 9:43 PM

All systems have had puppet disabled and been powered off. I'm working through disabling and listing the network ports now.

asw-a-codfw:ge-5/0/8 elastic2001
asw-a-codfw:ge-5/0/20 elastic2002
asw-a-codfw:ge-5/0/21 elastic2003
asw-a-codfw:ge-8/0/3 elastic2004
asw-a-codfw:ge-8/0/4 elastic2005
asw-a-codfw:ge-8/0/5 elastic2006

All have been disabled, show | compare of the disable commit:

robh@asw-a-codfw# show | compare 
[edit interfaces interface-range vlan-private1-a-codfw]
-    member ge-5/0/8;
[edit interfaces interface-range vlan-private1-a-codfw]
-    member-range ge-5/0/20 to ge-5/0/21;
-    member-range ge-8/0/3 to ge-8/0/5;
[edit interfaces interface-range disabled]
     member ge-5/0/4 { ... }
+    member ge-5/0/8;
+    member ge-5/0/20;
+    member ge-5/0/21;
+    member ge-8/0/3;
+    member ge-8/0/4;
+    member ge-8/0/5;

asw-b-codfw: ge-5/0/26 elastic2007
asw-b-codfw: ge-5/0/27 elastic2008
asw-b-codfw: ge-5/0/28 elastic2009
asw-b-codfw: ge-8/0/5 elastic2010
asw-b-codfw: ge-8/0/6 elastic2011
asw-b-codfw: ge-8/0/7 elastic2012

[edit interfaces interface-range vlan-private1-b-codfw]
-    member-range ge-8/0/5 to ge-8/0/7;
-    member-range ge-5/0/26 to ge-5/0/28;
[edit interfaces interface-range disabled]
     member ge-5/0/35 { ... }
+    member ge-5/0/26;
+    member ge-5/0/27;
+    member ge-5/0/28;
+    member ge-8/0/5;
+    member ge-8/0/6;
+    member ge-8/0/7;

asw-c-codfw: ge-1/0/10 elastic2013
asw-c-codfw: ge-1/0/11 elastic2014
asw-c-codfw: ge-1/0/12 elastic2015
asw-c-codfw: ge-5/0/10 elastic2016
asw-c-codfw: ge-5/0/11 elastic2017
asw-c-codfw: ge-5/0/12 elastic2018

robh@asw-c-codfw# show | compare 
[edit interfaces interface-range vlan-private1-c-codfw]
-    member-range ge-5/0/10 to ge-5/0/12;
-    member-range ge-1/0/10 to ge-1/0/12;
[edit interfaces interface-range disabled]
     member ge-5/0/3 { ... }
+    member ge-1/0/10;
+    member ge-1/0/11;
+    member ge-1/0/12;
+    member ge-5/0/10;
+    member ge-5/0/11;
+    member ge-5/0/12;

asw-d-codfw: ge-1/0/0 elastic2019
asw-d-codfw: ge-1/0/1 elastic2020
asw-d-codfw: ge-1/0/2 elastic2021
asw-d-codfw: ge-5/0/0 elastic2022
asw-d-codfw: ge-5/0/1 elastic2023
asw-d-codfw: ge-5/0/2 elastic2024

robh@asw-d-codfw# show | compare 
[edit interfaces interface-range vlan-private1-d-codfw]
-    member-range ge-5/0/0 to ge-5/0/2;
-    member-range ge-1/0/0 to ge-1/0/2;
[edit interfaces interface-range disabled]
     member ge-2/0/12 { ... }
+    member ge-1/0/0;
+    member ge-1/0/1;
+    member ge-1/0/2;
+    member ge-5/0/0;
+    member ge-5/0/1;
+    member ge-5/0/2;
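
For reference, the "+ member" lines in the diffs above correspond to one Junos set command per port. A rough sketch, assuming the port list is fed in exactly as written in this comment (with or without a space after the colon); the function name disable_cmds is hypothetical, and removing the ports from the vlan-private1-* ranges is deliberately left out:

```shell
#!/bin/sh
# Hypothetical helper: read "switch:port host" or "switch: port host" lines on
# stdin and print the set command that adds each port to the "disabled"
# interface-range. The next-to-last field holds the port in both layouts.
disable_cmds() {
    awk 'NF >= 2 {
        port = $(NF - 1)
        sub(/^.*:/, "", port)   # drop a "switch:" prefix if it is attached
        print "set interfaces interface-range disabled member " port
    }'
}

# Example with two lines taken from the listing above.
printf '%s\n' \
    'asw-a-codfw:ge-5/0/8 elastic2001' \
    'asw-d-codfw: ge-1/0/0 elastic2019' | disable_cmds
```

The output is meant to be pasted into configuration mode on the matching switch and reviewed with show | compare before committing.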
RobH updated the task description. (Show Details)

Change 478105 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] decom elastic2001-2024

https://gerrit.wikimedia.org/r/478105

Change 478105 merged by RobH:
[operations/puppet@production] decom elastic2001-2024

https://gerrit.wikimedia.org/r/478105

RobH updated the task description. (Show Details)Dec 6 2018, 10:15 PM

Change 478106 had a related patch set uploaded (by RobH; owner: RobH):
[operations/dns@master] decom elastic2001-2024 production dns entries

https://gerrit.wikimedia.org/r/478106

Change 478106 merged by RobH:
[operations/dns@master] decom elastic2001-2024 production dns entries

https://gerrit.wikimedia.org/r/478106

RobH updated the task description. (Show Details)
RobH reassigned this task from RobH to Papaul.
RobH added projects: ops-codfw, decommission.

Ok, these are now ready for SSD wipe. Please note, since they are SSDs, a wipe (write zeros) won't work, and the hdparm utility must instead be used.

Example hdparm use to secure erase sda:

hdparm -I /dev/sda
hdparm --user-master u --security-set-pass pw /dev/sda
time hdparm --user-master u --security-erase pw /dev/sda
hdparm -I /dev/sda
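
One thing worth checking in the first hdparm -I output: an ATA secure erase is refused while the drive reports itself as frozen. A small sketch of that check, assuming the Security section of hdparm -I prints a "not frozen" line as it usually does; the helper names are hypothetical and the erase commands are only printed, not run:

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the hdparm -I output read from stdin
# contains a "not frozen" line (the two words are separated by whitespace).
drive_not_frozen() {
    grep -Eq 'not[[:space:]]+frozen'
}

# Hypothetical helper: print the two erase commands from the example above
# for one device. Nothing is executed (dry run).
print_erase_plan() {
    dev="$1"
    echo "hdparm --user-master u --security-set-pass pw $dev"
    echo "hdparm --user-master u --security-erase pw $dev"
}

# Usage (as root, on the host being wiped):
#   hdparm -I /dev/sda | drive_not_frozen && print_erase_plan /dev/sda
```

If the drive is frozen, nothing is printed and the frozen state has to be cleared before the erase is retried.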
RobH moved this task from Backlog to High Priority Tasks on the ops-codfw board.
RobH raised the priority of this task from Normal to High.Dec 6 2018, 10:54 PM
RobH updated the task description. (Show Details)
Papaul updated the task description. (Show Details)Dec 11 2018, 4:35 PM
Papaul updated the task description. (Show Details)Dec 11 2018, 5:28 PM
Papaul updated the task description. (Show Details)Dec 12 2018, 8:35 PM
Papaul updated the task description. (Show Details)Dec 12 2018, 8:42 PM
Papaul updated the task description. (Show Details)Dec 13 2018, 3:23 PM
papaul@asw-a-codfw# run show interfaces ge-5/0/8 descriptions 
Interface       Admin Link Description
ge-5/0/8        down  down DISABLED

papaul@asw-a-codfw# run show interfaces ge-5/0/20 descriptions   
Interface       Admin Link Description
ge-5/0/20       down  down DISABLED

papaul@asw-a-codfw# run show interfaces ge-5/0/21 descriptions    
Interface       Admin Link Description
ge-5/0/21       down  down DISABLED

papaul@asw-a-codfw# run show interfaces ge-8/0/3 descriptions 
Interface       Admin Link Description
ge-8/0/3        down  down DISABLED

papaul@asw-a-codfw# run show interfaces ge-8/0/4 descriptions    
Interface       Admin Link Description
ge-8/0/4        down  down DISABLED

papaul@asw-a-codfw# run show interfaces ge-8/0/5 descriptions    
Interface       Admin Link Description
ge-8/0/5        down  down DISABLED

papaul@asw-b-codfw# run show interfaces ge-5/0/26 descriptions 
Interface       Admin Link Description
ge-5/0/26       down  down DISABLED

papaul@asw-b-codfw> show interfaces ge-5/0/27 descriptions    
Interface       Admin Link Description
ge-5/0/27       down  down DISABLED

papaul@asw-b-codfw> show interfaces ge-5/0/28 descriptions    
Interface       Admin Link Description
ge-5/0/28       down  down DISABLED

papaul@asw-b-codfw> show interfaces ge-8/0/5 descriptions     
Interface       Admin Link Description
ge-8/0/5        down  down DISABLED

papaul@asw-b-codfw> show interfaces ge-8/0/6 descriptions    
Interface       Admin Link Description
ge-8/0/6        down  down DISABLED

papaul@asw-b-codfw> show interfaces ge-8/0/7 descriptions    
Interface       Admin Link Description
ge-8/0/7        down  down DISABLED
Papaul updated the task description. (Show Details)Dec 13 2018, 4:06 PM

Before

papaul@asw-c-codfw> show interfaces descriptions | match "ge-1/0/1[0-2]"     
ge-1/0/10       up    down elastic2013
ge-1/0/11       up    down elastic2014
ge-1/0/12       up    down elastic2015

papaul@asw-c-codfw> show interfaces descriptions | match "ge-5/0/1[0-2]"    
ge-5/0/10       up    down elastic2016
ge-5/0/11       up    down elastic2017
ge-5/0/12       up    down elastic2018

After

papaul@asw-c-codfw# run show interfaces descriptions | match "ge-1/0/1[0-2]"   
ge-1/0/10       down  down DISABLED
ge-1/0/11       down  down DISABLED
ge-1/0/12       down  down DISABLED

papaul@asw-c-codfw# run show interfaces descriptions | match "ge-5/0/1[0-2]"    
ge-5/0/10       down  down DISABLED
ge-5/0/11       down  down DISABLED
ge-5/0/12       down  down DISABLED
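
The before/after comparison above can also be checked mechanically. A minimal sketch, assuming the saved output has the same four columns as shown (interface, admin, link, description); the function name ports_not_disabled is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read "show interfaces descriptions" output on stdin and
# print every ge-* port that is not admin-down with the description DISABLED.
# Header lines and prompts are skipped because they do not start with "ge-".
ports_not_disabled() {
    awk '$1 ~ /^ge-/ && ($2 != "down" || $4 != "DISABLED") { print $1 }'
}

# Example with one "before" line and one "after" line from the output above;
# only the port from the "before" line is reported.
printf '%s\n' \
    'ge-1/0/10       up    down elastic2013' \
    'ge-5/0/10       down  down DISABLED' | ports_not_disabled
```

An empty result means every listed port is already disabled.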
Papaul updated the task description. (Show Details)Dec 17 2018, 9:14 PM
Papaul updated the task description. (Show Details)Dec 18 2018, 10:28 PM

Change 481017 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/dns@master] DNS: Remove mgmt DNS entries for elastic2001 - elastic2024

https://gerrit.wikimedia.org/r/481017

Change 481017 merged by Dzahn:
[operations/dns@master] DNS: Remove mgmt DNS entries for elastic2001 - elastic2024

https://gerrit.wikimedia.org/r/481017

Papaul updated the task description. (Show Details)Dec 20 2018, 8:06 PM
Papaul updated the task description. (Show Details)Jan 3 2019, 6:40 PM
Papaul reassigned this task from Papaul to RobH.Jan 14 2019, 5:13 PM

This is complete. All servers are ready to be shipped out.

Gehel closed this task as Resolved.Jan 15 2019, 6:27 PM

Since @Papaul says it is all done, I'll close this. No more need to track it on our side.

RobH reopened this task as Stalled.Jan 15 2019, 6:30 PM

I was keeping lease return decom tasks open for now, since we're uncertain what is happening with them. We have to ship them back to Farnam, but don't know where to ship them yet!

Restricted Application added a project: Operations. · View Herald TranscriptJan 15 2019, 6:30 PM
Gehel changed the task status from Stalled to Open.Jan 15 2019, 6:30 PM

It looks like @RobH still needs to track this.

RobH mentioned this in Unknown Object (Task).Mar 19 2019, 12:43 AM
RobH closed this task as Resolved.Mar 28 2019, 9:13 PM