decommission miscweb2002.codfw.wmnet / miscweb1002.eqiad.wmnet
Closed, Resolved | Public | Request

Description

This task will track the decommission-hardware of servers miscweb2002.codfw.wmnet & miscweb1002.eqiad.wmnet

With the launch of updates to the decom cookbook, the majority of these steps can be handled by the service owners directly. The DC Ops team only gets involved once the system has been fully removed from service and powered down by the decommission cookbook.

miscweb2002.codfw.wmnet, miscweb1002.eqiad.wmnet

Steps for service owner:

  • - all system services confirmed offline from production use
  • - set all Icinga checks to maintenance mode/disabled while the reclaim/decommission takes place (likely done by script)
  • - remove system from all active lvs/pybal configuration
  • - remove any service group puppet/hiera/dsh config
  • - remove the site.pp entry and replace it with role(spare::system). Recommended to ensure services are offline, but not 100% required as long as the decom script below is run IMMEDIATELY.
  • - log in to a cumin host and run the decom cookbook: cookbook sre.hosts.decommission <host fqdn> -t <phab task>. This does: bootloader wipe, host power down, netbox update to decommissioning status, puppet node clean, puppet node deactivate, debmonitor removal, and a homer run.
  • - remove all remaining puppet references and all host entries in the puppet repo
  • - reassign task from service owner to a DC Ops team member and the site project (ops-sitename), depending on the site of the server
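
As a rough illustration of the cookbook step and the reassignment step above, the following Python sketch assembles the command line from the template given in the list and derives the site project from each host name. It only prints the commands and does not run anything. The helper names, the task ID `T000000`, and the reading of "ops-sitename" as `ops-` plus the second label of the FQDN are assumptions made for this example, not something the task states.

```python
import shlex
from typing import List

# Hosts named in this task.
HOSTS = ["miscweb2002.codfw.wmnet", "miscweb1002.eqiad.wmnet"]

# Hypothetical placeholder: substitute the real Phabricator task ID.
TASK_ID = "T000000"


def decom_command(fqdn: str, task_id: str) -> List[str]:
    """Build argv following the template:
    cookbook sre.hosts.decommission <host fqdn> -t <phab task>
    """
    return ["cookbook", "sre.hosts.decommission", fqdn, "-t", task_id]


def dcops_project(fqdn: str) -> str:
    """Assumed convention: the site is the second label of the FQDN,
    and the DC Ops project is 'ops-' followed by that site name."""
    site = fqdn.split(".")[1]
    return f"ops-{site}"


if __name__ == "__main__":
    for host in HOSTS:
        # Print only; the cookbook itself is run manually on a cumin host.
        print(shlex.join(decom_command(host, TASK_ID)), "->", dcops_project(host))
```

Printing rather than executing keeps the sketch safe to run anywhere, since the cookbook is destructive and is meant to be run by hand on a cumin host.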

End service owner steps / Begin DC-Ops team steps:

NOT RELEVANT: these hosts are virtual machines. The decom cookbook was run, though.

  • - system disks removed (by onsite)
  • - determine system age: systems under 5 years old are reclaimed to spare, systems over 5 years old are decommissioned.
  • - IF DECOM: system unracked and decommissioned (by onsite), update netbox with result and set state to offline
  • - IF DECOM: mgmt dns entries removed.
  • - IF RECLAIM: set netbox state to 'inventory' and hostname to asset tag
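
The age rule and the two resulting netbox states in the DC Ops steps above can be sketched as a small decision function. This is a minimal sketch, not a tool used by DC Ops: the function names and the returned dictionary keys are made up for illustration, the dates in the usage lines are hypothetical, and the checklist does not say what happens at exactly 5 years, so treating 5 completed years as "decommission" is an assumption.

```python
from datetime import date

RECLAIM_AGE_LIMIT_YEARS = 5  # threshold taken from the checklist


def completed_years(start: date, today: date) -> int:
    """Whole years elapsed between start and today."""
    years = today.year - start.year
    # Subtract one if this year's anniversary has not been reached yet.
    if (today.month, today.day) < (start.month, start.day):
        years -= 1
    return years


def disposition(in_service: date, today: date) -> dict:
    """Map system age to the follow-up named in the checklist.

    Assumption: exactly 5 completed years counts as 'decom'.
    """
    if completed_years(in_service, today) < RECLAIM_AGE_LIMIT_YEARS:
        # IF RECLAIM: netbox state 'inventory', hostname set to asset tag.
        return {"action": "reclaim", "netbox_state": "inventory",
                "rename_to_asset_tag": True}
    # IF DECOM: unrack, netbox state offline, mgmt DNS entries removed.
    return {"action": "decom", "netbox_state": "offline",
            "remove_mgmt_dns": True}


if __name__ == "__main__":
    # Hypothetical in-service dates, for illustration only.
    print(disposition(date(2020, 1, 1), date(2023, 4, 10)))
    print(disposition(date(2016, 6, 1), date(2023, 4, 10)))
```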

Event Timeline

Dzahn changed the task status from Open to In Progress.
Dzahn claimed this task.

Change 905748 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/dns@master] remove miscweb2002, was commented out fail-over machine

https://gerrit.wikimedia.org/r/905748

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: miscweb2002.codfw.wmnet

  • miscweb2002.codfw.wmnet (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster codfw to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster codfw to Netbox

Change 905748 merged by Dzahn:

[operations/dns@master] remove miscweb2002, was commented out fail-over machine

https://gerrit.wikimedia.org/r/905748

Change 902229 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] miscweb/site: remove miscweb2002 from site

https://gerrit.wikimedia.org/r/902229

Dzahn updated the task description.

Change 902229 merged by Dzahn:

[operations/puppet@production] miscweb/site: remove miscweb2002 from site

https://gerrit.wikimedia.org/r/902229

Dzahn renamed this task from "decommission miscweb2002.codfw.wmnet" to "decommission miscweb2002.codfw.wmnet / miscweb1002.eqiad.wmnet". (Apr 10 2023, 10:54 PM)
Dzahn updated the task description.

Change 907546 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/dns@master] remove miscweb1002->webserver-misc-apps

https://gerrit.wikimedia.org/r/907546

Change 907547 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] site/miscweb: remove miscweb1002, switch rsync source to miscweb1003

https://gerrit.wikimedia.org/r/907547

Dzahn raised the priority of this task from Medium to High.

Change 907546 merged by Dzahn:

[operations/dns@master] remove miscweb1002->webserver-misc-apps

https://gerrit.wikimedia.org/r/907546

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: miscweb1002.eqiad.wmnet

  • miscweb1002.eqiad.wmnet (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster eqiad to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster eqiad to Netbox

Change 907547 merged by Dzahn:

[operations/puppet@production] site/miscweb: remove miscweb1002, switch rsync source to miscweb1003

https://gerrit.wikimedia.org/r/907547