Decommission mw2135-mw2147, mw2187-mw2214 (all PowerEdge R420)
Closed, Resolved (Public)

Description

We added new servers in codfw (T247021) but didn't decommission the hosts scheduled for retirement. With the upcoming DC switchover we need to speed this up, as keeping those hosts in the cluster will just increase overall latency.

This ticket covers all Dell PowerEdge R420 servers that were procured via RT ticket #9011 in 2015, aka "type A hardware".

Event Timeline

Change 621783 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] decom mw2135 through mw2214

https://gerrit.wikimedia.org/r/621783

Change 621786 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/dns@master] decom mw2135 through mw2214

https://gerrit.wikimedia.org/r/621786

https://docs.google.com/spreadsheets/d/1rtg4DMx4glZA6T_XVLzt_OlFHQx53Eb_U8criLzCQs4/edit?usp=sharing shows in the TOTALs sheet how this affects the balance of servers between eqiad and codfw. It is still around 50% of servers in each DC.

Dzahn triaged this task as High priority. Aug 26 2020, 7:00 PM

High, _if_ we want it to happen before the switchover. We could of course also just set the hosts to "inactive" / weight 0 now and do the rest later; that would have the same effect on traffic.
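As a rough illustration of why weight 0 has the same effect on traffic as a full decom, here is a minimal sketch that assumes each host receives requests in proportion to its weight. The proportional model, the `traffic_share` helper, and the `new-host-*` names and weights are all hypothetical; only the weight of 10 for the old hosts comes from this task, and the real load balancer may behave differently in detail.

```python
def traffic_share(weights: dict[str, int]) -> dict[str, float]:
    """Return each host's fraction of traffic under simple proportional weighting."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no host has a non-zero weight")
    return {host: weight / total for host, weight in weights.items()}


# Old hosts had weight 10 (from this task); the new hosts and their
# weights are made up for the example.
pool = {"mw2187": 10, "mw2188": 10, "new-host-a": 30, "new-host-b": 30}

before = traffic_share(pool)
# Setting the old hosts to weight 0 instead of removing them from the pool.
after = traffic_share({**pool, "mw2187": 0, "mw2188": 0})

print(before["mw2187"], after["mw2187"])
print(before["new-host-a"], after["new-host-a"])
```

Under this model the weight-0 hosts stay defined in the pool but receive no requests, so the remaining cleanup can happen later without affecting users.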

@Dzahn if you're going ahead with this, please give me a heads-up, as I have a patch to merge for the decom cookbook and would like to see that it works fine in real life.

@Volans Sure, I am currently just waiting for the ok from other subscribers here (https://gerrit.wikimedia.org/r/c/operations/puppet/+/621783)

Mentioned in SAL (#wikimedia-operations) [2020-08-27T16:48:09Z] <mutante> depooling mw2187 - mw2199 - old codfw appservers of type A to be decom'ed, previously weight 10 (T260654)

Dzahn renamed this task from Decommission mw[2135-2214].codfw.wmnet to Decommission mw2187-mw2199, mw2135-mw2147, mw2200-mw2214 (all PowerEdge R420). Aug 27 2020, 6:39 PM
Dzahn added a project: SRE.
Dzahn updated the task description.
Dzahn updated the task description.

Change 621783 merged by Dzahn:
[operations/puppet@production] decom mw2187-mw2199, mw2135-mw2147, mw2200-mw2214

https://gerrit.wikimedia.org/r/621783

Change 622879 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] DHCP: remove mw2187-mw2199, mw2135-mw2147, mw2200-mw2214

https://gerrit.wikimedia.org/r/622879

Change 622879 merged by Dzahn:
[operations/puppet@production] DHCP: remove mw2187-mw2199, mw2135-mw2147, mw2200-mw2214

https://gerrit.wikimedia.org/r/622879

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw2187.codfw.wmnet

  • mw2187.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw2189.codfw.wmnet

  • mw2189.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw[2190-2194].codfw.wmnet

  • mw2190.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2191.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2192.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2193.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2194.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw2195.codfw.wmnet

  • mw2195.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw[2197-2199].codfw.wmnet

  • mw2197.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2198.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2199.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw[2135-2139].codfw.wmnet

  • mw2135.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2136.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2137.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2138.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2139.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw[2140-2144].codfw.wmnet

  • mw2140.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2141.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2142.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2143.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2144.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw[2145-2147].codfw.wmnet

  • mw2145.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2146.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2147.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw[2200-2204].codfw.wmnet

  • mw2200.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2201.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2202.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2203.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2204.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw[2205-2209].codfw.wmnet

  • mw2205.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2206.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2207.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2208.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2209.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw[2210-2212,2214].codfw.wmnet

  • mw2210.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2211.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2212.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw2214.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 622895 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] scap: remove proxy for codfw C4, mw2188

https://gerrit.wikimedia.org/r/622895

Change 622895 merged by Dzahn:
[operations/puppet@production] scap: remove proxy for codfw C4, mw2188

https://gerrit.wikimedia.org/r/622895

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw2188.codfw.wmnet

  • mw2188.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 622900 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] mediawiki: replace mw2196 with mw2336 as mcrouter proxy

https://gerrit.wikimedia.org/r/622900

All are decom'ed and done except 1 host, mw2196, which is an mcrouter proxy.

Change 622902 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] site: remove mw2187-mw2195, mw2197-mw2199, mw2135-mw2147, mw2200-mw2214

https://gerrit.wikimedia.org/r/622902

Dzahn renamed this task from Decommission mw2187-mw2199, mw2135-mw2147, mw2200-mw2214 (all PowerEdge R420) to Decommission mw2135-mw2147, mw2187-mw2199, mw2200-mw2214 (all PowerEdge R420). Aug 27 2020, 11:01 PM

Change 622902 merged by Dzahn:
[operations/puppet@production] site: remove mw2187-mw2195, mw2197-mw2199, mw2135-mw2147, mw2200-mw2214

https://gerrit.wikimedia.org/r/622902

Change 622906 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] site: remove mw2189-mw2195,mw2197-mw2199

https://gerrit.wikimedia.org/r/622906

Change 622906 merged by Dzahn:
[operations/puppet@production] site: remove mw2189-mw2195,mw2197-mw2199

https://gerrit.wikimedia.org/r/622906

Change 622900 merged by Dzahn:
[operations/puppet@production] mediawiki: replace mw2196 with mw2336 as mcrouter proxy

https://gerrit.wikimedia.org/r/622900

Change 623022 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] site: remove mw2196.codfw.wmnet

https://gerrit.wikimedia.org/r/623022

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw2196.codfw.wmnet

  • mw2196.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 623022 merged by Dzahn:
[operations/puppet@production] site: remove mw2196.codfw.wmnet

https://gerrit.wikimedia.org/r/623022

Dzahn renamed this task from Decommission mw2135-mw2147, mw2187-mw2199, mw2200-mw2214 (all PowerEdge R420) to Decommission mw2135-mw2147, mw2187-mw2214 (all PowerEdge R420). Aug 28 2020, 5:41 PM

Change 623031 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/dns@master] remove mw2135-mw2147 and mw2187-mw2214

https://gerrit.wikimedia.org/r/623031

Change 623031 merged by Dzahn:
[operations/dns@master] remove mw2135-mw2147 and mw2187-mw2214

https://gerrit.wikimedia.org/r/623031

All steps that need to be done by us as the service owner are done.

40 hosts have been removed from the repos, were shut down and had their bootloaders wiped by the decom cookbook, and their production IPs have been removed.

I made a separate ticket to hand this over to dcops, because that needs the decom template 40 times, which means 680 checkboxes :)

To be continued on T261524.
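The host count can be cross-checked against the cookbook runs logged in this task. The sketch below uses a small hand-written `expand` helper (hypothetical, not the parser the cookbook itself uses) that only handles the bracket forms that appear above, such as `mw[2210-2212,2214].codfw.wmnet`. The per-host checkbox figure is simply derived from the 680 and 40 quoted in the comment.

```python
import re


def expand(expr: str) -> list[str]:
    """Expand a host expression like 'mw[2210-2212,2214].codfw.wmnet' into hostnames."""
    match = re.fullmatch(r"(\w+?)\[([\d,\-]+)\](.*)", expr)
    if match is None:
        return [expr]  # plain hostname, nothing to expand
    prefix, body, suffix = match.groups()
    hosts = []
    for part in body.split(","):
        start, _, end = part.partition("-")
        for number in range(int(start), int(end or start) + 1):
            hosts.append(f"{prefix}{number}{suffix}")
    return hosts


# Host expressions copied from the cookbook runs in this task, in log order.
COOKBOOK_RUNS = [
    "mw2187.codfw.wmnet",
    "mw2189.codfw.wmnet",
    "mw[2190-2194].codfw.wmnet",
    "mw2195.codfw.wmnet",
    "mw[2197-2199].codfw.wmnet",
    "mw[2135-2139].codfw.wmnet",
    "mw[2140-2144].codfw.wmnet",
    "mw[2145-2147].codfw.wmnet",
    "mw[2200-2204].codfw.wmnet",
    "mw[2205-2209].codfw.wmnet",
    "mw[2210-2212,2214].codfw.wmnet",
    "mw2188.codfw.wmnet",
    "mw2196.codfw.wmnet",
]

all_hosts = sorted(host for run in COOKBOOK_RUNS for host in expand(run))
checkboxes_per_host = 680 // len(all_hosts)

print(len(all_hosts), len(set(all_hosts)), checkboxes_per_host)
```

The runs add up to 40 distinct hosts, which works out to 17 checkboxes per host. mw2213 does not appear in any run logged here, which is why the range mw2187-mw2214 in the title yields 27 hosts rather than 28.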

Change 621786 abandoned by Dzahn:
[operations/dns@master] decom mw2135 through mw2214

Reason:
duplicate of https://gerrit.wikimedia.org/r/c/operations/dns/+/623031

https://gerrit.wikimedia.org/r/621786