Migrate Failoid hosts to Stretch/Buster
Closed, Resolved (Public)

Description

The Failoid hosts are currently running jessie, but should be fairly simple to migrate. If we move to new Ganeti instances, maybe we should also switch to a name like failoid1001, which is better at indicating the DC?
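
To illustrate how a name like failoid1001 can indicate the DC, here is a minimal sketch. It assumes the first digit of the four-digit suffix encodes the data center, and the mapping is inferred only from this task (failoid1001 in eqiad, failoid2001 in codfw). The helper name `datacenter_for` and the `DC_BY_DIGIT` table are hypothetical, not part of any existing tooling.

```python
import re

# Assumed mapping, inferred only from this task:
# failoid1001 lives in eqiad, failoid2001 lives in codfw.
DC_BY_DIGIT = {"1": "eqiad", "2": "codfw"}


def datacenter_for(hostname):
    """Return the data center implied by a numbered hostname, or None.

    Hypothetical helper: looks at the first digit of a trailing
    four-digit suffix (e.g. the "1" in "failoid1001").
    """
    # Drop any domain part such as ".eqiad.wmnet".
    short_name = hostname.split(".", 1)[0]
    match = re.fullmatch(r"[a-z-]+?(\d{4})", short_name)
    if match is None:
        # Names like "roentgenium" or "tureis" carry no DC hint.
        return None
    return DC_BY_DIGIT.get(match.group(1)[0])
```

With this scheme the old names (roentgenium, tureis) yield no data center at all, which is the gap the proposed naming is meant to close.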

Details

Related Gerrit Patches:
[operations/dns@master] Remove DNS entries for roentgenium/tureis
[operations/puppet@production] Remove roentgenium/tureis
[operations/puppet@production] Switch Failoid in eqiad to failoid1001
[operations/puppet@production] Switch Failoid in codfw to failoid2001
[operations/puppet@production] Add failoid1001/2001 to site.pp

Event Timeline

+1 on the naming and +1 on buster; they just have firewall rules, so it should be pretty straightforward and easy to do.

ema moved this task from Triage to Watching on the Traffic board (Jun 3 2019, 3:10 PM).
ArielGlenn triaged this task as Normal priority (Jun 11 2019, 7:58 AM).

Change 531141 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Add failoid1001/2001 to site.pp

https://gerrit.wikimedia.org/r/531141

Change 531141 merged by Muehlenhoff:
[operations/puppet@production] Add failoid1001/2001 to site.pp

https://gerrit.wikimedia.org/r/531141

Change 531165 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Switch Failoid in codfw to failoid2001

https://gerrit.wikimedia.org/r/531165

Change 531165 merged by Muehlenhoff:
[operations/puppet@production] Switch Failoid in codfw to failoid2001

https://gerrit.wikimedia.org/r/531165

Change 531706 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Switch Failoid in eqiad to failoid1001

https://gerrit.wikimedia.org/r/531706

Change 531706 merged by Muehlenhoff:
[operations/puppet@production] Switch Failoid in eqiad to failoid1001

https://gerrit.wikimedia.org/r/531706

New VMs (failoid1001 and failoid2001) have been set up and are in active use now. I'll keep the old jessie VMs around for a few weeks "just in case".

Change 534017 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Remove roentgenium/tureis

https://gerrit.wikimedia.org/r/534017

Change 534019 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/dns@master] Remove DNS entries for roentgenium/tureis

https://gerrit.wikimedia.org/r/534019

Change 534017 merged by Muehlenhoff:
[operations/puppet@production] Remove roentgenium/tureis

https://gerrit.wikimedia.org/r/534017

cookbooks.sre.hosts.decommission executed by jmm@cumin2001 for hosts: tureis.codfw.wmnet

  • tureis.codfw.wmnet
    • Removed from Puppet master and PuppetDB
    • Downtimed host on Icinga
    • No management interface found (likely a VM)
    • Removed from DebMonitor

cookbooks.sre.hosts.decommission executed by jmm@cumin2001 for hosts: roentgenium.eqiad.wmnet

  • roentgenium.eqiad.wmnet
    • Removed from Puppet master and PuppetDB
    • Downtimed host on Icinga
    • No management interface found (likely a VM)
    • Removed from DebMonitor

Mentioned in SAL (#wikimedia-operations) [2019-09-11T10:23:02Z] <moritzm> removed roentgenium/tureis in Ganeti T224559

Change 534019 merged by Muehlenhoff:
[operations/dns@master] Remove DNS entries for roentgenium/tureis

https://gerrit.wikimedia.org/r/534019

MoritzMuehlenhoff closed this task as Resolved (Sep 11 2019, 10:33 AM).

New instances (failoid1001 and failoid2001) have been set up with Buster and are in use. The old instances (roentgenium and tureis) have been removed.