Upgrade debmonitor to Buster
Closed, Resolved, Public

Description

Upgrade debmonitor to Buster. Since the current servers are VMs, it's probably simplest to create new VMs, set up a Buster installation in parallel and then switch over.

The clients detect the debmonitor server via a discovery record. The docker-report image injection script also uses debmonitor-client, so it doesn't need a separate change.
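
Since clients find the server by name, the switch-over can be checked by comparing what the service name resolves to against the addresses of the new VM. A minimal sketch of such a check follows. The helper names `resolve_addresses` and `points_at` are hypothetical and not part of debmonitor or debmonitor-client. The sketch also assumes the service name resolves directly to the backend host's address, which will not hold if a proxy or load balancer sits in between.

```python
import socket
from typing import Callable, Set


def resolve_addresses(name: str) -> Set[str]:
    """Return the set of IP addresses that `name` currently resolves to."""
    infos = socket.getaddrinfo(name, 443, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return {info[4][0] for info in infos}


def points_at(
    service: str,
    backend: str,
    resolver: Callable[[str], Set[str]] = resolve_addresses,
) -> bool:
    """True if `service` resolves to at least one address and every one of
    its addresses also belongs to `backend`.

    `resolver` is injectable so the check can be exercised without DNS.
    """
    service_addrs = resolver(service)
    backend_addrs = resolver(backend)
    return bool(service_addrs) and service_addrs <= backend_addrs
```

For example, `points_at(service_name, new_host_fqdn)` should return `False` before the DNS change and `True` once it has propagated, where both arguments are whatever names apply in the environment at hand.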

Event Timeline

Change 623229 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Add debmonitor::server role to debmonitor2002

https://gerrit.wikimedia.org/r/623229

MoritzMuehlenhoff updated the task description.

Change 623335 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/software/debmonitor/deploy@master] Create artifacts for 0.2.7 on Buster

https://gerrit.wikimedia.org/r/623335

Change 623335 merged by Muehlenhoff:
[operations/software/debmonitor/deploy@master] Create artifacts for 0.2.7 on Buster

https://gerrit.wikimedia.org/r/623335

Change 623229 merged by Muehlenhoff:
[operations/puppet@production] Add debmonitor::server role to debmonitor2002

https://gerrit.wikimedia.org/r/623229

Change 624066 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Add debmonitor1002 as debmonitor server

https://gerrit.wikimedia.org/r/624066

Change 624066 merged by Muehlenhoff:
[operations/puppet@production] Add debmonitor1002 as debmonitor server

https://gerrit.wikimedia.org/r/624066

Change 625591 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/dns@master] Switch debmonitor to debmonitor1002

https://gerrit.wikimedia.org/r/625591

Change 625591 merged by Muehlenhoff:
[operations/dns@master] Switch debmonitor to debmonitor1002

https://gerrit.wikimedia.org/r/625591

debmonitor.wikimedia.org is now served by debmonitor1002 running Buster, and everything is working well. I'm keeping the old VMs for the rest of the week before tearing them down.

Change 625833 had a related patch set uploaded (by Volans; owner: Volans):
[operations/homer/public@master] cr: update debmonitor IPs in firewall rules

https://gerrit.wikimedia.org/r/625833

Change 625833 merged by jenkins-bot:
[operations/homer/public@master] cr: update debmonitor IPs in firewall rules

https://gerrit.wikimedia.org/r/625833

Mentioned in SAL (#wikimedia-operations) [2020-09-08T08:45:01Z] <volans> running homer 'cr*eqiad*' commit "Update debmonitor IPs, T261489"

Change 625840 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/dns@master] Remove debmonitor1001/2001

https://gerrit.wikimedia.org/r/625840

Change 625841 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Remove debmonitor1001/2001

https://gerrit.wikimedia.org/r/625841

Change 625844 had a related patch set uploaded (by Volans; owner: Volans):
[operations/homer/public@master] cr: update debmonitor IPs in firewall rules (#2)

https://gerrit.wikimedia.org/r/625844

Change 625844 merged by jenkins-bot:
[operations/homer/public@master] cr: update debmonitor IPs in firewall rules (#2)

https://gerrit.wikimedia.org/r/625844

Mentioned in SAL (#wikimedia-operations) [2020-09-08T09:37:51Z] <volans> running homer 'cr*eqiad*' commit "Update debmonitor IPs (#2), T261489"

cookbooks.sre.hosts.decommission executed by volans@cumin1001 for hosts: debmonitor2001.codfw.wmnet

  • debmonitor2001.codfw.wmnet (WARN)
    • Downtimed host on Icinga
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.codfw.wmnet to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Site codfw DNS records not yet migrated to the automatic system, manual patch required

cookbooks.sre.hosts.decommission executed by jmm@cumin1001 for hosts: debmonitor1001.eqiad.wmnet

  • debmonitor1001.eqiad.wmnet (WARN)
    • Downtimed host on Icinga
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.eqiad.wmnet to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Site eqiad DNS records not yet migrated to the automatic system, manual patch required

Change 625841 merged by Muehlenhoff:
[operations/puppet@production] Remove debmonitor1001/2001

https://gerrit.wikimedia.org/r/625841

cookbooks.sre.hosts.decommission executed by volans@cumin1001 for hosts: debmonitor2001.codfw.wmnet

  • debmonitor2001.codfw.wmnet (FAIL)
    • Failed downtime host on Icinga (likely already removed)
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.codfw.wmnet to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Site codfw DNS records not yet migrated to the automatic system, manual patch required

ERROR: some step on some host failed, check the bolded items above

Change 625840 merged by Muehlenhoff:
[operations/dns@master] Remove debmonitor1001/2001

https://gerrit.wikimedia.org/r/625840

The old Stretch instances (debmonitor1001/2001) have been removed.