Upgrade netflow VMs to Bullseye
Closed, ResolvedPublic

Description

While working on T263277 I hit an sfacctd bug that is fixed in its Bullseye package.

The easiest option at this point is to re-create the netflow* VMs on Bullseye, as this will have to happen at some point anyway.

Event Timeline

ayounsi triaged this task as Medium priority. Dec 13 2021, 12:24 PM
ayounsi created this task.

Change 746853 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/puppet@production] Make netboot.cfg generic for netflow VMs

https://gerrit.wikimedia.org/r/746853

Change 746853 merged by Ayounsi:

[operations/puppet@production] Make netboot.cfg generic for netflow VMs

https://gerrit.wikimedia.org/r/746853

Change 746862 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/puppet@production] Add netflow2002 to DHCP

https://gerrit.wikimedia.org/r/746862

Change 746862 merged by Ayounsi:

[operations/puppet@production] Add netflow2002 to DHCP

https://gerrit.wikimedia.org/r/746862

Change 746863 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/puppet@production] Include new netflow VMs in site.pp

https://gerrit.wikimedia.org/r/746863

Change 746863 merged by Ayounsi:

[operations/puppet@production] Include new netflow VMs in site.pp

https://gerrit.wikimedia.org/r/746863

Mentioned in SAL (#wikimedia-operations) [2021-12-13T14:34:33Z] <moritzm> imported fastnetmon 1.1.7+deb11u1 for bullseye-wikimedia https://phabricator.wikimedia.org/T297595

Change 747047 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/homer/public@master] Update netflow collector for codfw/eqdfw to netflow2002

https://gerrit.wikimedia.org/r/747047

Change 747047 merged by Ayounsi:

[operations/homer/public@master] Update netflow collector for codfw/eqdfw to netflow2002

https://gerrit.wikimedia.org/r/747047

Change 747050 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/puppet@production] Add new netflow hosts to Kafka jumbo ACL

https://gerrit.wikimedia.org/r/747050

Change 747050 merged by Ayounsi:

[operations/puppet@production] Add new netflow hosts to Kafka jumbo ACL

https://gerrit.wikimedia.org/r/747050

Change 747052 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/puppet@production] Add DHCP for new netflow VMs

https://gerrit.wikimedia.org/r/747052

Change 747052 merged by Ayounsi:

[operations/puppet@production] Add DHCP for new netflow VMs

https://gerrit.wikimedia.org/r/747052

Change 747056 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/homer/public@master] Update all netflow collectors to new VMs

https://gerrit.wikimedia.org/r/747056

Change 747056 merged by jenkins-bot:

[operations/homer/public@master] Update all netflow collectors to new VMs

https://gerrit.wikimedia.org/r/747056

Change 747063 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/puppet@production] Remove old netflow hosts from kafka jumbo acl

https://gerrit.wikimedia.org/r/747063

cookbooks.sre.hosts.decommission executed by ayounsi@cumin1001 for hosts: netflow2001.codfw.wmnet

  • netflow2001.codfw.wmnet (FAIL)
    • Downtimed host on Icinga
    • Host steps raised exception: Error while performing request to RAPI

ERROR: some step on some host failed, check the bolded items above

Change 747063 merged by Ayounsi:

[operations/puppet@production] Remove old netflow hosts from kafka jumbo acl

https://gerrit.wikimedia.org/r/747063

cookbooks.sre.hosts.decommission executed by ayounsi@cumin1001 for hosts: netflow4001.ulsfo.wmnet

  • netflow4001.ulsfo.wmnet (PASS)
    • Downtimed host on Icinga
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.ulsfo.wmnet to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.ulsfo.wmnet to Netbox

cookbooks.sre.hosts.decommission executed by ayounsi@cumin1001 for hosts: netflow2001.codfw.wmnet

  • netflow2001.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.codfw.wmnet to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.codfw.wmnet to Netbox

cookbooks.sre.hosts.decommission executed by ayounsi@cumin1001 for hosts: netflow1001.eqiad.wmnet

  • netflow1001.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.eqiad.wmnet to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.eqiad.wmnet to Netbox

cookbooks.sre.hosts.decommission executed by ayounsi@cumin1001 for hosts: netflow3001.esams.wmnet

  • netflow3001.esams.wmnet (PASS)
    • Downtimed host on Icinga
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.esams.wmnet to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.esams.wmnet to Netbox

cookbooks.sre.hosts.decommission executed by ayounsi@cumin1001 for hosts: netflow5001.eqsin.wmnet

  • netflow5001.eqsin.wmnet (PASS)
    • Downtimed host on Icinga
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.eqsin.wmnet to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.eqsin.wmnet to Netbox

ayounsi claimed this task.

All done!