Migrating esams to routed Ganeti
Closed, Resolved, Public

Description

An intro on routed Ganeti can be found here: https://phabricator.wikimedia.org/phame/post/view/312/ganeti_on_modern_network_design/

Routed Ganeti is already running in magru (and on two test systems in codfw). Next we'll migrate esams to it.

Esams currently consists of two separate Ganeti clusters in two different rows with two servers each.

row BY27: ganeti3005 and ganeti3007

  • atlas3001
  • bast3007
  • doh3003
  • durum3003
  • ncredir3003

row BW27: ganeti3006 and ganeti3008

  • doh3004
  • durum3004
  • install3003
  • ncredir3004
  • netflow3003
  • prometheus3003

When the migration is completed, we'll have a single four-node Ganeti cluster spanning the two rows (which also gives us flexibility in case of potential row changes at the DC).
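As a rough illustration of the before and after state, the layout above can be written down as data. This is only a sketch: the dictionary shape and the function name `merged_cluster` are made up for this example, and the node and VM names are the ones listed above.

```python
# Current layout as listed in the description; the dict shape is illustrative.
OLD_CLUSTERS = {
    "BY27": {
        "nodes": ["ganeti3005", "ganeti3007"],
        "vms": ["atlas3001", "bast3007", "doh3003", "durum3003", "ncredir3003"],
    },
    "BW27": {
        "nodes": ["ganeti3006", "ganeti3008"],
        "vms": ["doh3004", "durum3004", "install3003", "ncredir3004",
                "netflow3003", "prometheus3003"],
    },
}


def merged_cluster(clusters):
    """Return the sorted node list and a node-to-row map for one cluster spanning all rows."""
    node_rows = {
        node: row
        for row, info in clusters.items()
        for node in info["nodes"]
    }
    return sorted(node_rows), node_rows


nodes, node_rows = merged_cluster(OLD_CLUSTERS)
print(nodes)      # the four nodes of the target cluster
print(node_rows)  # which row each node sits in
```

Keeping the row next to each node is what lets a later placement decision put a VM's primary and secondary in different rows.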

The migration path will look like the following:

  • Announce that people should move away from bast3007 and use the bastion in drmrs for now
  • Deploy https://gerrit.wikimedia.org/r/c/operations/puppet/+/1180085
  • Decom atlas3001 (will be re-added later)
  • Decom bast3007 (will be re-added later)
  • Move all VMs in ganeti3005 to ganeti3007
  • Switch BY27 VMs to plain disk storage, i.e. disable DRBD for them.

During this initial period, the BY27 VMs are no longer redundant.
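Disabling DRBD comes down to converting each instance's disk template. Below is a minimal sketch, assuming the upstream Ganeti CLI (`gnt-instance shutdown`, `gnt-instance modify -t plain`, `gnt-instance startup`) and that the instance has to be stopped for the conversion. The task does not say which tooling was actually used for this step, and the helper names and short hostnames are illustrative only. The commands are built and printed, never executed.

```python
import shlex


def drbd_to_plain(instance):
    """Commands to drop DRBD replication for one instance (built, not run)."""
    return [
        ["gnt-instance", "shutdown", instance],
        ["gnt-instance", "modify", "-t", "plain", instance],
        ["gnt-instance", "startup", instance],
    ]


def plain_to_drbd(instance, secondary):
    """Reverse conversion; DRBD needs a secondary node to hold the replica."""
    return [
        ["gnt-instance", "shutdown", instance],
        ["gnt-instance", "modify", "-t", "drbd", "-n", secondary, instance],
        ["gnt-instance", "startup", instance],
    ]


# BY27 VMs left once atlas3001 and bast3007 have been decommissioned.
for vm in ["doh3003", "durum3003", "ncredir3003"]:
    for cmd in drbd_to_plain(vm):
        print(shlex.join(cmd))
```

The reverse helper corresponds to the later step where the new VMs are switched back to DRBD once a second routed node is available.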

  • Allocate IPs for esams routed Ganeti - https://netbox.wikimedia.org/ipam/prefixes/?role_id=41&site_id=1
  • Add allocated IPs to modules/network/data/data.yaml in Puppet - https://gerrit.wikimedia.org/r/c/operations/puppet/+/1180083
  • Add ganeti "customer" to Homer with the esams ranges - https://gerrit.wikimedia.org/r/c/operations/homer/public/+/1180081
  • Manually create the first IPs in Netbox so that the DNS PTR includes can be added
  • Reimage ganeti3005 with routed Ganeti T403375
  • Update ganeti3005 switch port to remove the trunked public VLAN
  • Set up routing between ganeti3005 and its ToR switch
  • Create doh3005, durum3005 and ncredir3005 on routed Ganeti and fail over services
  • Decom doh3003, durum3003 and ncredir3003
  • Reimage ganeti3007 with routed Ganeti
  • Update ganeti3007 switch port to remove the trunked public VLAN
  • Set up routing between ganeti3007 and its ToR switch
  • Remove esams01 from Netbox sync
  • Move all VMs on ganeti3006 to ganeti3008
  • Switch BW27 VMs to plain disk storage, i.e. disable DRBD for them.
  • Reimage ganeti3006 with routed Ganeti
  • Update ganeti3006 switch port to remove the trunked public VLAN
  • Set up routing between ganeti3006 and its ToR switch
  • Create atlas3001 - T403580
  • Create bast3007 on routed Ganeti
  • Switch ncredir3005, durum3005, doh3005 to DRBD
  • Create doh3006 on routed Ganeti, when done decom doh3004
  • Create durum3006 on routed Ganeti, when done decom durum3004
  • Create ncredir3006 on routed Ganeti, when done decom ncredir3004
  • Create install3004 on routed Ganeti, when done decom install3003
  • Update DHCP relay config on the switches to point to the new install3004
  • Point webproxy to the new install3004
  • Create netflow3004 on routed Ganeti, when done decom netflow3003
  • Update netflow config on esams network equipment to point to the new netflow3004
  • Create prometheus3004 on routed Ganeti with insetup role and pass on to o11y to migrate existing metrics, when done decom prometheus3003
  • Remove esams02 from Netbox sync
  • When empty of running VMs, reimage ganeti3008 with routed Ganeti
  • Update ganeti3008 switch port to remove the trunked public VLAN
  • Set up routing between ganeti3008 and its ToR switch
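Each of the four nodes goes through the same three steps (reimage, switch port update, routing setup), and a node is only reimaged once no VMs run on it. A small sketch of that pattern follows; the names `PER_NODE_STEPS`, `NODE_ORDER` and `plan_node` are invented for this illustration and do not refer to any existing cookbook.

```python
# The three steps repeated for every node in the migration path.
PER_NODE_STEPS = (
    "reimage with routed Ganeti",
    "update switch port to remove the trunked public VLAN",
    "set up routing to the ToR switch",
)

# Order in which the migration path converts the nodes.
NODE_ORDER = ["ganeti3005", "ganeti3007", "ganeti3006", "ganeti3008"]


def plan_node(node, running_vms):
    """Return the step list for one node, refusing while VMs still run on it."""
    if running_vms:
        raise ValueError(f"{node} still hosts: {', '.join(sorted(running_vms))}")
    return [f"{node}: {step}" for step in PER_NODE_STEPS]


for node in NODE_ORDER:
    # An empty list stands for a node that has already been drained.
    for step in plan_node(node, []):
        print(step)
```

Calling `plan_node("ganeti3008", ["prometheus3003"])` raises a `ValueError`, which mirrors the "when empty of running VMs" condition on the last node.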

Details

Related Changes in Gerrit:
Repo                      Branch      Lines +/-
operations/puppet         production  +1 -1
operations/puppet         production  +2 -5
operations/puppet         production  +1 -6
operations/puppet         production  +0 -3
operations/puppet         production  +0 -3
operations/homer/public   master      +2 -2
operations/puppet         production  +1 -1
operations/dns            master      +1 -1
labs/private              master      +0 -0
operations/puppet         production  +1 -1
operations/puppet         production  +0 -1
operations/puppet         production  +1 -4
operations/puppet         production  +1 -0
operations/puppet         production  +0 -1
operations/puppet         production  +0 -4
operations/homer/public   master      +1 -1
operations/puppet         production  +1 -4
operations/puppet         production  +1 -4
operations/puppet         production  +20 -0
operations/puppet         production  +2 -5
operations/puppet         production  +3 -1
operations/puppet         production  +1 -6
operations/puppet         production  +0 -3
operations/puppet         production  +0 -7
operations/puppet         production  +0 -11
operations/puppet         production  +0 -1
operations/puppet         production  +3 -6
operations/puppet         production  +1 -4
operations/puppet         production  +4 -0
operations/puppet         production  +1 -4
operations/puppet         production  +1 -4
operations/puppet         production  +12 -0
operations/puppet         production  +3 -0
operations/puppet         production  +5 -8
operations/dns            master      +15 -0
operations/homer/public   master      +14 -0
operations/puppet         production  +3 -9
operations/puppet         production  +6 -0
operations/puppet         production  +1 -3

Event Timeline

There are a very large number of changes, so older changes are hidden.

Cookbook cookbooks.sre.hosts.reimage was started by jmm@cumin2002 for host bast3007.wikimedia.org with OS bookworm

Change #1184062 merged by Muehlenhoff:

[operations/puppet@production] Remove ncredir3003

https://gerrit.wikimedia.org/r/1184062

cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: ncredir3003.esams.wmnet

  • ncredir3003.esams.wmnet (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster esams01 to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster esams01 to Netbox

Change #1184734 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Drop esams01 cluster and reimage ganeti3007

https://gerrit.wikimedia.org/r/1184734

Change #1184735 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Remove esams01 from Netbox sync

https://gerrit.wikimedia.org/r/1184735

Cookbook cookbooks.sre.hosts.reimage started by jmm@cumin2002 for host bast3007.wikimedia.org with OS bookworm completed:

  • bast3007 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Set boot media to disk
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202509041043_jmm_1995771_bast3007.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Change #1184742 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Readd bast3007 as bastion node

https://gerrit.wikimedia.org/r/1184742

Change #1184735 merged by Muehlenhoff:

[operations/puppet@production] Remove esams01 from Netbox sync

https://gerrit.wikimedia.org/r/1184735

Change #1184734 merged by Muehlenhoff:

[operations/puppet@production] Drop esams01 cluster and reimage ganeti3007

https://gerrit.wikimedia.org/r/1184734

Change #1184742 merged by Muehlenhoff:

[operations/puppet@production] Readd bast3007 as bastion node

https://gerrit.wikimedia.org/r/1184742

Change #1184774 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Add ganeti3007 to the esams03 cluster

https://gerrit.wikimedia.org/r/1184774

Change #1184774 merged by Muehlenhoff:

[operations/puppet@production] Add ganeti3007 to the esams03 cluster

https://gerrit.wikimedia.org/r/1184774

VM ncredir3005.esams.wmnet switching disk type to drbd

Change #1184967 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Add replacement insetup VMS for VMs currently running on esams02

https://gerrit.wikimedia.org/r/1184967

Change #1184967 merged by Muehlenhoff:

[operations/puppet@production] Add replacement insetup VMS for VMs currently running on esams02

https://gerrit.wikimedia.org/r/1184967

Cookbook cookbooks.sre.hosts.reimage was started by jmm@cumin2002 for host doh3006.wikimedia.org with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by jmm@cumin2002 for host doh3006.wikimedia.org with OS bookworm completed:

  • doh3006 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Set boot media to disk
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202509050723_jmm_2613787_doh3006.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Draining ganeti3007.esams.wmnet of running VMs

Change #1185047 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Make doh3006 a wikidough node

https://gerrit.wikimedia.org/r/1185047

Cookbook cookbooks.sre.hosts.reimage was started by jmm@cumin2002 for host durum3006.esams.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by jmm@cumin2002 for host durum3006.esams.wmnet with OS bookworm completed:

  • durum3006 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Set boot media to disk
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202509051252_jmm_2780597_durum3006.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Change #1185094 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Make durum3006 a durum node

https://gerrit.wikimedia.org/r/1185094

Cookbook cookbooks.sre.hosts.reimage was started by jmm@cumin2002 for host ncredir3006.esams.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by jmm@cumin2002 for host ncredir3006.esams.wmnet with OS bookworm completed:

  • ncredir3006 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Set boot media to disk
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202509051413_jmm_2825207_ncredir3006.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Change #1185107 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Add ncredir3006

https://gerrit.wikimedia.org/r/1185107

Change #1185107 merged by Muehlenhoff:

[operations/puppet@production] Add ncredir3006

https://gerrit.wikimedia.org/r/1185107

Change #1185709 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Remove ncredir3004

https://gerrit.wikimedia.org/r/1185709

Change #1185710 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Apply netinsights role to netflow3004

https://gerrit.wikimedia.org/r/1185710

Cookbook cookbooks.sre.hosts.reimage was started by jmm@cumin2002 for host netflow3004.esams.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by jmm@cumin2002 for host netflow3004.esams.wmnet with OS bookworm completed:

  • netflow3004 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Set boot media to disk
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202509080819_jmm_558602_netflow3004.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Change #1185859 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/homer/public@master] esams: update netflow3003 to 3004

https://gerrit.wikimedia.org/r/1185859

Change #1185710 merged by Muehlenhoff:

[operations/puppet@production] Apply netinsights role to netflow3004

https://gerrit.wikimedia.org/r/1185710

Change #1185859 merged by jenkins-bot:

[operations/homer/public@master] esams: update netflow3003 to 3004

https://gerrit.wikimedia.org/r/1185859

Change #1185094 merged by Muehlenhoff:

[operations/puppet@production] Make durum3006 a durum node

https://gerrit.wikimedia.org/r/1185094

Change #1185873 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/puppet@production] Kafka: remove netflow3003 ACL before decom

https://gerrit.wikimedia.org/r/1185873

Change #1185873 merged by Ayounsi:

[operations/puppet@production] Kafka: remove netflow3003 ACL before decom

https://gerrit.wikimedia.org/r/1185873

Change #1185876 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Also enable new Bird for durum3006

https://gerrit.wikimedia.org/r/1185876

cookbooks.sre.hosts.decommission executed by ayounsi@cumin1003 for hosts: netflow3003.esams.wmnet

  • netflow3003.esams.wmnet (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster esams02 to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster esams02 to Netbox

Change #1185876 merged by Muehlenhoff:

[operations/puppet@production] Also enable new Bird for durum3006

https://gerrit.wikimedia.org/r/1185876

Cookbook cookbooks.sre.hosts.reimage was started by jmm@cumin2002 for host install3004.wikimedia.org with OS bookworm

Change #1185047 merged by Muehlenhoff:

[operations/puppet@production] Make doh3006 a wikidough node

https://gerrit.wikimedia.org/r/1185047

Cookbook cookbooks.sre.hosts.reimage started by jmm@cumin2002 for host install3004.wikimedia.org with OS bookworm completed:

  • install3004 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Set boot media to disk
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202509081053_jmm_613442_install3004.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Change #1185709 merged by Muehlenhoff:

[operations/puppet@production] Remove ncredir3004

https://gerrit.wikimedia.org/r/1185709

cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: ncredir3004.esams.wmnet

  • ncredir3004.esams.wmnet (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster esams02 to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster esams02 to Netbox

VM durum3005.esams.wmnet switching disk type to drbd

Change #1185898 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Apply installserver role to install3004

https://gerrit.wikimedia.org/r/1185898

VM doh3005.wikimedia.org switching disk type to drbd

Change #1185898 merged by Muehlenhoff:

[operations/puppet@production] Apply installserver role to install3004

https://gerrit.wikimedia.org/r/1185898

Change #1185918 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/dns@master] Point webproxy in esams to install3004

https://gerrit.wikimedia.org/r/1185918

Change #1185924 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[labs/private@master] Add dummy keytab for install3004

https://gerrit.wikimedia.org/r/1185924

Change #1185924 merged by Muehlenhoff:

[labs/private@master] Add dummy keytab for install3004

https://gerrit.wikimedia.org/r/1185924

cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: durum3004.esams.wmnet

  • durum3004.esams.wmnet (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster esams02 to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster esams02 to Netbox

Change #1185938 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Apply config to enable new Bird release on the role/esams level

https://gerrit.wikimedia.org/r/1185938

Change #1185940 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/homer/public@master] Update DHCP server in esams

https://gerrit.wikimedia.org/r/1185940

cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: doh3004.wikimedia.org

  • doh3004.wikimedia.org (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster esams02 to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster esams02 to Netbox

Change #1185941 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Update DHCP server in esams

https://gerrit.wikimedia.org/r/1185941

Change #1185918 merged by Muehlenhoff:

[operations/dns@master] Point webproxy in esams to install3004

https://gerrit.wikimedia.org/r/1185918

Change #1185941 merged by Muehlenhoff:

[operations/puppet@production] Update DHCP server in esams

https://gerrit.wikimedia.org/r/1185941

Change #1185940 merged by Muehlenhoff:

[operations/homer/public@master] Update DHCP server in esams

https://gerrit.wikimedia.org/r/1185940

Change #1185938 merged by Muehlenhoff:

[operations/puppet@production] Apply config to enable new Bird release on the role/esams level

https://gerrit.wikimedia.org/r/1185938

Change #1186438 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Remove Ganeti role from ganeti3008

https://gerrit.wikimedia.org/r/1186438

Change #1186439 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Remove ganeti02/esams from Netbox sync

https://gerrit.wikimedia.org/r/1186439

cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: install3003.wikimedia.org

  • install3003.wikimedia.org (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster esams02 to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster esams02 to Netbox

Change #1186439 merged by Muehlenhoff:

[operations/puppet@production] Remove ganeti02/esams from Netbox sync

https://gerrit.wikimedia.org/r/1186439

Change #1186438 merged by Muehlenhoff:

[operations/puppet@production] Remove Ganeti role from ganeti3008

https://gerrit.wikimedia.org/r/1186438

Change #1186516 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Add ganeti3008 to the routed Ganeti cluster in esams

https://gerrit.wikimedia.org/r/1186516

Change #1186516 merged by Muehlenhoff:

[operations/puppet@production] Add ganeti3008 to the routed Ganeti cluster in esams

https://gerrit.wikimedia.org/r/1186516

MoritzMuehlenhoff claimed this task.

esams is fully migrated to routed Ganeti!

Mentioned in SAL (#wikimedia-operations) [2025-09-10T05:46:49Z] <moritzm> rebalance ganeti03 in esams T402259

Change #1186925 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Update Ganeti alias for esams

https://gerrit.wikimedia.org/r/1186925

Change #1186925 merged by Muehlenhoff:

[operations/puppet@production] Update Ganeti alias for esams

https://gerrit.wikimedia.org/r/1186925

There was a small issue with install3004: it lacked its global IPv6 address, which caused the IPv6 probes to Squid to fail. The relevant config snippet for /etc/network/interfaces is added by the late-setup script of the installer. It's unclear why that failed, but the VM was created the day after the Bookworm 12.12 point release, so possibly the VM was created before the netboot image had been updated, and that caused some silent failure. I've fixed up /etc/network/interfaces manually and IPv6 connectivity now works fine again.
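A missing global address like this can be spotted from the one-line output of `ip -6 -o addr show`. The sketch below only parses text, so it can be tried without touching a host. The function name `has_global_ipv6`, the interface name `ens13` and the sample addresses (link-local and the 2001:db8::/32 documentation prefix) are placeholders, not values from install3004.

```python
def has_global_ipv6(ip_output):
    """Return True if any `ip -6 -o addr show` line carries a global-scope address."""
    for line in ip_output.splitlines():
        fields = line.split()
        if "inet6" not in fields or "scope" not in fields:
            continue
        scope_index = fields.index("scope") + 1
        if scope_index < len(fields) and fields[scope_index] == "global":
            return True
    return False


# Placeholder output: only a link-local address, as on a host missing its config.
BROKEN = "2: ens13    inet6 fe80::1/64 scope link \\       valid_lft forever preferred_lft forever"

# Placeholder output after the interface config has been fixed up.
FIXED = BROKEN + "\n2: ens13    inet6 2001:db8::10/64 scope global \\       valid_lft forever preferred_lft forever"

print(has_global_ipv6(BROKEN))
print(has_global_ipv6(FIXED))
```

A check of this kind right after a reimage would flag a silent late-setup failure before monitoring probes start failing.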