Move maps servers to Bookworm
Closed, Resolved; Public

Authored By
MoritzMuehlenhoff
Dec 5 2024, 11:33 AM
Referenced Files
F67979074: download.png
Oct 28 2025, 2:37 PM
F67979061: download.png
Oct 28 2025, 2:37 PM
F66747148: diff.png
Oct 13 2025, 8:31 AM
F66747146: b.png
Oct 13 2025, 8:31 AM
F66747144: a.png
Oct 13 2025, 8:31 AM
F66747137: diff.png
Oct 13 2025, 8:31 AM
F66747135: b.png
Oct 13 2025, 8:31 AM
F66747132: a.png
Oct 13 2025, 8:31 AM

Description

kartotherian is being moved to wikikube. When that is complete, the server backend will be moved to Bookworm.

We'll be reusing six former Ganeti nodes as a test cluster to test the new stack (set up at T380144) until we eventually reimage the maps* nodes.

The high level plan is:

  • install a master bookworm node and fix all issues
  • install a replica bookworm node and fix all issues
  • install the remaining replicas
  • test the OSM import on the new cluster
  • point wikikube/staging to use the bookworm cluster
  • if this works fine, we can enable expiration events for bookworm and disable them for buster
  • move the prod pods to the new cluster and eventually
  • fail over production traffic in codfw to the bookworm-test cluster
  • Install the new eqiad/maps nodes with Bookworm
  • Install the new codfw/maps nodes with Bookworm
  • Failover to the new nodes
  • Decom the old servers

Details

Related Changes in Gerrit:
| Repo | Branch | Lines +/- |
|:-----|:-------|:----------|
| operations/cookbooks | master | +1 -1 |
| operations/cookbooks | master | +1 -1 |
| operations/puppet | production | +2 -12 |
| labs/private | master | +0 -2 |
| operations/puppet | production | +1 -0 |
| labs/private | master | +1 -1 |
| operations/puppet | production | +1 -0 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +20 -0 |
| operations/puppet | production | +0 -3 |
| operations/puppet | production | +3 -1 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +0 -78 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +1 -8 |
| operations/puppet | production | +0 -11 |
| labs/private | master | +0 -124 |
| labs/private | master | +2 -2 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +12 -32 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +21 -92 |
| operations/puppet | production | +3 -4 |
| operations/puppet | production | +0 -3 |
| operations/puppet | production | +1 -9 |
| operations/puppet | production | +1 -13 |
| operations/puppet | production | +0 -69 |
| operations/puppet | production | +0 -90 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +44 -0 |
| operations/puppet | production | +0 -2 |
| operations/puppet | production | +0 -2 |
| operations/puppet | production | +0 -2 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +3 -2 |
| operations/puppet | production | +0 -54 |
| operations/puppet | production | +2 -2 |
| operations/puppet | production | +0 -2 |
| operations/deployment-charts | master | +1 -9 |
| operations/puppet | production | +2 -0 |
| operations/deployment-charts | master | +11 -18 |
| operations/puppet | production | +3 -0 |
| operations/puppet | production | +14 -1 |
| operations/puppet | production | +1 -4 |
| operations/puppet | production | +12 -39 |
| operations/puppet | production | +2 -2 |
| operations/deployment-charts | master | +1 -1 |
| operations/puppet | production | +0 -2 |
| operations/puppet | production | +87 -64 |
| operations/puppet | production | +1 -0 |
| operations/puppet | production | +11 -1 |
| operations/puppet | production | +6 -0 |
| operations/puppet | production | +3 -0 |
| operations/deployment-charts | master | +15 -3 |
| operations/puppet | production | +8 -1 |
| operations/puppet | production | +5 -12 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +11 -1 |
| operations/puppet | production | +1 -1 |
| operations/deployment-charts | master | +3 -7 |
| operations/deployment-charts | master | +9 -21 |
| operations/puppet | production | +13 -1 |
| operations/puppet | production | +2 -2 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +2 -2 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +2 -2 |
| operations/puppet | production | +15 -1 |
| operations/puppet | production | +1 -15 |
| operations/puppet | production | +15 -1 |
| operations/puppet | production | +2 -3 |
| operations/puppet | production | +14 -2 |
| operations/puppet | production | +2 -2 |
| operations/puppet | production | +8 -0 |
| operations/puppet | production | +2 -2 |
| operations/puppet | production | +2 -2 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +19 -1 |
| operations/puppet | production | +19 -1 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +2 -1 |
| operations/puppet | production | +15 -44 |
| operations/puppet | production | +8 -23 |
| operations/puppet | production | +0 -4 |
| operations/deployment-charts | master | +13 -33 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +15 -0 |
| operations/puppet | production | +0 -2 |
| operations/puppet | production | +8 -8 |
| operations/deployment-charts | master | +8 -8 |
| operations/deployment-charts | master | +4 -0 |
| operations/deployment-charts | master | +1 -0 |
| operations/deployment-charts | master | +7 -12 |
| operations/deployment-charts | master | +16 -7 |
| operations/deployment-charts | master | +9 -13 |
| operations/puppet | production | +44 -15 |
| operations/puppet | production | +0 -2 |
| operations/puppet | production | +2 -1 |
| operations/puppet | production | +0 -3 |
| operations/puppet | production | +7 -3 |
| operations/puppet | production | +0 -40 |
| operations/puppet | production | +5 -7 |
| operations/mediawiki-config | master | +9 -0 |
| operations/puppet | production | +1 -0 |
| operations/deployment-charts | master | +7 -7 |
| operations/puppet | production | +1 -0 |
| operations/puppet | production | +2 -0 |
| operations/puppet | production | +1 -5 |
| operations/puppet | production | +2 -2 |
| operations/puppet | production | +5 -1 |
| operations/puppet | production | +2 -0 |
| operations/puppet | production | +924 -4 |
| operations/puppet | production | +751 -17 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +0 -154 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +15 -0 |
| operations/puppet | production | +13 -10 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +2 -2 |
| operations/puppet | production | +13 -1 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +1 -1 |
| operations/puppet | production | +2 -2 |
| operations/puppet | production | +29 -13 |
| operations/puppet | production | +2 -4 |
| operations/puppet | production | +5 -1 |
| operations/puppet | production | +10 -3 |
| labs/private | master | +4 -0 |
| operations/puppet | production | +6 -2 |
| operations/puppet | production | +15 -8 |
| operations/puppet | production | +2 -2 |
| operations/puppet | production | +1 -0 |
| operations/puppet | production | +1 -1 |
| labs/private | master | +8 -0 |
| operations/puppet | production | +2 -0 |
| operations/puppet | production | +9 -1 |
| operations/puppet | production | +61 -32 |
| operations/puppet | production | +16 -0 |
| operations/puppet | production | +53 -0 |
| operations/puppet | production | +0 -8 |
| operations/puppet | production | +3 -4 |
| operations/puppet | production | +5 -10 |
| operations/debs/osmborder | master | +8 -3 |
| operations/puppet | production | +29 -15 |
| operations/puppet | production | +0 -16 |
| operations/puppet | production | +42 -44 |
| operations/puppet | production | +2 -9 |
| operations/puppet | production | +6 -0 |
| operations/puppet | production | +4 -14 |
| operations/puppet | production | +12 -11 |
| operations/puppet | production | +3 -5 |
| operations/puppet | production | +17 -271 |

Event Timeline

There are a very large number of changes, so older changes are hidden.

Of course I am stupid: I re-executed the grants on maps2011, which was already OK. This was the correct action:

elukey@maps1011:~$ sudo -u postgres psql -f /usr/local/bin/maps-grants-gis.sql -d gis
ALTER ROLE
GRANT
GRANT
GRANT
GRANT
ALTER DEFAULT PRIVILEGES
GRANT
GRANT
GRANT
GRANT
GRANT
ALTER DEFAULT PRIVILEGES
GRANT
GRANT
GRANT
GRANT
ALTER DEFAULT PRIVILEGES
ALTER DEFAULT PRIVILEGES

Started the cache warm-up for eqiad, using tegola-swift-codfw-v003 as reference.

To check progress:

  • Kafka events consumption from tegola pregen (starts only once for each day): dashboard

Change #1198614 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/puppet@production] role::maps::master_bookworm: enable tile invalidation in eqiad

https://gerrit.wikimedia.org/r/1198614

Bootstrap completed, and it looks good:

root@thanos-fe1004:~# swift stat tegola-swift-codfw-v003 | grep Objects
                      Objects: 95190783
root@thanos-fe1004:~# swift stat tegola-swift-eqiad-v003 | grep Objects
                      Objects: 94959489

Now we just need to re-enable tile invalidation and let it catch up.
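For reference, the gap between the two containers can be quantified with a few lines of Python. This is only a sketch: it assumes the `swift stat` output has been captured as text in the shape pasted above, and the helper names `parse_object_count` and `relative_gap` are made up for illustration.

```python
import re


def parse_object_count(swift_stat_output: str) -> int:
    """Extract the 'Objects:' value from captured `swift stat <container>` output."""
    match = re.search(r"^\s*Objects:\s*(\d+)\s*$", swift_stat_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Objects:' line found in the given output")
    return int(match.group(1))


def relative_gap(reference: int, candidate: int) -> float:
    """Fraction of the reference count that the candidate lacks (negative if it has more)."""
    return (reference - candidate) / reference


# Values copied from the two `swift stat` calls above.
codfw = parse_object_count("                      Objects: 95190783\n")
eqiad = parse_object_count("                      Objects: 94959489\n")
gap = relative_gap(codfw, eqiad)

print(f"eqiad has {codfw - eqiad} fewer objects than codfw ({gap:.3%} of the reference)")
```

The remaining gap is a fraction of a percent of the reference container, which fits the expectation that tile invalidation closes it once re-enabled.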

Change #1198614 merged by Elukey:

[operations/puppet@production] role::maps::master_bookworm: enable tile invalidation in eqiad

https://gerrit.wikimedia.org/r/1198614

Change #1198908 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] imposm-initial-import: Read the permissions from a file

https://gerrit.wikimedia.org/r/1198908

Change #1198908 merged by Muehlenhoff:

[operations/puppet@production] imposm-initial-import: Read the permissions from a file

https://gerrit.wikimedia.org/r/1198908

Change #1199005 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] osm_master: Remove support for pre Bookworm

https://gerrit.wikimedia.org/r/1199005

Change #1199260 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] osm: Remove obsolete spec files

https://gerrit.wikimedia.org/r/1199260

Change #1199260 merged by Muehlenhoff:

[operations/puppet@production] osm: Remove obsolete spec files

https://gerrit.wikimedia.org/r/1199260

Change #1199265 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] osm_sync_lag.sh: Fix default to current directory

https://gerrit.wikimedia.org/r/1199265

Change #1199271 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] maps: Stop installing osm2pgsql and osmborder

https://gerrit.wikimedia.org/r/1199271

Change #1195717 abandoned by Muehlenhoff:

[operations/puppet@production] Shift tile eqiad invalidation to the bookworm master

Reason:

Old patch, no longer needed

https://gerrit.wikimedia.org/r/1195717

Ran the diff testing tool between eqiad and codfw:

| quantile |     ssim |
|---------:|---------:|
|     0.05 | 0.974994 |
|     0.1  | 0.990161 |
|     0.2  | 0.998943 |
|     0.25 | 0.999358 |
|     0.5  | 0.999917 |
|     0.75 | 1        |
|     0.9  | 1        |
|     0.95 | 1        |
|     0.99 | 1        |

| quantile |   diff_latency |
|---------:|---------------:|
|     0.1  |     -2332.79   |
|     0.2  |     -1574.56   |
|     0.25 |     -1373.6    |
|     0.5  |      -601.896  |
|     0.75 |      -120.024  |
|     0.9  |        69.9375 |
|     0.95 |       178.929  |
|     0.99 |       596.683  |

download.png (433×571 px, 13 KB)

download.png (833×1 px, 61 KB)

The diff in latency can probably be explained by the fact that codfw is currently handling the whole load, so its performance is affected. Overall I think we are good!
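To illustrate how a quantile table like the ones above can be derived from paired per-tile measurements, here is a minimal sketch. It is not the diff testing tool itself: the sample latencies are invented, the helper names are hypothetical, and the sign convention (eqiad minus codfw, so negative means eqiad answered faster) is an assumption based on the explanation above.

```python
import math

QUANTILES = (0.1, 0.2, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99)


def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted, non-empty list."""
    position = q * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def summarize(values, quantiles=QUANTILES):
    """Map each requested quantile to its value in the sample."""
    ordered = sorted(values)
    return {q: quantile(ordered, q) for q in quantiles}


# Hypothetical latencies for the same five tile requests against each site.
eqiad_latency = [120.0, 95.0, 310.0, 80.0, 150.0]
codfw_latency = [400.0, 90.0, 900.0, 300.0, 140.0]

# Assumed convention: eqiad minus codfw per request.
diff_latency = [e - c for e, c in zip(eqiad_latency, codfw_latency)]

for q, value in summarize(diff_latency).items():
    print(f"{q:>4} {value:10.3f}")
```

Reading the real table with this convention, a negative median with small positive upper quantiles would mean eqiad was faster for most requests and slower only in the tail.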

@TheDJ Hi! As FYI we now have eqiad and codfw on the new stack, both eqiad and codfw are pooled :)

I just moved all the traffic to eqiad by depooling codfw. This is the last test to make sure the new stack can handle all traffic if needed.

Change #1199265 merged by Muehlenhoff:

[operations/puppet@production] osm_sync_lag.sh: Fix default to current directory

https://gerrit.wikimedia.org/r/1199265

Change #1199271 merged by Muehlenhoff:

[operations/puppet@production] maps: Stop installing osm2pgsql and osmborder

https://gerrit.wikimedia.org/r/1199271

Repooled codfw after the eqiad-only test, I think we are good!

We'll wait a couple more days to be sure, but from next week we should start decomming the old hardware on Buster.

Change #1200030 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Re-enable monitoring for maps/bookworm

https://gerrit.wikimedia.org/r/1200030

Change #1200030 merged by Muehlenhoff:

[operations/puppet@production] Re-enable monitoring for maps/bookworm

https://gerrit.wikimedia.org/r/1200030

Not sure if this font issue T408884 is related, but it was reported around the switch to the new services, so might be worth double checking if the k8s images have the full font stack that we usually use on media servers.

@TheDJ thanks for reporting! In theory tegola and kartotherian, running on k8s, have been the same for ages; we just switched the postgres databases. I'll try to follow up in the task, but I'd need more data about what you mean by "media servers" to track down the problem.

Change #1201077 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Add separate role for single-node staging DB

https://gerrit.wikimedia.org/r/1201077

Change #1185048 abandoned by Muehlenhoff:

[operations/puppet@production] maps/bookworm: Re-enable monitoring

Reason:

Duplicate, variant already merged

https://gerrit.wikimedia.org/r/1185048

We'll keep maps-test2001 around as a separate staging system: a single PG master with no replicas, using a separate role::maps::staging (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1201077). Eventually the wikikube staging endpoint will be pointed to it. This will allow us to test changes like https://gerrit.wikimedia.org/r/c/mediawiki/services/kartotherian/+/1201020 or https://phabricator.wikimedia.org/T407491 more systematically without risk to the main production setup.

It's an old Ganeti node, but it should be good for the remainder of the FY and we can refresh it in the next one.

Change #1201077 merged by Muehlenhoff:

[operations/puppet@production] Add separate role for single-node staging DB

https://gerrit.wikimedia.org/r/1201077

Change #1201690 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Switch maps-test2001 to maps::staging

https://gerrit.wikimedia.org/r/1201690

Change #1188345 abandoned by Muehlenhoff:

[operations/puppet@production] Enable tile invalidation for the new maps nodes in codfw

Reason:

Duplicate, different patch was merged

https://gerrit.wikimedia.org/r/1188345

Mentioned in SAL (#wikimedia-operations) [2025-11-05T10:06:22Z] <moritzm> disabling Puppet on buster maps nodes for pending decom T381565

cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: maps2010.codfw.wmnet

  • maps2010.codfw.wmnet (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Downtimed management interface on Alertmanager
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: maps2005.codfw.wmnet

  • maps2005.codfw.wmnet (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Downtimed management interface on Alertmanager
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change #1202176 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Remove old maps nodes from site.pp and Hiera

https://gerrit.wikimedia.org/r/1202176

Change #1202176 merged by Muehlenhoff:

[operations/puppet@production] Remove old maps nodes from site.pp and Hiera

https://gerrit.wikimedia.org/r/1202176

Change #1202664 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] osm_replica: Remove support for pre Bookworm

https://gerrit.wikimedia.org/r/1202664

Change #1202686 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Remove legacy maps roles

https://gerrit.wikimedia.org/r/1202686

Change #1202687 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Remove kartotherian-admin group

https://gerrit.wikimedia.org/r/1202687

Change #1202689 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Remove maps-admins group

https://gerrit.wikimedia.org/r/1202689

Change #1202686 merged by Muehlenhoff:

[operations/puppet@production] Remove legacy maps roles

https://gerrit.wikimedia.org/r/1202686

Change #1202687 merged by Muehlenhoff:

[operations/puppet@production] Remove kartotherian-admin group

https://gerrit.wikimedia.org/r/1202687

Change #1202689 merged by Muehlenhoff:

[operations/puppet@production] Remove maps-admins group

https://gerrit.wikimedia.org/r/1202689

Change #1203013 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] preseed: Remove old maps nodes

https://gerrit.wikimedia.org/r/1203013

Change #1203013 merged by Muehlenhoff:

[operations/puppet@production] preseed: Remove old maps nodes

https://gerrit.wikimedia.org/r/1203013

Change #1203023 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Fix Cumin aliases for maps following removal of buster nodes

https://gerrit.wikimedia.org/r/1203023

Change #1203023 merged by Muehlenhoff:

[operations/puppet@production] Fix Cumin aliases for maps following removal of buster nodes

https://gerrit.wikimedia.org/r/1203023

Change #1199005 merged by Muehlenhoff:

[operations/puppet@production] osm_master: Remove support for pre Bookworm

https://gerrit.wikimedia.org/r/1199005

Change #1203390 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Fix cumin alias for maps

https://gerrit.wikimedia.org/r/1203390

Change #1203390 merged by Muehlenhoff:

[operations/puppet@production] Fix cumin alias for maps

https://gerrit.wikimedia.org/r/1203390

Change #1202664 merged by Muehlenhoff:

[operations/puppet@production] osm_replica: Remove support for pre Bookworm

https://gerrit.wikimedia.org/r/1202664

Change #1203835 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/puppet@production] Turn paging on for kartotherian

https://gerrit.wikimedia.org/r/1203835

Change #1204844 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Fix alias

https://gerrit.wikimedia.org/r/1204844

Change #1204844 merged by Muehlenhoff:

[operations/puppet@production] Fix alias

https://gerrit.wikimedia.org/r/1204844

Change #1204900 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Properly rename tilerator_pass variable

https://gerrit.wikimedia.org/r/1204900

Change #1204913 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[labs/private@master] Remove a lot of historical stub secrets

https://gerrit.wikimedia.org/r/1204913

Change #1204914 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Remove the new unused tilerator_pass

https://gerrit.wikimedia.org/r/1204914

Change #1204916 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Remove obsolete grants file

https://gerrit.wikimedia.org/r/1204916

Change #1205007 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[labs/private@master] Fix secret name

https://gerrit.wikimedia.org/r/1205007

Change #1205007 merged by Muehlenhoff:

[labs/private@master] Fix secret name

https://gerrit.wikimedia.org/r/1205007

Change #1204913 merged by Muehlenhoff:

[labs/private@master] Remove a lot of historical stub secrets

https://gerrit.wikimedia.org/r/1204913

Change #1205089 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Remove tilerator-admin group

https://gerrit.wikimedia.org/r/1205089

Change #1204916 merged by Muehlenhoff:

[operations/puppet@production] Remove obsolete grants file

https://gerrit.wikimedia.org/r/1204916

Change #1205089 merged by Muehlenhoff:

[operations/puppet@production] Remove tilerator-admin group

https://gerrit.wikimedia.org/r/1205089

Change #1201690 merged by Muehlenhoff:

[operations/puppet@production] Switch maps-test2001 to maps::staging

https://gerrit.wikimedia.org/r/1201690

Change #1169636 abandoned by Alexandros Kosiaris:

[operations/puppet@production] DNM: Prep patch for removal of old maps roles

Reason:

No longer applicable.

https://gerrit.wikimedia.org/r/1169636

Change #1204900 merged by Muehlenhoff:

[operations/puppet@production] Properly rename tilerator_pass variable

https://gerrit.wikimedia.org/r/1204900

Change #1204914 merged by Muehlenhoff:

[operations/puppet@production] Remove the now unused tilerator_pass

https://gerrit.wikimedia.org/r/1204914

Change #1128891 merged by Muehlenhoff:

[operations/puppet@production] osm_replica: Fix Hiera variable

https://gerrit.wikimedia.org/r/1128891

Change #1211606 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/puppet@production] kubernetes: add maps-staging-codfw IPs

https://gerrit.wikimedia.org/r/1211606

Change #1211606 merged by Elukey:

[operations/puppet@production] kubernetes: add maps-staging-codfw IPs

https://gerrit.wikimedia.org/r/1211606

Change #1211621 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] maps::osm_replica: Explicitly pass the replication password

https://gerrit.wikimedia.org/r/1211621

Change #1211625 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[labs/private@master] Update secrets for tilerator->tegola rename

https://gerrit.wikimedia.org/r/1211625

Change #1211625 merged by Muehlenhoff:

[labs/private@master] Update secrets for tilerator->tegola rename

https://gerrit.wikimedia.org/r/1211625

Change #1211621 merged by Muehlenhoff:

[operations/puppet@production] maps::osm_replica: Explicitly pass the replication password

https://gerrit.wikimedia.org/r/1211621

Change #1211684 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[labs/private@master] Remove obsolete stub secrets

https://gerrit.wikimedia.org/r/1211684

MoritzMuehlenhoff claimed this task.

This is complete. Luca and I made a total of 122 commits to puppet.git (plus surely a few where we forgot to tag the task) for:

  • After initial tests on repurposed old Ganeti nodes, the stack was installed on new servers using standard server types (reduced from six to four nodes per site thanks to more powerful specs) and the old nodes were decommissioned
  • Kartotherian was moved to Wikikube and the new nodes no longer have any traces of it, making all future updates less complex
  • The new nodes use Bookworm instead of Buster, refreshing the OS stack in many regards, e.g. moving PostgreSQL from 11 to 15 and PostGIS from 3.1 to 3.5. This required several fixups, e.g.
    • PostgreSQL is now stricter in the configuration of replication slots and max_wal_senders needs to be identical on masters and replicas.
    • Grants needed to be updated: on Postgres 15 it is no longer allowed by default to create tables in the public schema.
    • With the move to Postgres 15 the password hashing changed from md5 (for which we could precompute the hash in the postgresql user defines) to scram-sha-256, which gets salted with a value we've so far been unable to extract to compare the hash in Puppet, so password changes are currently not supported until T326325 is resolved. To work around that, they are now configured as part of the initial OSM import.
  • Installed with UEFI, software RAID and standardised Partman recipes (the old nodes used hardware RAID)
  • Migrated to nftables as the firewall provider
  • We fixed the setup to allow the nodes to have proper AAAA records in DNS. This also needed configuration tweaks in PostgreSQL since the grants are only based on IPv4 addresses.
  • We updated imposm to the latest version (and tracked down a data corruption bug which got introduced in October via new OSM updates)
  • The Swift scripts were refactored (https://github.com/wikimedia/operations-puppet/commit/e3ebe216c19ce604a306b3255a7c1bd4b47d9af36)
  • We tuned the Postgres performance settings, increasing max-conns and the shared buffer size, to make use of the beefier hardware and improve performance
  • Some slow queries were identified (to be indexed when we have a staging host): T407491
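On the password hashing point above, the following sketch shows why the old md5 value could be precomputed while a SCRAM verifier can only be compared when the stored salt is known. It follows PostgreSQL's documented SCRAM-SHA-256 verifier layout (`SCRAM-SHA-256$<iterations>:<salt>$<StoredKey>:<ServerKey>`), but it is not the Puppet code and not a fix for T326325. The user name and password are placeholders, the function names are hypothetical, and SASLprep normalisation is skipped, so it only holds for plain ASCII passwords.

```python
import base64
import hashlib
import hmac
import os


def md5_verifier(password: str, username: str) -> str:
    """Legacy format: deterministic, so it can be computed ahead of time."""
    return "md5" + hashlib.md5((password + username).encode("utf-8")).hexdigest()


def scram_sha256_verifier(password: str, salt: bytes, iterations: int = 4096) -> str:
    """Build a SCRAM-SHA-256 verifier string for a given salt (ASCII passwords only)."""
    salted = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    stored_key = hashlib.sha256(client_key).digest()
    server_key = hmac.new(salted, b"Server Key", hashlib.sha256).digest()

    def b64(raw: bytes) -> str:
        return base64.b64encode(raw).decode("ascii")

    return f"SCRAM-SHA-256${iterations}:{b64(salt)}${b64(stored_key)}:{b64(server_key)}"


def password_matches(password: str, stored_verifier: str) -> bool:
    """Recompute the verifier with the salt and iteration count embedded in the stored one."""
    _scheme, params, _keys = stored_verifier.split("$")
    iterations, salt_b64 = params.split(":")
    candidate = scram_sha256_verifier(password, base64.b64decode(salt_b64), int(iterations))
    return hmac.compare_digest(candidate, stored_verifier)


# Placeholder credentials, for illustration only.
user, password = "example_user", "example-password"

# md5: the same input always yields the same string.
assert md5_verifier(password, user) == md5_verifier(password, user)

# SCRAM: two random salts yield two different verifiers for the same password.
first = scram_sha256_verifier(password, os.urandom(16))
second = scram_sha256_verifier(password, os.urandom(16))
print(first != second)                      # the strings differ
print(password_matches(password, first))    # yet both validate the password
print(password_matches(password, second))
```

The design consequence is that a configuration management tool cannot decide whether a password changed by comparing against a fixed string; it has to read the salt and iteration count back from the stored verifier and recompute, as `password_matches` does here.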

Lots of things were also cleaned up in Puppet:

Change #1211684 merged by Muehlenhoff:

[labs/private@master] Remove obsolete stub secrets

https://gerrit.wikimedia.org/r/1211684

Change #1212062 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] postgis: Remove support for buster

https://gerrit.wikimedia.org/r/1212062

Change #1212062 merged by Muehlenhoff:

[operations/puppet@production] postgis: Remove support for buster

https://gerrit.wikimedia.org/r/1212062

Change #1212099 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/cookbooks@master] sre.maps.roll-restart-reboot-master: Adapt to Bookworm changes

https://gerrit.wikimedia.org/r/1212099

Change #1212100 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/cookbooks@master] sre.maps.roll-restart-reboot: Adapt for Bookworm

https://gerrit.wikimedia.org/r/1212100

Change #1212099 merged by Muehlenhoff:

[operations/cookbooks@master] sre.maps.roll-restart-reboot-master: Adapt to Bookworm changes

https://gerrit.wikimedia.org/r/1212099

Change #1212100 merged by Muehlenhoff:

[operations/cookbooks@master] sre.maps.roll-restart-reboot: Adapt for Bookworm

https://gerrit.wikimedia.org/r/1212100