
Create a new bucket for Tegola's tile cache and duplicate its data
Open, Needs Triage, Public

Description

Hi!

As part of the parent task, we are working on moving the maps servers to Bookworm. They run Postgres, which is in turn used by Tegola to fetch data and render its tile cache on Thanos Swift (without that cache, we won't be able to sustain the current traffic).

The current Swift user is tegola:prod, and we are currently storing ~350M tile objects, for a total of ~450G in each of the following buckets: tegola-swift-codfw-v002 and tegola-swift-eqiad-v002 (one per DC).

We'd need to do the following:

  1. Create two new buckets, one per DC, named tegola-swift-codfw-v003 and tegola-swift-eqiad-v003 (easy enough with s3cmd).
  2. Use the new Postgres cluster on maps-test2* to regenerate the tile cache, since we want to make sure we can re-render everything with the new setup. This will effectively double the capacity currently used in each Thanos Swift cluster.
  3. Eventually, when we feel ready, drop all the data in the old buckets (we do want to keep a fallback in the meantime).
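Step 1 could be sketched as follows. This is a dry run for illustration, assuming an s3cmd config already pointing at Thanos Swift with the tegola:prod credentials; drop the echo to actually create the buckets:

```shell
# Dry-run sketch of step 1: create the two new v003 buckets, one per DC.
# Assumes s3cmd is configured for Thanos Swift with the tegola:prod keys.
for dc in codfw eqiad; do
  echo "s3cmd mb s3://tegola-swift-${dc}-v003"   # drop the echo to really create it
done
```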

The old buckets will not receive new data.

Is this something we can do right now? Or are we at capacity on Thanos Swift, so it would be preferable to drop the old data first and then warm up the cache to generate the new one?

Event Timeline

Quick question: I'm concerned about the rather vague timeline for deleting tegola-swift-eqiad-v002 and tegola-swift-codfw-v002; it'd be easier to be relaxed about your proposal if there were a timeline for going back to current-ish usage rather than 2x usage...?

To put a little more context on that:

root@thanos-fe1004:/home/mvernon# for b in $(swift list); do swift stat --lh  "$b" | grep -E 'Container|Bytes' ; done
                    Container: tegola-swift-codfw-v002
                        Bytes: 82G
                    Container: tegola-swift-codfw-v003
                        Bytes: 0
                    Container: tegola-swift-container
                        Bytes: 59G
                    Container: tegola-swift-eqiad-v002
                        Bytes: 87G
                    Container: tegola-swift-eqiad-v003
                        Bytes: 0
                    Container: tegola-swift-fallback
                        Bytes: 38G
                    Container: tegola-swift-new
                        Bytes: 46G
                    Container: tegola-swift-staging-container
                        Bytes: 32G
                    Container: tegola-swift-v001
                        Bytes: 71G

So of the ~419G of data in that account, 169G is in containers that I infer are actually in use, and the remaining 250G is old data?

@Jgiannelos @MSantos Hi! My understanding is that Tegola is now using tegola-swift-codfw-v002 and tegola-swift-eqiad-v002, but the other containers are old data that we can drop. More specifically:

  • tegola-swift-fallback
  • tegola-swift-new
  • tegola-swift-v001
  • tegola-swift-container

What do you think? Also adding @jijiki, since she worked on Tegola in the past and might provide more context on those names.

...so ideally, delete all the old data and then you can just go ahead (and maybe let's make a rough plan for "when to delete v002"?) :)

> ...so ideally, delete all the old data and then you can just go ahead (and maybe let's make a rough plan for "when to delete v002"?) :)

We should be able to delete v002 once the new maps hardware arrives next quarter and both the eqiad and codfw clusters are reimaged to Bookworm (v003) and resynced from OSM, likely Sep-Oct.

I double checked:

  • eqiad uses
bucket = "tegola-swift-eqiad-v002"
  • codfw uses
bucket = "tegola-swift-codfw-v002"
  • staging
bucket = "tegola-swift-staging-container"

The rest are not in use.

@MatthewVernon @MoritzMuehlenhoff I am planning to do the following:

  • log on thanos-fe1004
  • sudo su; source /etc/swift/account_AUTH_tegola.env

And then:

swift delete tegola-swift-container
swift delete tegola-swift-fallback
swift delete tegola-swift-new
swift delete tegola-swift-v001

Does that sound ok? Do I need to do it in codfw too, or will replication take care of it? I'd guess the former, but I'd like to be sure.

Change #1160688 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/puppet@production] role::maps::master: fix Tegola container name

https://gerrit.wikimedia.org/r/1160688

FWIW, I use sudo bash ; . /etc/swift/accountfile.env, but yes. Those commands will take some time to run I expect, so run them in a screen/tmux. The thanos-swift cluster is one cluster stretched between both DCs, so those deletes will remove content from both DCs.

Silly question while I'm here - do you need 2 buckets, each of which ends up replicated cross-DC? [The answer might be yes, but it seemed worth checking]

Change #1160688 merged by Elukey:

[operations/puppet@production] role::maps::master: fix Tegola container name

https://gerrit.wikimedia.org/r/1160688

Mentioned in SAL (#wikimedia-operations) [2025-06-18T12:43:27Z] <elukey> drop old Thanos Swift's Tegola tile cache containers - T396584

> Silly question while I'm here - do you need 2 buckets, each of which ends up replicated cross-DC? [The answer might be yes, but it seemed worth checking]

Far from silly, I am not totally sure. In theory every DC has its own separate stack, and as long as the cache is properly replicated within its DC we shouldn't need any extra replication. @Jgiannelos what do you think? Would it make sense to avoid replicating all the Tegola tiles from the eqiad swift cluster to the codfw one (and vice-versa) ?

@MatthewVernon would it be ok to start the upload of the new tiles to Swift, while we are removing the other ones? I am using the default concurrency for the swift delete but it is going to take weeks to clean up everything.

We could also bump the concurrency used by swift delete (10 at the moment), to something like 100, what do you think?
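For reference, the knob in question is swift delete's --object-threads flag, which defaults to 10 in python-swiftclient's CLI. A dry-run sketch of the bumped version (the echo is kept in for illustration; drop it to run for real):

```shell
# Dry-run sketch: the same deletes, with object-level concurrency raised
# from the default 10 to 100 via swift delete's --object-threads flag.
threads=100
for c in tegola-swift-container tegola-swift-fallback tegola-swift-new tegola-swift-v001; do
  echo "swift delete --object-threads ${threads} ${c}"   # drop the echo to run it
done
```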

@elukey yes, that should be fine to start upload - ping me when the deletion is done, please?

I'm wary of upping the concurrency unless we have to, to avoid too much load on the thanos-swift frontends.

> @elukey yes, that should be fine to start upload - ping me when the deletion is done, please?
>
> I'm wary of upping the concurrency unless we have to, to avoid too much load on the thanos-swift frontends.

@MatthewVernon I agree, but we could start and measure; in case of slowness we can back off. Deleting millions of objects at this rate will take days or weeks: in the past 12-15 hours I was able to delete ~2M objects, and there are still 48M to go in the current target container alone (not counting the others).

@elukey You could delete each container in parallel (in a separate tmux/screen window or whatever)? That would get us some more parallelism, but less aggressively than bumping the concurrency in swift delete. I'm relaxed about it taking a few weeks, but can see that having it all under way rather than waiting weeks and then starting the next container going would be annoying...
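The per-container parallelism could be sketched like this, as a dry run. It assumes a tmux session named swift-cleanup was created beforehand (e.g. with tmux new-session -d -s swift-cleanup; the session name is just illustrative):

```shell
# Dry-run sketch: one tmux window per container, each running its own
# swift delete, instead of raising --object-threads. Remove the echo
# to actually open the windows.
for c in tegola-swift-container tegola-swift-fallback tegola-swift-new tegola-swift-v001; do
  cmd="source /etc/swift/account_AUTH_tegola.env && swift delete ${c}"
  echo "tmux new-window -t swift-cleanup -n ${c} '${cmd}'"
done
```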

> Silly question while I'm here - do you need 2 buckets, each of which ends up replicated cross-DC? [The answer might be yes, but it seemed worth checking]
>
> Far from silly, I am not totally sure. In theory every DC has its own separate stack, and as long as the cache is properly replicated within its DC we shouldn't need any extra replication. @Jgiannelos what do you think? Would it make sense to avoid replicating all the Tegola tiles from the eqiad swift cluster to the codfw one (and vice-versa)?

@Jgiannelos can verify, but to my knowledge those are two completely separate stacks, with no cross-replication enabled whatsoever. Last time we re-created them, thanos-swift was not an option, but maybe it is something we could discuss again?

Historically there were many cases where maps had issues that led to stale caches and/or the need to switch over to a single DC. I think we chose this setup to have two separate stacks and to be able to isolate one DC with relatively fresh tiles in case something happens.

I don't think we necessarily need both buckets replicated in both DCs. We could potentially have one bucket per DC.

> Historically there were many cases where maps had issues that led to stale caches and/or the need to switch over to a single DC. I think we chose this setup to have two separate stacks and to be able to isolate one DC with relatively fresh tiles in case something happens.
>
> I don't think we necessarily need both buckets replicated in both DCs. We could potentially have one bucket per DC.

Luca and I discussed this on IRC: for the current Bookworm refresh we'll stick with the status quo. When we later move to APUS/S3 (context at https://phabricator.wikimedia.org/T395659), we can revisit the replication.

We are still dropping old buckets, it takes a really long time but I have a tmux session on thanos-fe1004 that is doing it. I'll report when done!

Updated deletion list:

tegola-swift-codfw-v002
tegola-swift-eqiad-v002
tegola-swift-staging-codfw-v001
tegola-swift-v001

Coming back again to the deletion. Using the swift command is quite painful, so I tried s3cmd from stat1010 and I have to say it is much quicker and nicer. I am deleting the -v002 buckets as we speak, the last ones remaining. After they're done I'll add documentation on Wikitech about how to quickly and properly clean up old buckets when needed.

I think it will take ~2 weeks to delete tegola-swift-codfw-v002 and another ~2 weeks for the eqiad variant; I'll update the task once done.

In the meantime, I created https://wikitech.wikimedia.org/wiki/Maps/v2/Common_tasks#Cleanup_old_Swift_buckets to have a reference about how to do the cleanup.
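A condensed dry-run sketch of that s3cmd cleanup flow (del --recursive empties the bucket, rb then removes the empty bucket; assumes ~/.s3cfg points at Thanos Swift with the right credentials):

```shell
# Dry-run sketch of the s3cmd cleanup: empty each bucket, then remove it.
# Drop the echos to actually run the deletions.
for b in tegola-swift-codfw-v002 tegola-swift-eqiad-v002; do
  echo "s3cmd del --recursive s3://${b}"   # delete all objects in the bucket
  echo "s3cmd rb s3://${b}"                # remove the now-empty bucket
done
```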

Next steps:

  • Wait for the deletion of tegola-swift-codfw-v002 and tegola-swift-eqiad-v002

Thanks for doing this and documenting the approach :)