Until now we have been pregenerating all expired tiles on each OSM diff sync, at all zoom levels (deduplicated).
We now know that pregenerating only a subset of the planet is enough to keep the cache pretty hot (>95% hits) and the latency pretty low (p99 < 1s).
One idea to improve the maps data pipeline is, instead of pregenerating all OSM expired tiles, to:
- Extract all tile keys (z/x/y) from Swift using the swift CLI
- Generate a tileset from the intersection of the expired tileset and the cached tileset
- Send expiration events only for this subset
This can even be done with basic bash commands in our current scripts.
From a quick check I did on a random OSM sync run, this reduces the tiles to be pregenerated from ~5 million (6 hours of pregeneration) to ~800k (~1 hour of pregeneration).
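
To make the intersection step concrete, here is a minimal sketch in Python. It assumes both inputs are plain lists of keys, one `z/x/y` key per line; the real Swift object names and the expired tiles list may need normalising first. The function names `parse_tile_keys` and `tiles_to_pregenerate` are hypothetical and not part of our current scripts.

```python
from typing import Iterable, List, Set


def parse_tile_keys(lines: Iterable[str]) -> Set[str]:
    """Collect well-formed 'z/x/y' keys, skipping blank or malformed lines.

    Assumption: each input line holds exactly one key in z/x/y form.
    """
    keys = set()
    for line in lines:
        key = line.strip()
        parts = key.split("/")
        if len(parts) == 3 and all(p.isdigit() for p in parts):
            keys.add(key)
    return keys


def tiles_to_pregenerate(expired_lines: Iterable[str],
                         cached_lines: Iterable[str]) -> List[str]:
    """Return the expired tiles that are also present in the cache.

    Only these tiles would get expiration events; expired tiles that
    were never cached are left to be rendered on demand.
    """
    expired = parse_tile_keys(expired_lines)
    cached = parse_tile_keys(cached_lines)
    # Sort numerically by (z, x, y) so the output is deterministic.
    return sorted(expired & cached,
                  key=lambda k: tuple(int(p) for p in k.split("/")))


if __name__ == "__main__":
    # Made-up sample keys, for illustration only.
    expired = ["10/511/340", "10/512/340", "12/2048/1361", ""]
    cached = ["10/512/340\n", "12/2048/1361\n", "14/1/1\n"]
    print(tiles_to_pregenerate(expired, cached))
```

In a bash script the same intersection could probably be expressed with `comm -12` on two sorted key files, which prints only the lines common to both.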