Swift has been proposed before as an alternative backend to store map tiles. It would make sense to revisit this option. Swift is specialized in storing unstructured data (blobs), whereas PostgreSQL is more oriented toward structured relational data.
Description
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Resolved | | ssastry | T263854 [Maps] Modernize Vector Tile Infrastructure |
| Resolved | | Jgiannelos | T196474 Externalize tile storage for maps |
| Resolved | | Jgiannelos | T149885 Investigate Swift as a storage backend for maps tiles |
Event Timeline
What sorts of traffic and object size/numbers are we talking about for swift?
swift in codfw could be used for tests for this, it has the same specs as the eqiad swift cluster.
For historical perspective, when we were working on the initial design, I did investigate Swift. I decided against it because of its reputation for being slow and because we didn't need its killer feature, the ability to cheaply stream large files. Postgres wasn't an option because I wasn't sure how many tilesets we would need to store, so I felt we needed the ability to store a dataset larger than a single box has space for. This was easily doable in Cassandra by manipulating the replica count.
Now that we know our space requirements are still low, we can investigate our options further. What are our requirements besides not being Cassandra and not using the JVM?
I'd say a minimum of durability, and being "production ready" of course. Speaking HTTP is a plus IMO because it is easy to interact with and can serve users more or less directly.
On swift itself, I'm assuming tiles can and will be refreshed at will. If the dataset is big enough and can be regenerated, we can also consider storing it at 2x replication to save some space (after T151648: Implement storage policies for swift is done, that is).
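To illustrate the point about HTTP being easy to interact with, here is a minimal sketch of how a tile could be written to and read from Swift through its object API (`PUT`/`GET` on `/v1/{account}/{container}/{object}` with an `X-Auth-Token` header). The storage URL, the container name, the `z/x/y.pbf` object naming, and the content type are all assumptions made up for the example, not something decided in this task. The sketch only builds the requests; sending them would need a real cluster and token.

```python
from urllib.parse import quote
from urllib.request import Request


def tile_object_url(storage_url, container, z, x, y, ext="pbf"):
    """Build the Swift object URL for one tile.

    The z/x/y.ext object naming is a hypothetical layout for illustration.
    """
    name = f"{z}/{x}/{y}.{ext}"
    # quote() keeps "/" by default, so the pseudo-directory structure survives.
    return f"{storage_url.rstrip('/')}/{quote(container)}/{quote(name)}"


def put_tile_request(storage_url, container, z, x, y, data, token):
    """Prepare (but do not send) a PUT request that uploads one tile."""
    return Request(
        tile_object_url(storage_url, container, z, x, y),
        data=data,
        method="PUT",
        headers={
            "X-Auth-Token": token,
            # Assumed content type for vector tiles.
            "Content-Type": "application/x-protobuf",
        },
    )


def get_tile_request(storage_url, container, z, x, y, token):
    """Prepare (but do not send) a GET request that fetches one tile."""
    return Request(
        tile_object_url(storage_url, container, z, x, y),
        method="GET",
        headers={"X-Auth-Token": token},
    )


if __name__ == "__main__":
    # Hypothetical endpoint and container, for illustration only.
    req = put_tile_request(
        "https://swift.example.org/v1/AUTH_maps", "tiles", 3, 4, 5, b"...", "TOKEN"
    )
    print(req.get_method(), req.full_url)
```

Passing either request to `urllib.request.urlopen` would perform the actual call. Since tiles can be regenerated, a failed `GET` could simply fall back to rendering the tile again.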
This has to be considered and you can find out more about the investigation at T272843#6822224.
A decision to proceed with Swift as tile storage has been made and you can check the Architectural Decision Record at https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Kartographer/+/665092/4/adr/tile_storage.md
Please re-open or reach out if you have any questions or considerations.