
Compare resources needed for Redis and Swift storage in order to replace Cassandra.
Closed, ResolvedPublic

Description

Currently, as part of refactoring the maps infrastructure, we are considering moving away from Cassandra for tile storage and using Redis as a cache for tegola.
To check our currently available resources and potentially scale them down, it is useful to compare the storage size needed by Cassandra and by Redis.

Open Questions
  • Is Swift a good candidate for tile storage?
Acceptance Criteria
  • Benchmark Cassandra as a baseline
  • Benchmark Redis for tile storage
  • Evaluate whether Swift can be used and benchmark if possible

Event Timeline

I ran the tile pre-generation in our current stack (tilerator + cassandra):

  • initialized postgres with a pbf file ~80 MB
  • spawned a tile pre-generation job for
    • zoom range: 0 - 10
    • full planet

Here are some generic Docker metrics from my environment:
https://snapshot.raintank.io/dashboard/snapshot/RVAGJsWI62Wad8I5QOFogMCMtRByhwI8

The container under investigation is kartodock_cassandra_1.
After a node drain and a graceful shutdown, the persistent storage was ~90 MB.

cqlsh> SELECT COUNT(*) FROM v4.tiles;

 count
--------
 856728

(1 rows)

I then ran the same benchmark for tegola backed by redis:

  • Same postgres setup, ~80 MB input
  • Spawned a tile pre-generation job for
    • zoom range: 0 - 10
    • full planet

Here are some generic Docker metrics from my environment:
https://snapshot.raintank.io/dashboard/snapshot/nytPAe3JrtQXBLbdXCoCFQUQNLOlLadi

The container under investigation is wmf-maps-spikes-redis_1.
Total memory used in Redis: ~1.5 GB

127.0.0.1:6379> select 0
OK
127.0.0.1:6379> DBSIZE
(integer) 1398101
127.0.0.1:6379>
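
As a rough cross-check of the two runs, the reported totals can be turned into an average footprint per stored tile. This is only a sketch: it assumes MB and GB are decimal units, and it compares Cassandra's on-disk size with Redis's in-memory size, which are not measured the same way. The function name is made up for illustration.

```python
def bytes_per_tile(total_bytes, tile_count):
    """Average storage footprint per tile, in bytes."""
    return total_bytes / tile_count


# Figures reported in this task (sizes are approximate, decimal units assumed).
cassandra_per_tile = bytes_per_tile(90 * 10**6, 856728)    # ~90 MB on disk
redis_per_tile = bytes_per_tile(1.5 * 10**9, 1398101)      # ~1.5 GB in memory

print(f"cassandra: {cassandra_per_tile:.0f} bytes/tile")
print(f"redis:     {redis_per_tile:.0f} bytes/tile")
```

Both inputs are rounded ("~90 MB", "~1.5 GB"), so the result is an order-of-magnitude indication at best.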

@MSantos any idea why the full planet on tilerator was ~857k tiles while the same run on redis was ~1.4M? Are we doing some sort of optimization on tilerator?
@hnowlan Let me know if you need any further insight from the benchmark.
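
A note on the counts: the Redis DBSIZE of 1398101 is exactly the number of tiles in a complete pyramid for zoom 0 to 10, since each zoom level z has 4^z tiles. The sketch below checks that arithmetic and shows how many fewer rows the Cassandra table holds. It does not explain why tilerator stored fewer tiles. The function name is illustrative.

```python
def full_pyramid_tile_count(min_zoom, max_zoom):
    """Number of tiles in a complete XYZ pyramid: 4**z tiles per zoom level."""
    return sum(4 ** z for z in range(min_zoom, max_zoom + 1))


redis_keys = 1398101       # DBSIZE reported above
cassandra_rows = 856728    # SELECT COUNT(*) FROM v4.tiles reported above

expected = full_pyramid_tile_count(0, 10)
print(expected == redis_keys)       # True: Redis holds one key per tile address
print(expected - cassandra_rows)    # 541373 tile addresses with no Cassandra row
```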

Other than that, I think it's pretty safe to say that Redis could even need fewer resources.

Jgiannelos renamed this task from "Compare resources needed for Redis storage in order to replace Cassandra." to "Compare resources needed for Redis and Swift storage in order to replace Cassandra." Feb 8 2021, 4:16 PM
MSantos triaged this task as High priority. Feb 8 2021, 4:17 PM

Swift

Compatibility

It looks like tegola can be used with Swift through its s3api compatibility.
Local setup:

  • Devstack for a local openstack/swift setup
  • Created openstack users with access to swift
  • Created ec2 compatible credentials

The only fix required in tegola was: https://github.com/johngian/tegola/commit/4168bc699cbf38269f1ba1861a6f74abf7a0c3b6
The reason is that for S3 tegola uses a DNS-based bucket address, whereas for Swift we need a path-based one (e.g. https://maps.swifthostname:swiftport/ vs http://swifthostname:swiftport/maps).
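
To make the difference between the two addressing styles concrete, here is a small sketch that builds both URL forms for a bucket. It is not tegola's code (tegola is written in Go and the actual change is the commit linked above). The function name, host and port are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def bucket_url(endpoint, bucket, path_style):
    """Build a bucket URL in path style (Swift s3api) or DNS style (default S3)."""
    parts = urlsplit(endpoint)
    if path_style:
        # Path style: the bucket is the first path segment.
        path = parts.path.rstrip("/") + "/" + bucket
        return urlunsplit((parts.scheme, parts.netloc, path, "", ""))
    # DNS style: the bucket is prepended to the host name.
    return urlunsplit((parts.scheme, bucket + "." + parts.netloc, parts.path, "", ""))


# Hypothetical endpoint, for illustration only.
print(bucket_url("http://swifthostname:8080", "maps", path_style=True))
print(bucket_url("http://swifthostname:8080", "maps", path_style=False))
```

DNS-style addressing also requires the bucket subdomain to resolve, which a local Devstack setup typically does not provide.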

Benchmark

Given our restrictions, it looks like tegola tile generation performance is bound neither by the software nor by the tegola host resources, but by the request throughput that we are willing to push to the Swift API endpoint (~tens of req/s).
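
To get a feel for what a throughput cap of this order means, the sketch below estimates how long it would take to write a full zoom 0-10 pyramid (1398101 tiles, the Redis key count reported earlier) at a few request rates. The rates are illustrative values within "tens of req/s", not measurements, and the estimate assumes one PUT per tile with no retries.

```python
def upload_hours(tile_count, requests_per_second):
    """Hours needed to upload tile_count objects at a fixed request rate."""
    return tile_count / requests_per_second / 3600


FULL_PYRAMID_Z0_Z10 = 1398101  # tiles for zoom 0-10, from the Redis DBSIZE above

for rate in (10, 20, 50):  # assumed rates, for illustration only
    print(f"{rate} req/s -> {upload_hours(FULL_PYRAMID_Z0_Z10, rate):.1f} h")
```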

Just as a reference:

  • From a quick run using openmaptiles, the numbers look similar to redis (~300 ms - ~500 ms per tile), mostly spent on the DB asMVT query

Bundled upload

  • Generated tiles on the local fs
  • For ~6.5k tiles I created an archive of ~5 MB (original size ~25 MB). The bundle upload took ~10 min in a single-node setup (4 vCPU / 8 GB).

curl -X PUT "http://swift-endpoint/v1/AUTH_<id>/maps-bulk/?extract-archive=tar.gz" \
    -T bundled-tiles.tar.gz \
    -H "X-Auth-Token: <token>" \
    -H "Content-Type: application/x-gzip" \
    -H "X-Detect-Content-Type: true"

Number Files Created: 6771
Response Body: 
Response Status: 201 Created
Errors:
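
For reference, the archive passed to the bulk upload can be produced with a few lines of Python. This is a minimal sketch, assuming the tiles were generated into a local directory tree. The function name and the paths are hypothetical, and the object names in Swift will follow the relative paths stored in the archive.

```python
import tarfile
from pathlib import Path


def bundle_tiles(tile_root, archive_path):
    """Pack every file under tile_root into a gzip-compressed tar archive.

    Paths inside the archive are relative to tile_root. Returns the number
    of files added.
    """
    tile_root = Path(tile_root)
    count = 0
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in sorted(tile_root.rglob("*")):
            if path.is_file():
                archive.add(path, arcname=path.relative_to(tile_root).as_posix())
                count += 1
    return count
```

Write the archive outside of tile_root, otherwise the partially written archive would be picked up as one of the files to bundle.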

Change 665092 had a related patch set uploaded (by Jgiannelos; owner: Jgiannelos):
[mediawiki/extensions/Kartographer@master] Add ADR for swift storage

https://gerrit.wikimedia.org/r/665092

Change 665092 merged by jenkins-bot:
[mediawiki/extensions/Kartographer@master] Add ADR for swift storage

https://gerrit.wikimedia.org/r/665092