Before moving to production, we should benchmark (and document) the performance of the session storage service under mixed workloads. The results should prove invaluable for validating the implementation, for providing a baseline for subsequent optimizations, and for future capacity planning.
Environment
- Cluster
  - sessionstore1001
  - sessionstore1002
  - sessionstore1003
  - sessionstore2001
  - sessionstore2002
  - sessionstore2003
Each machine is a dual Intel Xeon Silver 4110 2.1 GHz (8C, 16T) with 64 GB RAM, 2 × 128 GB SSDs, and a 1 Gbit NIC.
Every node runs a single instance of Cassandra 3.11.2 (a 6-node cluster). All Cassandra data (commitlog, sstables, etc.) shares a RAID-1 array.
Kask runs in a screen session on sessionstore1001, listening on port 8080.
wrk is executed from sessionstore1002. A Lua script is used to create a randomized mixed workload from a pregenerated JSON-formatted file:
$ wrk --latency -t8 -c2048 -d10m -s multi-request-json.lua http://sessionstore1001.eqiad.wmnet:8080 ...
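The format of the pregenerated file consumed by multi-request-json.lua is not shown here, so the following is only a sketch of how such a file might be produced. It assumes a JSON array of records with hypothetical `method`, `path`, and `body` fields, treats the key and value sizes as character counts, and uses GET for reads and POST for writes; all of these are illustrative assumptions rather than the actual script's input format.

```python
import json
import random
import string


def random_text(length, rng):
    """Return a random alphanumeric string of the given length."""
    alphabet = string.ascii_letters + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def generate_requests(count, key_size, value_size, reads_per_write, seed=0):
    """Build `count` request records with roughly reads_per_write:1 reads to writes.

    The record layout (method/path/body) is hypothetical.
    """
    rng = random.Random(seed)
    # Draw keys from a fixed pool so that reads and writes hit overlapping keys.
    keys = [random_text(key_size, rng) for _ in range(max(1, count // 10))]
    write_probability = 1.0 / (reads_per_write + 1)
    requests = []
    for _ in range(count):
        key = rng.choice(keys)
        if rng.random() < write_probability:
            body = random_text(value_size, rng)
            requests.append({"method": "POST", "path": "/" + key, "body": body})
        else:
            requests.append({"method": "GET", "path": "/" + key, "body": None})
    return requests


def write_requests(path, requests):
    """Serialize the request records to a JSON file."""
    with open(path, "w") as fh:
        json.dump(requests, fh)
```

For example, `write_requests("requests.json", generate_requests(10000, 8, 16, 100))` would produce a file resembling the 8/16, 100:1 workload, with the file name again being a placeholder.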
Results
Threads | Concurrency | Size (k/v) | Ratio (r/w) | Throughput | 50p latency | 99p latency | Errors |
---|---|---|---|---|---|---|---|
8 | 1024 | 8/16 | 100:1 | 52610/s | 20.76ms | 39.01ms | 0 |
8 | 2048 | 8/16 | 100:1 | 71899/s | 37.94ms | 245.18ms | 620 (0.001%) |
8 | 1024 | 32/128 | 100:1 | 52343/s | 21.75ms | 40.50ms | 0 |
8 | 2048 | 32/128 | 100:1 | 67877/s | 38.33ms | 228.67ms | 2160 (0.005%) |
8 | 1024 | 32/2048 | 100:1 | 34641/s | 38.28ms | 140.04ms | 0 |
8 | 2048 | 32/2048 | 100:1 | 33892/s | 48.97ms | 376.76ms | 0 |
8 | 3072 | 32/2048 | 100:1 | 34832/s | 72.68ms | 572.14ms | 773987 (3.7%) |
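The error percentages in the table can be cross-checked from the throughput and error counts. The sketch below assumes that every run lasted the 10 minutes given by `-d10m` in the wrk invocation and that the reported throughput counts all completed requests, including those that returned errors; neither assumption is stated explicitly in the results.

```python
DURATION_S = 10 * 60  # assumed run length, taken from the -d10m flag

# (concurrency, throughput in requests/s, reported errors), copied from the table
ROWS = [
    (1024, 52610, 0),
    (2048, 71899, 620),
    (1024, 52343, 0),
    (2048, 67877, 2160),
    (1024, 34641, 0),
    (2048, 33892, 0),
    (3072, 34832, 773987),
]


def error_rate_percent(throughput, errors, duration_s=DURATION_S):
    """Errors as a percentage of the estimated total request count."""
    total_requests = throughput * duration_s
    return 100.0 * errors / total_requests


if __name__ == "__main__":
    for concurrency, throughput, errors in ROWS:
        rate = error_rate_percent(throughput, errors)
        print(f"c={concurrency} {throughput}/s errors={errors} ({rate:.4f}%)")
```

Under these assumptions the computed rates round to the percentages shown in the Errors column.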
Caveats, Observations & Comments
- Errors in all cases above were the result of Cassandra timeouts
- No effort has (yet) been made to tune Cassandra for this workload
- Kask and wrk are co-located on two of the Cassandra nodes; resource contention between the database, application, and benchmarking tool influences the results
- As noted in the comments, the latency distribution suggests that something adds ~40ms to roughly half of the requests (see Figure 1)
Figure 1: Latency distributions