
Cassandra compaction is getting behind
Closed, Resolved (Public)

Description

Since RESTBase's release, the number of pending compactions in Cassandra has been trending upward. This will eventually cause problems.

Throughput is currently set to the default of 16 MB/s, which is quite conservative for our environment, particularly given our use of Leveled Compaction.

If there are no objections, I'm going to start gradually increasing it (ephemerally, using nodetool), and observe the results.
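
For reference, a change like this can be applied live with nodetool along these lines (a sketch; the target value here is illustrative, the actual increments used are recorded in the comments below):

$ # Check the current setting, then raise it ephemerally (reverts to cassandra.yaml on restart)
$ nodetool -h restbase1001 getcompactionthroughput
$ nodetool -h restbase1001 setcompactionthroughput 24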

cassandra_pending_compactions-00.png (656×853 px, 59 KB)

Event Timeline

Eevans claimed this task.
Eevans raised the priority of this task from to High.
Eevans updated the task description. (Show Details)
Eevans added subscribers: Eevans, fgiunchedi, akosiaris.
$ for host in 1 2 3 4 5 6; do echo -n "restbase100$host: "; nodetool -h restbase100$host getcompactionthroughput; done
restbase1001: Current compaction throughput: 24 MB/s
restbase1002: Current compaction throughput: 24 MB/s
restbase1003: Current compaction throughput: 24 MB/s
restbase1004: Current compaction throughput: 24 MB/s
restbase1005: Current compaction throughput: 24 MB/s
restbase1006: Current compaction throughput: 24 MB/s

fyi you can do for i in {1..6} :)

Compaction throughput is now set to 128 MB/s, but with only 2 compaction threads, actual throughput appears to be limited to roughly 50 MB/s. Pending compactions seem to be leveling off a bit though, so I'm going to leave everything as-is for the evening and look into raising concurrency tomorrow.
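
For the record, a cluster-wide bump like this can be applied with the same loop pattern as above (a sketch, assuming nodetool setcompactionthroughput was run against each node):

$ for host in 1 2 3 4 5 6; do nodetool -h restbase100$host setcompactionthroughput 128; done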

cassandra_pending_compactions-01.png (502×1 px, 54 KB)

After letting it run overnight, it seems that restbase1001-1004 are in OK shape, steadily trending down, but restbase1005 and restbase1006 (the two nodes involved in the recent bootstrap operation) are headed for trouble.

I'll put together a patch for configuring a larger compaction thread pool.
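
For context, the knobs involved live in cassandra.yaml, roughly as follows (a sketch with placeholder values; the actual values are in the Gerrit changes below):

# cassandra.yaml (sketch)
concurrent_compactors: 4                  # size of the compaction thread pool (placeholder)
compaction_throughput_mb_per_sec: 128     # per-node throttle, shared across compaction threads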

cassandra_pending_compactions-02.png (444×1 px, 59 KB)

Change 197911 had a related patch set uploaded (by Eevans):
overrid-able concurrent_compactors setting

https://gerrit.wikimedia.org/r/197911

Change 197915 had a related patch set uploaded (by Eevans):
increase compaction throughput and concurrency

https://gerrit.wikimedia.org/r/197915

Thanks for looking into this! What are reasonable thresholds we should alert on?
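
One possible shape for such a check, parsing the pending-compaction count out of nodetool compactionstats (the threshold value is a hypothetical placeholder, not a recommendation; picking the right number is exactly the open question):

#!/bin/bash
# Hypothetical NRPE-style check: alert when pending compactions exceed a threshold.
THRESHOLD=500   # placeholder value only
pending=$(nodetool compactionstats | awk '/pending tasks:/ {print $3}')
if [ "$pending" -gt "$THRESHOLD" ]; then
  echo "WARNING: $pending pending compactions (threshold $THRESHOLD)"
  exit 1
fi
echo "OK: $pending pending compactions"
exit 0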

Change 197911 merged by Ori.livneh:
overrid-able concurrent_compactors setting

https://gerrit.wikimedia.org/r/197911

Change 197915 merged by Faidon Liambotis:
increase compaction throughput and concurrency

https://gerrit.wikimedia.org/r/197915

Update: pending compactions are now all trending downward. Monitoring will continue.

cassandra_pending_compactions-03.png (500×1 px, 88 KB)

Change 198781 had a related patch set uploaded (by Eevans):
increased compaction concurrency and throughput

https://gerrit.wikimedia.org/r/198781

Change 198781 merged by Gage:
increased compaction concurrency and throughput

https://gerrit.wikimedia.org/r/198781

We also enabled trickle_fsync, which made a big difference to latency under heavy write load by writing changes out continuously rather than waiting for the VM subsystem to flush dirty pages in big bursts. We can now run full compactions with no noticeable request latency impact & iowait limited to around 1%.
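
For reference, the relevant cassandra.yaml settings look roughly like this (a sketch; the interval shown is the Cassandra default, not necessarily what was deployed here):

trickle_fsync: true
trickle_fsync_interval_in_kb: 10240   # fsync incrementally every ~10 MB written, instead of in large bursts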

Overall, it looks like the new settings have given us some headroom to keep up with compactions without affecting latency. However, we are also seeing the typical signs that this much storage per instance is not such a good idea (primarily GC pressure), so we are looking into setting up multiple instances per hardware node. See T93790: Expand RESTBase cluster capacity for a discussion of the options.

@Eevans, should we close this task now, or should we keep it open until we have an alert for the compaction backlog?

@GWicke, I think it should be closed; the original issue is, for all intents and purposes, solved. We can track the threshold alert in T78514, or create a new one.

Eevans updated the task description. (Show Details)
Eevans set Security to None.