We are currently discussing several schema changes that separate permanent storage from temporary storage with a TTL. The motivation is more efficient compaction: Cassandra can drop entire expired SSTables instead of compacting them with other content.
Before we can rely on this, we should establish:
- what the requirements are to benefit from efficient and timely SSTable expiry (for example, per-table TTL vs. per-row TTL), and
- a characterization of a) how timely the expiry process can be, and b) the expected write amplification from intermediate compactions before reaching the (eventually expired) final SSTable.
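
To make the write-amplification question in (b) concrete, here is a minimal sketch of one way it could be quantified, assuming we can measure the bytes flushed from memtables and the bytes written by each intermediate compaction for the same data. The function name and its inputs are hypothetical and not part of any Cassandra API.

```python
from typing import Sequence


def write_amplification(flushed_bytes: int,
                        compaction_bytes_written: Sequence[int]) -> float:
    """Ratio of total bytes written to disk to bytes originally flushed.

    flushed_bytes: bytes written when the data was first flushed (assumed measured).
    compaction_bytes_written: bytes written by each later compaction that
        rewrote this data before its final SSTable expired (assumed measured).
    """
    if flushed_bytes <= 0:
        raise ValueError("flushed_bytes must be positive")
    total_written = flushed_bytes + sum(compaction_bytes_written)
    return total_written / flushed_bytes


# Data rewritten in full by two intermediate compactions before expiry.
print(write_amplification(100, [100, 100]))  # 3.0
# Ideal case: the flushed SSTable expires and is dropped without being rewritten.
print(write_amplification(100, []))  # 1.0
```

Under this definition, a value of 1.0 is the best case, where an SSTable is dropped whole after expiry without ever being compacted.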
With these answers, we should have a better understanding of the performance gains we will get from this setup compared to a mixed-storage setup. Some of our use cases could become feasible with efficient expiring storage, but we need a better quantitative understanding before we can compare Cassandra to alternatives.