DateTieredCompactionStrategy (DTCS) is not working as expected: its optimizations are being defeated in our environment(s) by out-of-order writes (see T126221: Evaluate efficacy of DateTieredCompactionStrategy for background). An alternative to DTCS has emerged in the form of TimeWindowCompactionStrategy (TWCS), which eschews tiering in favor of creating fixed windows of time.
Since time-ordered data models are common in our environment(s), I believe TWCS warrants an investigation.
Status
Tables that have been converted to TWCS to date:
Conversion date | Tables | Graphs |
---|---|---|
2016-10-12 | local_group_wiktionary_T_parsoid_html.data | SSTables/read, SSTable count (large spikes are the result of repair testing) |
2016-10-13 | local_group_wikimedia_T_parsoid_html.data | SSTables/read, SSTable count |
2016-10-19 | local_group_wikipedia_T_parsoid_html.data | SSTables/read, SSTable count |
2016-10-27 | local_group_*_T_mobileapps_{lead,remaining}.data | |
2016-11-07 | local_group_*_T_title__revisions.{data,idx_by_rev_ever} | |
Tombstone GC
One of the primary hopes for TWCS was that, when combined with repair, out-of-order writes could be confined to the STCS-compacted current window. For repairs completed within the compaction_window_size, overlap could be largely eliminated before the outgoing window's major compaction, and droppable tombstones thus kept to a minimum. That may continue to be an option for some subset of tables, but testing conducted as part of T113805 suggests it is unlikely that we have the capacity to complete frequent repairs of more than a subset of our data.
However, the results of user-defined compaction tests that collapse the N oldest windows (starting in the comments here) show significant collection of tombstones. User-defined compactions of this kind would be straightforward to script, have little operational impact, and have the added benefit of bounding the number of SSTables (and as a result, the SSTables/read).
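To illustrate how the candidates for such a compaction might be chosen, here is a minimal Python sketch. It is not the script used in the tests. It assumes each SSTable is assigned to a window by its maximum write timestamp, and that the (file name, max timestamp) pairs have already been gathered by some other means (for example, from SSTable metadata). The function names, file names, and input format are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def window_start(timestamp: int, window_size: int) -> int:
    """Round a timestamp (in seconds) down to the start of its fixed window."""
    return timestamp - (timestamp % window_size)


def oldest_windows(
    sstables: Iterable[Tuple[str, int]], window_size: int, n: int
) -> List[str]:
    """Return the names of the SSTables that fall in the n oldest windows.

    sstables is an iterable of (file name, max timestamp in seconds) pairs;
    this input format is an assumption made for the sake of the example.
    """
    # Group file names by the start of the window they belong to.
    windows: Dict[int, List[str]] = defaultdict(list)
    for name, max_ts in sstables:
        windows[window_start(max_ts, window_size)].append(name)

    # Walk the windows from oldest to newest and keep the first n.
    selected: List[str] = []
    for start in sorted(windows)[:n]:
        selected.extend(sorted(windows[start]))
    return selected


if __name__ == "__main__":
    DAY = 86400  # hypothetical one-day window, in seconds
    example = [
        ("a-Data.db", 10),
        ("b-Data.db", DAY + 5),
        ("c-Data.db", 500),
        ("d-Data.db", 3 * DAY),
    ]
    # Candidate files for collapsing the two oldest windows.
    print(oldest_windows(example, DAY, 2))
```

The resulting list would then be passed to whatever mechanism triggers the user-defined compaction; how that is invoked is left out of this sketch.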
Update: 2016-11-29
A proof-of-concept script has been put in place, running out of my (@Eevans) crontab, that performs the user-defined compactions described above (as a limited trial). We should revisit this in a month or so, and if it continues to produce favorable results (and no better alternative presents itself in the meantime) then we can properly operationalize this.