Our data model for HTML content does not distinguish between low-latency high-volume access to current revisions and long-term archival. This leaves some room for optimization for each of those two use cases.
Compression ratios for HTML content are currently at around 15-16% of the input size. Since the changes between revisions are typically small, ratios in the low single-digit percent range ought to be possible. The main issue preventing this for HTML is currently the deflate window size of 32k, which fails to pick up repetitions between revisions of articles larger than 32k; articles of that size are relatively common. We could add Brotli support to Cassandra to get larger windows, but to exploit this for whole-article storage we would then need to use extremely large input block sizes to pick up a decent number of repetitions. This in turn would likely make reads slower and more memory-intensive.
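The window-size limitation can be illustrated with a small stdlib experiment. Two hypothetical 64k "revisions" differing only in a small edit are compressed as one block: deflate's 32k window never reaches far enough back to see that the second revision repeats the first, while a large-dictionary codec (lzma here, standing in for a larger-window option like Brotli) deduplicates it almost entirely. The revision sizes and edit are invented for the demo.

```python
import os
import zlib
import lzma

# Two hypothetical 64 KiB revisions differing only in one small edit.
# Random bytes make the content incompressible on its own, so any
# gain must come from inter-revision repetition.
rev1 = bytearray(os.urandom(64 * 1024))
rev2 = bytearray(rev1)
rev2[100:110] = b"edited bit"
both = bytes(rev1) + bytes(rev2)

# deflate's 32 KiB window never spans the 64 KiB distance between a
# byte in rev2 and its copy in rev1, so the pair stays almost
# incompressible.
deflated = zlib.compress(both, 9)

# lzma's multi-megabyte dictionary does see the repetition and encodes
# rev2 as a handful of back-references.
lzma_out = lzma.compress(both)

print(len(both), len(deflated), len(lzma_out))
```

On a typical run the deflate output is essentially the full input size, while the lzma output is close to the size of a single revision.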
A better option to reduce storage needs could be to chunk content, ideally in alignment with semantic blocks like top-level sections in HTML. If these chunks are smaller than the compression algorithm's window size (32k for deflate), then it will pick up repetitions between chunks. Additionally, most edits only affect a single chunk in a large document. We can skip adding new versions of unchanged chunks altogether, which also reduces the write load. The first chunk should normally load more quickly than an entire document, reducing the time to first byte. There is also growing demand for section-based content loading at the API level, which can be efficiently supported by storing the content in sections in the first place.
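A minimal sketch of the chunking idea, with content-addressed chunks so unchanged sections are written only once. The section splitter and the in-memory dict standing in for the chunk store are illustrative assumptions; a real implementation would use a proper HTML parser and a Cassandra-backed store.

```python
import hashlib

def chunk_by_sections(html):
    # Naive top-level section splitter (assumption): splits before each
    # <section> tag. Real HTML would need a proper parser.
    parts = html.split("<section")
    return [parts[0]] + ["<section" + p for p in parts[1:]]

def store_revision(html, chunk_store):
    """Store a revision as a manifest of chunk hashes. Only chunks not
    already present are written; unchanged chunks are shared with
    earlier revisions, reducing write load."""
    manifest = []
    for chunk in chunk_by_sections(html):
        h = hashlib.sha256(chunk.encode()).hexdigest()
        if h not in chunk_store:   # skip the write for unchanged chunks
            chunk_store[h] = chunk
        manifest.append(h)
    return manifest

# Usage: the second revision edits only the last section, so only one
# new chunk is stored.
store = {}
m1 = store_revision("<p>lead</p><section>A</section><section>B</section>", store)
m2 = store_revision("<p>lead</p><section>A</section><section>B2</section>", store)
print(len(store), m1[:2] == m2[:2])
```

The manifest also gives section-level reads for free: serving the first chunk only requires fetching one small blob.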
The deflate / gzip window size of 32k is likely smaller than what we'd pick for an optimal trade-off between the number of IOs and the compression block size needed to pick up a decent number of repetitions. Adding Brotli support in Cassandra would give us a wider range of options for this trade-off. However, we can get started with deflate and leverage Brotli in a later iteration.
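The effect of window size on this trade-off can be shown with deflate itself, whose window is configurable (downward) via the `wbits` parameter; Brotli extends the same dial upward, to windows as large as 16 MiB. In this sketch the payload's only redundancy is an 8k block repeated four times, so a 4k window misses every match while the full 32k window catches them all. Block sizes are chosen for the demo.

```python
import os
import zlib

# A 32 KiB payload whose only redundancy is one 8 KiB random block
# repeated four times, so all matches sit at an 8 KiB distance.
block = os.urandom(8 * 1024)
data = block * 4

def deflate(payload, wbits):
    # wbits selects the deflate window size: 2**wbits bytes.
    co = zlib.compressobj(9, zlib.DEFLATED, wbits)
    return co.compress(payload) + co.flush()

small = deflate(data, 12)   # 4 KiB window: repeats are out of reach
large = deflate(data, 15)   # 32 KiB window: repeats are found

print(len(data), len(small), len(large))
```

The small-window output is essentially uncompressed, while the large-window output approaches the size of a single 8k block plus back-references.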
Another consideration is a separation of hot from cold storage, so that we can replicate hot data to the edge, but keep cold archival data only in two DCs and possibly on more density-optimized hardware. We can do this relatively easily by storing current revisions in a key-value bucket in addition to archival storage. Within this bucket, we can store each revision as an individually gzip-compressed blob, ready to be streamed to the client without any de/re-compression overheads. The main gains in this scheme should come from the lack of extra reads and computation in Cassandra, as well as avoiding the need to compress data on the way out.
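A sketch of the pre-compressed hot bucket, assuming clients accept gzip so the stored bytes can be streamed as-is with a `Content-Encoding: gzip` header. The dict bucket and the key format are placeholders for the real key-value store.

```python
import gzip

def store_current(bucket, key, html):
    # Compress once at write time; reads then stream these exact bytes.
    bucket[key] = gzip.compress(html, compresslevel=6)

def serve_current(bucket, key, accepts_gzip):
    blob = bucket[key]
    if accepts_gzip:
        # No de/re-compression on the read path: stream stored bytes.
        return blob, {"Content-Encoding": "gzip"}
    # Fallback for clients without gzip support.
    return gzip.decompress(blob), {}

# Usage with a hypothetical key naming scheme.
bucket = {}
store_current(bucket, "enwiki:Foo:1234", b"<html>...</html>")
body, headers = serve_current(bucket, "enwiki:Foo:1234", accepts_gzip=True)
```

The read path for the common case is a single key-value lookup plus a byte copy, which is where the savings over decompress-then-recompress come from.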