As a sub-task of T120171, this task discusses steps towards storing only current revisions in a reliable, low-maintenance and low-latency manner.
## Option 1: Avoid tombstones with separate current revision & ttl tables
### Table schemas
```lang=sql
CREATE TABLE latest.data (
"_domain" varchar,
title varchar,
revision int,
tid timeuuid,
html text,
data_parsoid text,
section_offsets text,
PRIMARY KEY ("_domain", title)
);
CREATE TABLE edit_checkout.data (
“_domain” varchar,
time_window bigint,
title varchar,
revision int,
tid timeuuid,
html text,
data_parsoid text,
section_offsets text,
PRIMARY KEY (("_domain", time_window, title, revision), tid)
);
```
### Algorithm
Latest content is always overwritten in the `latest` table with no TTL. To protect against race conditions, the Cassandra write timestamp is set to the time of the edit (delivered to us in the `If-Modified-Since` header). That timestamp has a one-second resolution, so we can numerically add the revision number to the write time for additional protection against possible races.
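The write described above could be sketched in CQL roughly as follows (domain, title, and revision values are illustrative; the timestamp is the edit time converted to microseconds with the revision number added, precomputed by the storage layer since CQL itself does not do the arithmetic):

```lang=sql
-- Sketch: overwrite the single latest row for this title.
-- The explicit write timestamp = edit time in µs + revision number,
-- so a racing write for an older revision can never win over a newer one.
INSERT INTO latest.data ("_domain", title, revision, tid, html, data_parsoid, section_offsets)
VALUES ('en.wikipedia.org', 'Foo', 12345, now(), '<html>new</html>', '{}', '{}')
USING TIMESTAMP 1470000000012345;  -- 1470000000000000 (edit time, µs) + 12345 (revision)
```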
When an update arrives and a new render becomes the latest one, the following procedure is applied:
- Current latest content is copied to the `edit_checkout` table with a TTL of 24 hours.
- The new render (revision) is written to the `latest` table, overwriting the previous one.
This ensures that any ongoing edits that were using the previous content of the `latest` table will succeed because the content they depend on is stored for another 24 hours.
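Under the schemas above, the two-step procedure might look like this (all values, including the `time_window` bucketing, are illustrative assumptions):

```lang=sql
-- Step 1: preserve the render that is about to be replaced, with a 24h TTL,
-- so in-flight edits that depend on it can still complete.
INSERT INTO edit_checkout.data
  ("_domain", time_window, title, revision, tid, html, data_parsoid, section_offsets)
VALUES
  ('en.wikipedia.org', 1470009600, 'Foo', 12344,
   50554d6e-29bb-11e5-b345-feff819cdc9f, '<html>old</html>', '{}', '{}')
USING TTL 86400;

-- Step 2: overwrite the single latest row for this title with the new render.
INSERT INTO latest.data
  ("_domain", title, revision, tid, html, data_parsoid, section_offsets)
VALUES
  ('en.wikipedia.org', 'Foo', 12345,
   60664d6e-29bb-11e5-b345-feff819cdc9f, '<html>new</html>', '{}', '{}');
```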
If an edit is made to an older revision, we check whether the `edit_checkout` table contains that revision and, if so, renew its TTL. If the older revision is not in storage, it is re-generated by Parsoid and stored in the `edit_checkout` table.
### Implementation considerations
This could effectively be implemented as a revision retention policy within the `restbase-mod-table-cassandra` module, with three modes of operation:
- If `grace_ttl=0`, it works as key-value storage that always overwrites with the newer content; we only create the `latest` table and skip the checkout table.
- If the policy is TTL-only, we only create the checkout table.
- Mixed mode: we create both tables.
If all use cases for a revision retention policy fit into these three options, we can completely remove the revision retention policy we have right now.
### Open questions
- Should HTML and data-parsoid be stored together or in separate tables? What are the performance implications, and what is the complexity overhead of separating them?
- Should we just set the TTL globally on the `edit_checkout` table?
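If a global TTL is acceptable, Cassandra supports setting it as a table-level property instead of per write; a sketch:

```lang=sql
-- Table-level default TTL: every write to edit_checkout.data expires after
-- 24 hours unless an explicit USING TTL on the write overrides it.
ALTER TABLE edit_checkout.data WITH default_time_to_live = 86400;
```

This would simplify the write path (no `USING TTL` needed on each insert), at the cost of making per-item TTL renewal the only way to deviate from the default.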
## Option 2 (discarded): Keep current schema, change retention policy
### Design changes from current schema
- Switch to the "latest hash" revision retention policy, which keeps only one render of the latest revision. While the current implementation is already quite close to what we need, we would need to make some changes to this policy:
- Use TTLs instead of plain deletions whenever grace_ttl is > 0.
- The regular "latest hash" policy still keeps outdated renders for a TTL, to guarantee that edits can complete. There is, however, still a chance that edits based on an almost-expired entry would fail due to expiry. To avoid this, we can consider adding "TTL extension" logic, which would re-write such content on read access to make sure that edits based on TTLed content will complete. This could be driven by a `grace_ttl_renew_threshold` configuration variable, which would trigger rewrites with a new TTL lease for items that have less remaining TTL than this value. To communicate whether something is a bona fide stored item or a TTLed one, TTLed responses could expose their remaining lifetime in a header (`cache-control`, for example). Higher-level modules like Parsoid could then use this to request corresponding content (like data-parsoid) when needed, ensuring its TTL extension before returning old HTML. As a result, edits based on old revisions still in TTLed storage are also guaranteed to finish.
- Optimize the compaction strategy for the new update pattern. With the small dataset (<500G total) and a different update pattern, TWCS (TimeWindowCompactionStrategy) might not be the best choice.
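The proposed `grace_ttl_renew_threshold` check could be sketched as follows: read the remaining TTL alongside the content, and re-write the row with a fresh lease when it drops below the threshold. Table and column names here are hypothetical stand-ins for the current schema, and the threshold value is illustrative:

```lang=sql
-- Hypothetical table/column names. Read the remaining TTL with the content:
SELECT html, TTL(html) AS remaining_ttl
FROM "html_table"
WHERE "_domain" = 'en.wikipedia.org' AND title = 'Foo' AND revision = 12344;

-- If remaining_ttl < grace_ttl_renew_threshold (e.g. 3600 seconds),
-- the storage layer re-writes the row with a fresh TTL lease:
UPDATE "html_table" USING TTL 86400
SET html = '<html>old</html>'
WHERE "_domain" = 'en.wikipedia.org' AND title = 'Foo' AND revision = 12344;
```

Note that `USING TTL` on an `UPDATE` only refreshes the TTL of the columns being set, so all content columns the edit depends on would need to be re-written together.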
### Migration strategy
* Enable content versioning filter: Re-renders all content that is not up to spec.
* Truncate html / data-parsoid tables
* Truncate HTML first, to avoid data-parsoid loss issues. However, loss of original HTML would still be an issue for selective serialization.
* Currently, we most likely return errors when original HTML is missing, which means that outstanding edits could not be saved.
* Temporarily change this behavior to ignore loss of original HTML and accept a few dirty diffs. The number of affected edits is likely small.
* Alternative: Disable VE during transition. Would have to be for longer than most edits, though (hours?).
* Enable "latest hash" revision retention policy with TTL extension logic.
* Adjust compaction strategy.
* Collect data on performance over the next months. This should give us a decent idea of "current" table performance.
See [this document](https://docs.google.com/document/d/1qd8XilG5Jt0TRm5mMEokCG6d0_DkyReCi85KKlh-i8c/edit#heading=h.m7ioe5euvcac).
## See also
- {T156209}