The current mechanism for cleaning up older renders and revisions in the multi-content bucket has a flaw when the index TTL is too short. Here is an idea for a new method. The description covers only the revision timeline, but the render timeline works exactly the same way.
Currently, when writing a new revision we make a request to a revision-timeline like this:
SELECT * FROM revision-timeline WHERE _domain = <domain> AND key = <page_title> AND ts <= (Date.now() - grace_ttl) LIMIT 1;
If this query returns a 404, that means either the relevant index entries have expired or no revision has yet fallen out of the window of recency. This creates a problem: large numbers of superseded revisions can slip through cleanup entirely if they are edited in bulk within the grace_ttl period.
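To illustrate the failure mode, here is a minimal sketch. The in-memory `timeline` list, the `current_query` helper, and all concrete values are hypothetical stand-ins for the real index table:

```python
GRACE_TTL = 86400  # illustrative value, one day in seconds

# Hypothetical stand-in for the revision-timeline index:
# a list of (ts, rev) entries kept newest-first (the current indexing order).
timeline = [
    (1000.0, 7),
    (999.0, 6),
    (998.0, 5),  # revisions 5-7 were all edited in a burst within grace_ttl
]

def current_query(timeline, now, grace_ttl):
    """Mimics: SELECT ... WHERE ts <= (now - grace_ttl) LIMIT 1.
    Returns the newest entry older than the recency window,
    or None (the 404 case)."""
    cutoff = now - grace_ttl
    for ts, rev in timeline:  # newest-first scan
        if ts <= cutoff:
            return (ts, rev)
    return None

# With a burst of edits inside the window, the query finds nothing,
# so the superseded revisions 5 and 6 never get cleaned up.
assert current_query(timeline, now=1010.0, grace_ttl=GRACE_TTL) is None
```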
Proposal: change the query to
SELECT * FROM revision-timeline WHERE _domain = <domain> AND key = <page_title> AND ts >= (Date.now() - grace_ttl) LIMIT 1;
(Note the flipped inequality: it was less-than-or-equal, now it is greater-than-or-equal.)
(NOTE that this will require a change in the revision-timeline indexing order!)
This gives us the oldest revision within the current window of recency. If the query returns a result, any revision numbered STRICTLY LESS than that oldest in-window revision can be deleted, so we issue a range delete with a strict <.
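A sketch of the proposed path, assuming the same hypothetical in-memory representation as above (entries are `(ts, rev)` tuples; the helper names are made up for illustration):

```python
def oldest_in_window(timeline, now, grace_ttl):
    """Mimics the proposed query:
    SELECT ... WHERE ts >= (now - grace_ttl) LIMIT 1
    against an oldest-first index: returns the oldest entry still inside
    the window of recency, or None (the 404 case)."""
    cutoff = now - grace_ttl
    for ts, rev in sorted(timeline):  # oldest-first iteration order
        if ts >= cutoff:
            return (ts, rev)
    return None

def cleanup(timeline, now, grace_ttl):
    """Delete every revision strictly older than the oldest in-window one."""
    hit = oldest_in_window(timeline, now, grace_ttl)
    if hit is None:
        return timeline  # 404 case: handled by the prev_latest_rev path
    _, oldest_rev = hit
    # Range delete with a strict <: keep only rev >= oldest_rev.
    return [(ts, rev) for ts, rev in timeline if rev >= oldest_rev]

# Revisions 5-7 were edited in a burst inside the window; revision 2 is old.
timeline = [(100.0, 2), (998.0, 5), (999.0, 6), (1000.0, 7)]
assert cleanup(timeline, now=1000.0, grace_ttl=10) == [
    (998.0, 5), (999.0, 6), (1000.0, 7)]
```

Unlike the current scheme, the burst of in-window edits no longer blocks cleanup: revision 2 is dropped even though revisions 5 through 7 are all younger than grace_ttl.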
If the query returns a 404, the index has expired and the revision we are writing right now is the first one after a long period of inactivity. In that case, every revision prior to the previously latest one was superseded more than index_ttl ago, so it can be deleted. Since, on writing a new revision, we already read the previously latest revision, we know prev_latest_rev and can issue a range delete dropping everything prior to it.
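The 404 branch can be sketched similarly; `prev_latest_rev` is the revision number already read during the write, and the in-memory representation is again a hypothetical stand-in:

```python
def cleanup_after_expiry(timeline, prev_latest_rev):
    """404 case: the index has expired, so anything superseded before the
    previously latest revision was superseded more than index_ttl ago.
    Issue a range delete dropping every revision strictly below
    prev_latest_rev; the previously latest revision itself is kept."""
    return [(ts, rev) for ts, rev in timeline if rev >= prev_latest_rev]

timeline = [(10.0, 1), (20.0, 2), (30.0, 3)]
assert cleanup_after_expiry(timeline, prev_latest_rev=3) == [(30.0, 3)]
```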
The render timeline will work exactly the same.