As a sub-task of T120171, this task discusses steps towards storing current revisions only, in a reliable, low-maintenance, and low-latency manner.
## Option: Retention policies using application-level TTLs {icon star spin color=blue}
This approach uses a schema identical to that of the current storage model, one that utilizes so-called wide rows to model a one-to-many relationship between a title and its revisions, and a one-to-many relationship between each revision and its corresponding renders. It differs only in how it approaches retention.
Since renders are keyed on a type-1 UUID, retaining a single current render, plus (at least) 24 hours' worth of past renders, is as simple as batching a range delete with each new render, using a `tid` predicate 24 hours earlier than the `tid` being inserted.
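Deriving such a predicate amounts to constructing the smallest possible type-1 UUID for a timestamp 24 hours in the past (the value Cassandra's `minTimeuuid()` function produces). A minimal Python sketch, building the UUID directly from its fields:

```python
import uuid
from datetime import datetime, timedelta, timezone

# Offset between the UUID epoch (1582-10-15) and the Unix epoch,
# expressed in 100-nanosecond intervals.
UUID_EPOCH_OFFSET = 0x01B21DD213814000

def min_timeuuid(dt):
    """Smallest type-1 UUID for the given datetime.

    Suitable as the lower bound of a `tid` range-delete predicate
    (analogous to Cassandra's minTimeuuid()).
    """
    # 60-bit timestamp: 100-ns intervals since the UUID epoch.
    ts = int(dt.timestamp() * 10_000_000) + UUID_EPOCH_OFFSET
    time_low = ts & 0xFFFFFFFF
    time_mid = (ts >> 32) & 0xFFFF
    time_hi_version = ((ts >> 48) & 0x0FFF) | 0x1000  # version 1
    # Minimal clock sequence and node, so this sorts before any
    # real timeuuid carrying the same timestamp.
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             0x80, 0x00, 0x0))

# Predicate for "delete renders more than 24 hours older than this insert":
now = datetime.now(timezone.utc)
threshold = min_timeuuid(now - timedelta(hours=24))
# e.g. DELETE ... WHERE title = ? AND tid < <threshold>
```

Note that the time-first ordering is a property of Cassandra's `timeuuid` comparator; comparing these UUIDs lexicographically as bytes does not order them by timestamp.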
Limiting revisions is slightly more challenging, since the revision is an integer and carries no temporal context. As a result, additional storage is used to establish this relationship, mapping timestamps to corresponding revisions. Records in this timeline are keyed by domain (on the assumption that MediaWiki sharding would never be more granular than this). Updates to the timeline can be performed probabilistically, if necessary, and TTLs can be applied to prevent unbounded growth.
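A sketch of what a probabilistic timeline update might look like; the `revision_timeline` table name, `SAMPLE_RATE`, and TTL value are illustrative assumptions, not part of the design:

```python
import random
import time

# Hypothetical timeline mapping timestamps to revisions, keyed by
# domain. Sampling keeps the write amplification low; the timeline
# only needs enough resolution to answer "which revisions are older
# than time T", so it can afford to be sparse.
SAMPLE_RATE = 0.1          # record roughly 1 in 10 updates
TIMELINE_TTL = 90 * 86400  # seconds; bounds the timeline's growth

def record_revision(session, domain, revision, ts=None):
    """Probabilistically record a (timestamp -> revision) data point."""
    if random.random() >= SAMPLE_RATE:
        return False  # skip this update; the timeline stays sparse
    ts = ts if ts is not None else int(time.time())
    session.execute(
        "INSERT INTO revision_timeline (domain, ts, rev) "
        "VALUES (%s, %s, %s) USING TTL %s",
        (domain, ts, revision, TIMELINE_TTL))
    return True
```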
See https://www.mediawiki.org/wiki/RESTBase/StorageDesign#Retention_policies_using_application-level_TTLs for a more thorough explanation.
## Option: Separate TTL table with range deletes on latest content
This option is very similar to the previous one, but it avoids the need for a timeline index. It uses two tables with schemas identical to the one used now: a `latest` table with no TTL on the data, and a `ttl` table with a table-level TTL of 24 hours.
On read, the data is first read from the `latest` table, falling back to the `ttl` table, and finally to generating it on demand. If the data was generated on demand, it is written to both the `latest` and `ttl` tables. If the data was found only in the `ttl` table, its TTL may be refreshed if it is about to expire (refreshing simply writes the data to the `ttl` table again, potentially with a new `tid`).
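The read path above can be sketched as follows; the table accessors (`get`, `get_with_ttl`, `put`) and the refresh threshold are hypothetical names used for illustration:

```python
def read_render(latest, ttl, render_fn, key, ttl_refresh_threshold=3600):
    """Read path: `latest` table, then `ttl` table, then re-render.

    `latest` and `ttl` are table accessors; `render_fn` regenerates
    the content on demand. Names are illustrative.
    """
    value = latest.get(key)
    if value is not None:
        return value

    value, remaining_ttl = ttl.get_with_ttl(key)
    if value is not None:
        # Refresh the TTL if the row is about to expire.
        if remaining_ttl < ttl_refresh_threshold:
            ttl.put(key, value)
        return value

    # Last resort: regenerate, then write back to both tables.
    value = render_fn(key)
    latest.put(key, value)
    ttl.put(key, value)
    return value
```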
On update, the following algorithm is applied:
- Read the current latest render from the `latest` table
- Write it to the `ttl` table
- Write the new latest render to the `latest` table
- Write the new latest render to the `ttl` table as well (this avoids a race condition: if two concurrent updates both read the same render and then both write new renders, one of the new renders would otherwise never be stashed in the `ttl` table)
- Apply range deletes for previous renders of the revision being written, and for previous revisions (if the `latest_hash` revision policy is used)
No explicit deletes are made in the `ttl` table; unbounded growth is prevented by the table-level TTL.
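The update steps above can be sketched as follows (the table accessors and `delete_range` are illustrative, not a real driver API):

```python
def update_render(latest, ttl, key, new_tid, new_value):
    """Update path for the two-table scheme; names are illustrative.

    Writing the *new* render to the `ttl` table as well closes the
    race where two concurrent updates read the same old render and
    one of the two new renders would otherwise never be stashed.
    """
    # 1. Read the current latest render from the `latest` table.
    old = latest.get(key)
    # 2. Stash it in the `ttl` table.
    if old is not None:
        ttl.put(key, old)
    # 3. Write the new latest render to the `latest` table.
    latest.put(key, (new_tid, new_value))
    # 4. Write it to the `ttl` table too (race-condition guard).
    ttl.put(key, (new_tid, new_value))
    # 5. Range-delete older renders in the `latest` table; the
    #    `ttl` table is never explicitly deleted from.
    latest.delete_range(key, before_tid=new_tid)
```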
Open questions:
- How would the `ttl` table behave under this scheme? It receives as many writes as the `latest` table, and those writes might arrive out of order (in the case of an edit to an arbitrary old revision)
## Option: Table-per-query
This approach materializes views of results using distinct tables, each corresponding to a query.
### Queries / Tables
- The most current render of the most current revision (table: `current`)
- The most current render of a specific revision (table: `by_rev`)
- A specific render of a specific revision (table: `by_tid`)
### Algorithm
Data in the `current` table must be durable, but the contents of `by_rev` and `by_tid` can be ephemeral (indeed, they should be, to prevent unbounded growth), lasting only for a time-to-live after the corresponding value in `current` has been superseded by something more recent. There are two ways of accomplishing this: either a) copying the values on a read from `current`, or b) copying them on update, prior to replacing a value in `current`. Neither of these strategies is ideal.
For example, with non-VE use-cases, copy-on-read is problematic due to the write amplification it creates (think: HTML dumps). Additionally, in order to fulfill the VE contract, the copy //must// be done in-line to ensure the values are present for the forthcoming save, introducing additional transactional complexity and latency. Copy-on-update over-commits by default, copying from `current` for every new render regardless of the probability that it will be edited, but it happens asynchronously, without impacting user requests, and can be done reliably. This proposal uses the //copy-on-update// approach.
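A minimal sketch of the copy-on-update step, assuming simple key-value accessors for the three tables (names and tuple layout are illustrative; in practice the copy would happen asynchronously):

```python
def replace_current(current, by_rev, by_tid, title, new_render, copy_ttl):
    """Copy-on-update: before replacing the row in `current`, copy the
    superseded value into the ephemeral `by_rev` / `by_tid` tables
    with a TTL. Table and field names are illustrative.
    """
    old = current.get(title)
    if old is not None:
        rev, tid, body = old
        # Shown inline for clarity; asynchronous in practice.
        by_rev.put((title, rev), (tid, body), ttl=copy_ttl)
        by_tid.put((title, rev, tid), body, ttl=copy_ttl)
    current.put(title, new_render)
```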
See https://www.mediawiki.org/wiki/RESTBase/StorageDesign#Table-per-query and [this document](https://docs.google.com/document/d/1Tvk1hZAiGiyk7881-wp7eAD0tiRLRbd3nzoucod3rgA/edit#heading=h.3zn4dovapypj) for details.
____
## See also
- {T156209}