To avoid dirty diffs in visual editing, RESTBase needs to make sure that the data-parsoid corresponding to the exact render of the Parsoid HTML an edit is based on is still present in storage when the edit is completed. Failing to do so results in broken edits or dirty diffs. To achieve that, RESTBase employs very complicated storage semantics: all Parsoid content is pre-rendered, and superseded renders of the content are guaranteed to be preserved for a grace_ttl period of time. This code is very complicated and creates a lot of IO overhead, since special indexes have to be maintained for renders and revisions.
To simplify the system, I propose changing the approach and stashing 'data-parsoid' and the original HTML in a temporary stashing table with a TTL. This will create 2 additional writes on each read of the HTML, so it should not be done universally. Instead, VisualEditor should supply an additional query parameter indicating that the transform endpoints will later be used on the HTML. Additionally, the HTML provided to VE must not be cached in Varnish.
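To make the proposal concrete, here is a minimal sketch of the stashing flow under stated assumptions. It is not RESTBase code: `StashTable`, `render`, `get_html` and `load_for_transform` are hypothetical names, an in-memory dict stands in for the stashing table, and a boolean `stash` argument stands in for the query parameter, whose name has not been decided.

```python
import itertools
from typing import Callable, Dict, Optional, Tuple


class StashTable:
    """In-memory stand-in for a storage table whose rows expire after a TTL."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float]) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injected so expiry can be exercised deterministically
        self.writes = 0     # counts writes, to show the cost of stashing
        self._rows: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def put(self, render_id: str, kind: str, value: str) -> None:
        self._rows[(render_id, kind)] = (self.clock() + self.ttl_seconds, value)
        self.writes += 1

    def get(self, render_id: str, kind: str) -> Optional[str]:
        row = self._rows.get((render_id, kind))
        if row is None:
            return None
        expires_at, value = row
        if self.clock() >= expires_at:
            # The row has outlived its TTL: drop it and report a miss.
            del self._rows[(render_id, kind)]
            return None
        return value


_render_ids = itertools.count(1)


def render(title: str) -> Tuple[str, str, str]:
    """Hypothetical stand-in for a Parsoid render: (render_id, html, data_parsoid)."""
    render_id = f"render-{next(_render_ids)}"
    return render_id, f"<p>{title}</p>", '{"ids": {}}'


def get_html(title: str, stash: bool, table: StashTable) -> Tuple[str, str]:
    """Serve HTML; stash HTML and data-parsoid only when the caller asks for it."""
    render_id, html, data_parsoid = render(title)
    if stash:
        # The 2 additional writes: one for the HTML, one for data-parsoid.
        table.put(render_id, "html", html)
        table.put(render_id, "data-parsoid", data_parsoid)
    return render_id, html


def load_for_transform(render_id: str, table: StashTable) -> Tuple[str, str]:
    """Fetch the exact render an edit was based on, or fail if it has expired."""
    html = table.get(render_id, "html")
    data_parsoid = table.get(render_id, "data-parsoid")
    if html is None or data_parsoid is None:
        raise LookupError(f"render {render_id} is not stashed or has expired")
    return html, data_parsoid
```

In this sketch an ordinary read costs no writes, while a read with `stash=True` costs exactly two, and a transform issued after the TTL has elapsed fails instead of silently using a different render. How the real system should behave on such a miss is an open question that the sketch does not settle.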
This task was created to discuss this alternative approach and to collect data on how important Varnish caching is for the VE use case (how much not caching would slow it down), how large an IO win or loss we are looking at, and how significant the slowdown from the 2 additional writes will be, and then to decide whether to pursue this path.