
Prototype and test range delete-based current revision storage
Closed, Resolved (Public)

Description

Of the alternatives discussed for managing retention of current revision storage, the use of range deletes seems most promising. This issue will track the prototyping and testing of this design.

Outline

  • Deploy change-propagation to the dev environment, configured to run in test_mode, and with sampling enabled
  • Implement a time-line storage endpoint in RESTBase using a key-value table
  • Configure change-propagation to send updates to the time-line for new revisions
  • Implement alternative RESTBase retention policy incorporating range deletes
  • Collect data, test, etc. (in progress: Grafana, Kibana (RESTBase), Kibana (Cassandra))
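
For illustration, the core of a range delete-based retention policy can be reduced to a single CQL statement that removes every revision of a page below some cutoff. The sketch below only builds that statement. The keyspace, table, and column names ("key", "rev") are hypothetical placeholders rather than RESTBase's actual schema, and it assumes "rev" is a clustering column on a Cassandra version that accepts inequalities in DELETE (3.0 or later, to the best of my knowledge).

```python
def range_delete_cql(keyspace: str, table: str) -> str:
    """Build a CQL range delete that removes every revision below a cutoff.

    The column names ("key", "rev") are hypothetical placeholders, not
    RESTBase's actual schema. The two "?" bind markers stand for the page
    key and the cutoff revision.
    """
    return (
        f'DELETE FROM "{keyspace}"."{table}" '
        'WHERE "key" = ? AND "rev" < ?'
    )


# One statement (and so one range tombstone) covers all older revisions
# of a page, instead of one delete per revision.
statement = range_delete_cql("example_keyspace", "example_table")
```

The point of the design is that the cost of culling does not grow with the number of superseded revisions: a single range tombstone shadows all of them.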

Event Timeline

Eevans updated the task description.

OK, Change Propagation is alive and kicking on restbase-dev1003 with 5% sampling. It is running as a systemd service under my user (not ideal, but better than running as root).

The time-line table has been set up in the Dev Cluster, and restbase-dev1003 has been configured to accept POST requests on an internal domain. Change Propagation on the same node sends it updates on every new revision.
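
As a rough illustration of what percentage sampling of events can look like, here is a minimal sketch. How change-propagation actually implements sampling is not described in this task, so the hash-based approach and the names `is_sampled` and `event_key` are assumptions made purely for the example.

```python
import zlib


def is_sampled(event_key: str, rate: float) -> bool:
    """Deterministically select roughly `rate` of all keys.

    Hypothetical helper: hashing the key (rather than drawing a random
    number) means a given key is consistently in or out of the sample.
    """
    # Map the key onto one of 10,000 buckets, then keep the lowest ones.
    bucket = zlib.crc32(event_key.encode("utf-8")) % 10_000
    return bucket < rate * 10_000


# With rate=0.05, roughly 5% of distinct keys would be processed.
process_event = is_sampled("en.wikipedia.org/Example", 0.05)
```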

Mentioned in SAL (#wikimedia-operations) [2017-05-16T20:10:47Z] <mobrovac@tin> Started restart [restbase/deploy@d98af6f] (dev-cluster): Apply the revision range delition algorithm - T164865

Mentioned in SAL (#wikimedia-operations) [2017-05-16T20:22:37Z] <mobrovac@tin> Started restart [restbase/deploy@d98af6f] (dev-cluster): Apply the revision range deletion algorithm, take 2 - T164865

Eevans updated the task description.

Current Status:

  • An instance of change-prop is running on restbase-dev1003 in test_mode, with 15% sampling
  • An instance of change-prop is running on restbase-test2003 in test_mode, with 15% sampling (it sends to restbase-dev100x and processes re-renders)
  • An experimental branch of RESTBase has been deployed to the dev env
    • Adds support for a RESTBase k/v-based time-line store (which change-prop is updating)
    • Changes Parsoid tables to use latest_hash retention policy
    • Uses an experimental branch of restbase-mod-table-cassandra that implements a 1-hour TTL on revisions using range deletes for latest_hash
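
One plausible reading of how the time-line and the 1-hour TTL fit together can be sketched as follows. This is not the code from the experimental branch: the function name, the in-memory list standing in for the time-line store, and the exact eligibility rule are all assumptions. The idea is that a revision becomes eligible for deletion once it was superseded at least one TTL ago, and the time-line tells us when each revision was written.

```python
from typing import List, Optional, Tuple

TTL_SECONDS = 3600  # the 1-hour TTL mentioned above

# Hypothetical time-line for one page: (unix_timestamp, revision), oldest first.
Timeline = List[Tuple[float, int]]


def cutoff_revision(timeline: Timeline, now: float,
                    ttl: float = TTL_SECONDS) -> Optional[int]:
    """Return the revision below which everything may be range-deleted.

    The newest entry written at or before `now - ttl` was still the
    current revision at that moment, so it is kept. Every revision below
    it was superseded at least `ttl` seconds ago and is eligible.
    Returns None when no entry is old enough.
    """
    deadline = now - ttl
    cutoff = None
    for timestamp, revision in timeline:
        if timestamp <= deadline:
            cutoff = revision
        else:
            break  # entries are ordered, so nothing later can qualify
    return cutoff


# Revision 100 was superseded at t=1000, which is 4000s before now=5000,
# so everything below 101 is eligible; 101 and 102 are retained.
example = cutoff_revision([(0, 100), (1000, 101), (5000, 102)], now=5000)
```

The returned value would be the upper bound of a range delete of the form "rev < cutoff", so the revision that was current one TTL ago, and everything newer, stays readable.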

Next up: Experimental support for range delete-based retention of renders

Mentioned in SAL (#wikimedia-operations) [2017-05-17T18:54:34Z] <urandom> T164865: restarting RESTBase in dev env to apply range-delete probability bug-fix

Mentioned in SAL (#wikimedia-operations) [2017-05-17T22:00:10Z] <urandom> T164865: altering compaction strategy to sizetiered, local_group_wikipedia_T_parsoid_html.data (in RESTBase dev)

Mentioned in SAL (#wikimedia-operations) [2017-05-18T19:06:31Z] <urandom> T164865: configure RESTBase tables for size-tiered compaction (dev env only)

Mentioned in SAL (#wikimedia-operations) [2017-05-18T19:52:38Z] <urandom> T164865: restarting RESTBase-dev to apply range delete-based render retention

Mentioned in SAL (#wikimedia-operations) [2017-05-18T20:03:51Z] <urandom> T164865: restarting RESTBase-dev, range delete-based render retention

Mentioned in SAL (#wikimedia-operations) [2017-05-24T18:00:41Z] <urandom> T164865: Upgrading Cassandra from 3.7.3-instaclustr to 3.10

Mentioned in SAL (#wikimedia-operations) [2017-05-24T21:53:35Z] <urandom> T164865: Disabling range delete-based render culling, dev env

Mentioned in SAL (#wikimedia-operations) [2017-05-25T16:27:11Z] <urandom> T164865: RESTBase dev, re-enable revision range deletes

Mentioned in SAL (#wikimedia-operations) [2017-05-25T18:27:10Z] <urandom> T164865: RESTBase dev, re-enable render range deletes

Mentioned in SAL (#wikimedia-operations) [2017-05-25T20:30:16Z] <urandom> T164865: RESTBase dev, disable revision range deletes

Mentioned in SAL (#wikimedia-operations) [2017-06-13T14:10:52Z] <urandom> T164865: Restart RESTBase dev; apply range delete probability of 1.0

@Eevans I think this can be resolved given that we've put our fish in the range-delete basket?

Eevans updated the task description.

@Eevans I think this can be resolved given that we've put our fish in the range-delete basket?

Indeed.