As a product manager, I want a single source of truth for all current candidates for data deletion from Blazegraph in the case of catastrophic failure, so that I can prioritize what is deleted.
So far we've investigated several candidates for deleting Wikidata data from Blazegraph in the case of a catastrophic failure, e.g. lexemes, scholarly articles, labels. These have been documented across various analysis write-ups, but it would be helpful to have them in a single table so they can be compared at a glance.
- create a table including all current deletion candidates we've looked into so far
- each candidate should include (approximations ok): number/% of entities, number/% of triples, number of days for Blazegraph to recover at current rate of growth, number/% of queries potentially affected
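A rough sketch of how each row could be derived from raw counts, in case it helps keep the columns consistent across candidates. It assumes "days to recover" means the number of days of triple growth that deleting the candidate would offset (triples removed divided by triples added per day); that reading, the function and field names, and all numbers below are placeholders for illustration, not figures from the analyses.

```python
def candidate_row(name, entities, triples, queries,
                  total_entities, total_triples, total_queries,
                  triples_added_per_day):
    """Build one table row for a deletion candidate.

    All inputs are approximate counts. `triples_added_per_day` is the
    assumed current growth rate of the whole graph.
    """
    def pct(part, whole):
        # Percentage of the whole, rounded to one decimal place.
        return round(100 * part / whole, 1)

    return {
        "candidate": name,
        "entities": entities,
        "entities_pct": pct(entities, total_entities),
        "triples": triples,
        "triples_pct": pct(triples, total_triples),
        # Assumption: days of growth offset by removing these triples.
        "days_of_growth_offset": round(triples / triples_added_per_day),
        "queries_affected": queries,
        "queries_affected_pct": pct(queries, total_queries),
    }


# Placeholder numbers only, to show the shape of a row.
example = candidate_row(
    "example candidate",
    entities=10, triples=500, queries=3,
    total_entities=100, total_triples=2000, total_queries=60,
    triples_added_per_day=50,
)
print(example)
```

Filling the same fields for every candidate (lexemes, scholarly articles, labels, etc.) gives rows that can be dropped directly into the table.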
For reference, all the required information for scholarly articles is available at: https://wikitech.wikimedia.org/w/index.php?title=User:AKhatun/Wikidata_Scholarly_Articles_Subgraph_Analysis#TL;DR . This ticket is to collect that information, along with the same information for all other data deletion candidates we've investigated, into a single table.