Respawn the schema/field white-list for EL auto-purging {tick}
Closed, Resolved · Public · 13 Story Points

Description

We need a white-list of the schema-field pairs that can be kept indefinitely in the EL database.
These pairs must be extracted from the purging strategies that were agreed on at the time of EL's audit.
This white-list will be used to implement the corresponding auto-purging system.

A white-list was created when the audit was done, but the implementation of the purging was then deprioritized. We now need to update the list, because several schemas that we agreed to purge only partially, or not to purge at all, have new fields that must be included.

Also, owners of new schemas must be notified that their schemas will lose data once auto-purging starts. If they want to discuss other purging strategies for their schemas, they should do so now, so that the fields can be included in the white-list.

Event Timeline

mforns created this task. May 12 2016, 11:02 PM
Restricted Application added subscribers: Zppix, Aklapper. May 12 2016, 11:02 PM
mforns set the point value for this task to 13. May 12 2016, 11:05 PM
mforns moved this task from Next Up to In Progress on the Analytics-Kanban board. May 12 2016, 11:11 PM
mforns moved this task from In Progress to Paused on the Analytics-Kanban board. Jun 28 2016, 4:25 PM
mforns moved this task from Paused to In Progress on the Analytics-Kanban board. Jul 11 2016, 3:19 PM
mforns added a comment. Edited Jul 12 2016, 3:51 PM

The new white-list is done. It includes the new data for the schemas that were modified since Aug 2015. Here's the file:

How to understand that list:

  1. It has 2 columns: schema and field.
  2. It is a white-list, meaning that every field listed is kept indefinitely (it survives auto-purging). Conversely, any new schema or field that is not listed is auto-purged by default.
  3. The schema column does not include the revision. This is done on purpose, so that new versions of a schema won't need to re-add all previously existing fields to the white-list.
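
To make the three points above concrete, here is a minimal sketch of how such a list could be applied. It is not the actual purging implementation. It assumes that the file is tab-separated, that tables are named `<schema>_<revision>`, and that purging a field means setting it to NULL; the schema and field names in the sample data are made up for illustration.

```python
import csv
import io
import re

# Hypothetical sample of the white-list: two tab-separated columns,
# schema and field. These rows are illustrative, not the real list.
WHITELIST_TSV = (
    "Edit\tevent_action\n"
    "Edit\tevent_editCountBucket\n"
    "NavigationTiming\tevent_mediaWikiVersion\n"
)


def load_whitelist(text):
    """Parse 'schema<TAB>field' rows into a dict: schema -> set of fields."""
    whitelist = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        schema, field = row
        whitelist.setdefault(schema, set()).add(field)
    return whitelist


def schema_of(table_name):
    """Drop a trailing '_<revision>' suffix (assumed naming convention)."""
    return re.sub(r"_\d+$", "", table_name)


def purge_row(table_name, row, whitelist):
    """Keep white-listed fields; set every other field to None (NULL)."""
    keep = whitelist.get(schema_of(table_name), set())
    return {
        field: (value if field in keep else None)
        for field, value in row.items()
    }
```

Because the lookup ignores the revision, a new revision of a listed schema keeps its previously white-listed fields, while an unlisted schema or field falls through to the default and is purged.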
mforns moved this task from In Progress to Done on the Analytics-Kanban board. Jul 12 2016, 3:58 PM

Note: I forgot to add the editCountBucket fields that resulted from the bucketization.
Here is the white-list that includes those fields. The previous one is incorrect and should not be used.

Nuria closed this task as Resolved. Jul 27 2016, 4:04 PM