
Gather all data-purge into a single job
Open, High, Public


Analytics data purging is currently spread across multiple jobs that run in different ways (one timer per dataset, or several datasets handled by a single timer).
I suggest creating a script that determines which data to purge from configuration, so that there is a single point of config for all data purges and maintenance is easier. The script could be run at different time intervals, taking the interval as a parameter so it knows which datasets to process, and would reuse the purging strategies we already have solutions for (time-partitioned data as in webrequest, snapshots, Hive or non-Hive datasets).
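
To make the proposal more concrete, here is a minimal sketch of what a configuration-driven purge planner could look like. Everything in it is an assumption for illustration: the `PurgeRule` fields, the dataset names other than webrequest, the retention values, and the interval labels are hypothetical, and the sketch only computes which partitions would be dropped rather than touching Hive or the filesystem.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class PurgeRule:
    dataset: str   # dataset name (illustrative)
    interval: str  # which scheduled run handles this rule, e.g. "daily"
    strategy: str  # "time_partition" or "snapshot"
    keep: int      # days to keep (time_partition) or snapshots to keep (snapshot)


# Single point of configuration. All values below are made up for the example.
RULES = [
    PurgeRule("webrequest", "daily", "time_partition", 90),
    PurgeRule("example_snapshot_table", "monthly", "snapshot", 6),
]


def rules_for(interval, rules=RULES):
    """Select the rules that a run with the given interval should process."""
    return [r for r in rules if r.interval == interval]


def partitions_to_purge(rule, partitions, today):
    """Return the partitions that the rule would drop.

    For "time_partition", partitions are ISO dates (YYYY-MM-DD) and anything
    older than `keep` days is dropped. For "snapshot", partitions are labels
    that sort chronologically and only the newest `keep` are retained.
    """
    if rule.strategy == "time_partition":
        cutoff = today - timedelta(days=rule.keep)
        return [p for p in partitions if date.fromisoformat(p) < cutoff]
    if rule.strategy == "snapshot":
        ordered = sorted(partitions)
        return ordered[:-rule.keep] if rule.keep > 0 else ordered
    raise ValueError(f"unknown purge strategy: {rule.strategy}")


if __name__ == "__main__":
    # Dry run for the "daily" interval with hypothetical partitions.
    existing = {"webrequest": ["2020-01-01", "2020-08-30"]}
    for rule in rules_for("daily"):
        doomed = partitions_to_purge(rule, existing[rule.dataset], date(2020, 9, 7))
        print(rule.dataset, doomed)
```

A timer per interval would then call the same script with a different interval argument, and adding a dataset would only require a new entry in the configuration.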

Event Timeline

Restricted Application added a subscriber: Aklapper. Sep 7 2020, 11:49 AM
Milimetric triaged this task as High priority. Sep 10 2020, 4:06 PM
Milimetric moved this task from Incoming to Operational Excellence on the Analytics board.