We need our golden data retriever codebase to be more modular and to better support adding new modules, which will be very helpful when we start calculating ZRR for "well-behaved searches" (see T150370 & T150901). Right now, testing whether a new script works and backfilling missing data are a huge pain.
After talking with Analytics, Chelsy and I decided to migrate our codebase to use their Reportupdater infrastructure as it seems to meet our needs. This will require the following steps:
- Rewrite as many EventLogging (EL) based scripts as possible to be pure SQL
- Rewrite current pure-R scripts to be shell scripts + R that follow Reportupdater conventions
- Update column names in current datasets
- Finalize (test the heck out of) Reportupdater-based codebase
- Code review by @chelsyx
- Deploy & schedule for daily execution
- Prepare dashboards for new formats/naming conventions (all in CR)
- Deploy dashboards after Reportupdater-based refactor of golden has completed at least one successful run
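
As a rough illustration of why a date-range convention helps with testing and backfilling, here is a minimal Python sketch. It assumes (this is not confirmed in this task; check the Reportupdater documentation) that a script-type report is called with a start date and an end date as positional arguments and writes a tab-separated table with a header row and the date in the first column. The function names and `placeholder_metric` are hypothetical and stand in for a real SQL query or R computation.

```python
import sys
from datetime import date, timedelta


def daily_rows(start, end, metric_for_day):
    """Yield (ISO date, value) for each day in the half-open range [start, end)."""
    day = start
    while day < end:
        yield day.isoformat(), metric_for_day(day)
        day += timedelta(days=1)


def render_tsv(rows, header=("date", "value")):
    """Render rows as tab-separated text with a header line, date column first."""
    lines = ["\t".join(header)]
    lines.extend("\t".join(str(cell) for cell in row) for row in rows)
    return "\n".join(lines)


def placeholder_metric(day):
    # Hypothetical stand-in for the real per-day computation (SQL or R).
    return day.day


# Assumed invocation: script.py <start YYYY-MM-DD> <end YYYY-MM-DD>
if __name__ == "__main__" and len(sys.argv) == 3:
    start, end = (date.fromisoformat(arg) for arg in sys.argv[1:3])
    print(render_tsv(daily_rows(start, end, placeholder_metric)))
```

Because every day is computed from an explicit date range, testing a new script or backfilling a gap would only mean running it over the dates of interest.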