- Parametrize the job with a whitelist instead of a blacklist
- Modify the Puppet profile and jobs accordingly
- Write comprehensive documentation
- Remove the confusing Count metric from the datasources in Turnilo, or at least uncheck it by default.
- Try to add a new metric to the datasource, eventCountPercentage, that normalizes eventCount splits by the total aggregate, so that time measure buckets show percentage-of-total values instead of frequencies. Percentages do not vary with throughput changes or seasonality, which makes them much easier to follow.
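To illustrate the arithmetic behind the proposed eventCountPercentage metric, here is a minimal Python sketch. It is not how Turnilo or Druid would define the metric; the function name `event_count_percentage` and the bucket labels are hypothetical, chosen only to show why percentages stay stable when throughput changes.

```python
from typing import Dict


def event_count_percentage(bucket_counts: Dict[str, int]) -> Dict[str, float]:
    """Normalize per-bucket event counts to percentages of the total.

    bucket_counts maps a time measure bucket label (hypothetical labels
    below) to the eventCount observed for that bucket in one time period.
    """
    total = sum(bucket_counts.values())
    if total == 0:
        # No events in this period: report 0% everywhere instead of dividing by zero.
        return {bucket: 0.0 for bucket in bucket_counts}
    return {
        bucket: 100.0 * count / total
        for bucket, count in bucket_counts.items()
    }


# Two periods with the same distribution but a 10x difference in throughput.
quiet_hour = {"0-100ms": 50, "100-1000ms": 30, "1000ms+": 20}
busy_hour = {"0-100ms": 500, "100-1000ms": 300, "1000ms+": 200}

print(event_count_percentage(quiet_hour))
print(event_count_percentage(busy_hour))
```

The raw counts differ by a factor of ten between the two periods, but the normalized values are identical, which is the property the new metric is meant to provide.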
Change 465532 had a related patch set uploaded (by Mforns; owner: Mforns):
[analytics/refinery/source@master] Refactor EventLoggingToDruid to use whitelists and ConfigHelper
Change 465692 had a related patch set uploaded (by Mforns; owner: Mforns):
[operations/puppet@production] Add druid_load jobs to analytics refinery
Change 465532 merged by jenkins-bot:
[analytics/refinery/source@master] Refactor EventLoggingToDruid to use whitelists and ConfigHelper
Change 467422 had a related patch set uploaded (by Mforns; owner: Mforns):
[analytics/refinery/source@master] Rename start_date and end_date to since until in EventLoggingToDruid.scala
Change 467422 merged by Ottomata:
[analytics/refinery/source@master] Rename start_date and end_date to since until in EventLoggingToDruid.scala
Change 465692 merged by Ottomata:
[operations/puppet@production] Add druid_load jobs to analytics refinery
Change 468374 had a related patch set uploaded (by Mforns; owner: Mforns):
[operations/puppet@production] Fix broken default job_class for eventlogging_to_druid_job.pp
Change 468374 merged by Ottomata:
[operations/puppet@production] Fix broken default job_class for eventlogging_to_druid_job.pp
Change 468550 had a related patch set uploaded (by Mforns; owner: Mforns):
[analytics/refinery/source@master] Fix bug in EventLoggingToDruid, add time measures as dimensions
Change 468588 had a related patch set uploaded (by Mforns; owner: Mforns):
[operations/puppet@production] Fine tune eventlogging_to_druid_job spark and druid parameters
Change 468550 merged by jenkins-bot:
[analytics/refinery/source@master] Fix bug in EventLoggingToDruid, add time measures as dimensions
Change 468588 merged by Ottomata:
[operations/puppet@production] Fine tune eventlogging_to_druid_job spark and druid parameters
The docs are here: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Hive_to_Druid
Moving to done!