
Finalize eventlogging to druid ingestion
Closed, Resolved | Public | 13 Estimated Story Points

Description

  • Parametrize job with a whitelist instead of a blacklist
  • Modify puppet profile and jobs accordingly
  • Write comprehensive documentation
  • Remove the confusing Count metric from the datasources in Turnilo, or at least uncheck it by default.
  • Try adding a new metric to the datasource, eventCountPercentage, that normalizes eventCount splits by the total aggregate, so that time-measure buckets show percentage-of-total values instead of frequencies. These values would not vary with throughput changes or seasonality, and would be much easier to follow.
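
The normalization proposed in the last item can be illustrated with a minimal sketch in plain Python. This is not the Druid or Turnilo implementation, only an illustration of the arithmetic. The function name `event_count_percentage`, the row layout `(time_interval, bucket, event_count)`, and the bucket labels are assumptions made for the example:

```python
from collections import defaultdict


def event_count_percentage(rows):
    """Convert per-bucket event counts into percentage-of-total values.

    `rows` is a sequence of (time_interval, bucket, event_count) tuples.
    This layout is hypothetical and only serves the illustration.
    """
    rows = list(rows)  # the rows are traversed twice

    # First pass: total event count per time interval (the "total aggregate").
    totals = defaultdict(int)
    for time_interval, _bucket, count in rows:
        totals[time_interval] += count

    # Second pass: divide each bucket's count by its interval's total.
    result = []
    for time_interval, bucket, count in rows:
        total = totals[time_interval]
        pct = 100.0 * count / total if total else 0.0
        result.append((time_interval, bucket, pct))
    return result


# Throughput grows tenfold on the second day, but the distribution
# across buckets is unchanged.
sample = [
    ("2018-10-01", "0-100ms", 30),
    ("2018-10-01", "100-1000ms", 70),
    ("2018-10-02", "0-100ms", 300),
    ("2018-10-02", "100-1000ms", 700),
]
percentages = event_count_percentage(sample)
```

In this sample the raw counts differ by a factor of ten between the two days, while the percentages for each bucket are identical, which is the property the task description is after.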

Event Timeline

mforns triaged this task as Medium priority. Oct 8 2018, 3:36 PM
mforns moved this task from Incoming to Smart Tools for Better Data on the Analytics board.

Change 465532 had a related patch set uploaded (by Mforns; owner: Mforns):
[analytics/refinery/source@master] Refactor EventLoggingToDruid to use whitelists and ConfigHelper

https://gerrit.wikimedia.org/r/465532

Change 465692 had a related patch set uploaded (by Mforns; owner: Mforns):
[operations/puppet@production] Add druid_load jobs to analytics refinery

https://gerrit.wikimedia.org/r/465692

Change 465532 merged by jenkins-bot:
[analytics/refinery/source@master] Refactor EventLoggingToDruid to use whitelists and ConfigHelper

https://gerrit.wikimedia.org/r/465532

Change 467422 had a related patch set uploaded (by Mforns; owner: Mforns):
[analytics/refinery/source@master] Rename start_date and end_date to since until in EventLoggingToDruid.scala

https://gerrit.wikimedia.org/r/467422

Change 467422 merged by Ottomata:
[analytics/refinery/source@master] Rename start_date and end_date to since until in EventLoggingToDruid.scala

https://gerrit.wikimedia.org/r/467422

Milimetric raised the priority of this task from Medium to High. Oct 18 2018, 5:17 PM

Change 465692 merged by Ottomata:
[operations/puppet@production] Add druid_load jobs to analytics refinery

https://gerrit.wikimedia.org/r/465692

Change 468374 had a related patch set uploaded (by Mforns; owner: Mforns):
[operations/puppet@production] Fix broken default job_class for eventlogging_to_druid_job.pp

https://gerrit.wikimedia.org/r/468374

Change 468374 merged by Ottomata:
[operations/puppet@production] Fix broken default job_class for eventlogging_to_druid_job.pp

https://gerrit.wikimedia.org/r/468374

Change 468550 had a related patch set uploaded (by Mforns; owner: Mforns):
[analytics/refinery/source@master] Fix bug in EventLoggingToDruid, add time measures as dimensions

https://gerrit.wikimedia.org/r/468550

Change 468588 had a related patch set uploaded (by Mforns; owner: Mforns):
[operations/puppet@production] Fine tune eventlogging_to_druid_job spark and druid parameters

https://gerrit.wikimedia.org/r/468588

Change 468550 merged by jenkins-bot:
[analytics/refinery/source@master] Fix bug in EventLoggingToDruid, add time measures as dimensions

https://gerrit.wikimedia.org/r/468550

Change 468588 merged by Ottomata:
[operations/puppet@production] Fine tune eventlogging_to_druid_job spark and druid parameters

https://gerrit.wikimedia.org/r/468588

mforns renamed this task from Parametize eventlogging to druid ingestion with a whitelist instead of a blacklist to Finalize eventlogging to druid ingestion with a whitelist instead of a blacklist. Oct 23 2018, 2:00 PM
mforns updated the task description.
mforns set the point value for this task to 13.
mforns renamed this task from Finalize eventlogging to druid ingestion with a whitelist instead of a blacklist to Finalize eventlogging to druid ingestion. Oct 23 2018, 2:02 PM
mforns moved this task from In Code Review to In Progress on the Analytics-Kanban board.