Details
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | None | | T130258 Prototype Data Pipeline on Druid
Resolved | | Ottomata | T131974 Puppetize druid
Event Timeline
This can be done even in the absence of new hardware. What is the state of the documentation?
This would take more or less time depending on the number of processes we have to set up and daemons that need to run.
For example:
- We will need to integrate with HDFS.
- We will need to load data from Druid into Cassandra, so it needs to talk to Cassandra too. Is this just a query job, or a daemon configuration?
Do we need the HDFS labs cluster up to test this integration?
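For the HDFS integration question above, Druid's batch path typically reads input directly from HDFS via a Hadoop indexing task. This is a minimal sketch of such a spec, assuming an `index_hadoop` task type; the datasource name, dimensions, and HDFS path are hypothetical placeholders, not the actual pipeline's configuration:

```ruby
require 'json'

# Hypothetical Druid Hadoop batch-ingestion spec. The datasource name
# ('pageviews') and input path are illustrative placeholders only.
spec = {
  type: 'index_hadoop',
  spec: {
    dataSchema: {
      dataSource: 'pageviews',
      granularitySpec: {
        segmentGranularity: 'day',
        queryGranularity:   'hour'
      }
    },
    ioConfig: {
      type: 'hadoop',
      # Input read straight out of HDFS -- this is the integration point.
      inputSpec: { type: 'static', paths: 'hdfs:///wmf/data/pageviews' }
    }
  }
}

puts JSON.pretty_generate(spec)
```

A spec like this would be submitted to the Druid overlord's indexing endpoint (`/druid/indexer/v1/task`) rather than run as a long-lived daemon, which bears on the query-job-vs-daemon question for the Cassandra side as well.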
Change 291043 had a related patch set uploaded (by Ottomata):
Include cloudera reprepro updates in jessie-wikimedia
Change 291043 merged by Ottomata:
Include cloudera reprepro updates in jessie-wikimedia
Change 291113 had a related patch set uploaded (by Ottomata):
Apply druid roles in production with initial (guesswork) configuration
Change 291113 merged by Ottomata:
Apply druid roles in production with initial (guesswork) configuration
Change 291128 had a related patch set uploaded (by Ottomata):
Add analytics_cluster::hadoop::client to druid workers so CDH is installed
Change 291128 merged by Ottomata:
Add analytics_cluster::hadoop::client to druid workers so CDH is installed
Change 291129 had a related patch set uploaded (by Ottomata):
Install the druid service package for each service
Change 291137 had a related patch set uploaded (by Ottomata):
Use ruby json lib to render Arrays as strings in druid runtime.properties.erb
Change 291137 merged by Ottomata:
Use ruby json lib to render Arrays as strings in druid runtime.properties.erb
Change 291140 had a related patch set uploaded (by Ottomata):
Druid puppet improvements for prod
I'm calling this done!
There are going to be a lot of smaller follow up tasks, especially for monitoring.