
Dockerize Hadoop Cluster, Druid, and Samza + Load Test
Closed, Resolved · Public

Description

To facilitate building our Pageview API:

  • dockerize Druid, Hive, Samza, and anything else we need to simulate our production Hadoop cluster
  • pit Druid, Hive, and Samza against each other for pageview data processing
    • memory
    • storage
    • speed
  • (optional) build a dashboard on top of this
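
A minimal sketch of how the memory and speed comparison above could be harnessed, assuming each candidate can be wrapped in a plain Python callable. The two aggregation functions are hypothetical stand-ins, not Druid, Hive, or Samza clients, and the record shape is invented for illustration. Storage is not measured here, and `tracemalloc` only sees allocations made by the Python process itself, so a real test would need to read memory and disk usage from the containers.

```python
import time
import tracemalloc
from itertools import groupby
from typing import Callable, Dict, List, Tuple

# Hypothetical record shape: (page_title, hour, view_count).
Record = Tuple[str, int, int]


def make_sample(n_pages: int, n_hours: int) -> List[Record]:
    """Build deterministic synthetic pageview records."""
    return [
        (f"Page_{p}", h, (p * 31 + h * 7) % 100)
        for p in range(n_pages)
        for h in range(n_hours)
    ]


def aggregate_with_dict(records: List[Record]) -> Dict[str, int]:
    """Stand-in candidate A: sum views per page with a dict."""
    totals: Dict[str, int] = {}
    for title, _hour, views in records:
        totals[title] = totals.get(title, 0) + views
    return totals


def aggregate_with_sort(records: List[Record]) -> Dict[str, int]:
    """Stand-in candidate B: sort by title, then sum each run of equal titles."""
    ordered = sorted(records, key=lambda r: r[0])
    return {
        title: sum(r[2] for r in group)
        for title, group in groupby(ordered, key=lambda r: r[0])
    }


def measure(fn: Callable[[List[Record]], Dict[str, int]],
            records: List[Record]) -> dict:
    """Run one candidate and record wall-clock time and peak Python memory."""
    tracemalloc.start()
    start = time.perf_counter()
    result = fn(records)
    elapsed = time.perf_counter() - start
    _current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {"seconds": elapsed, "peak_bytes": peak, "result": result}


if __name__ == "__main__":
    sample = make_sample(n_pages=1000, n_hours=24)
    for candidate in (aggregate_with_dict, aggregate_with_sort):
        stats = measure(candidate, sample)
        print(candidate.__name__, stats["seconds"], stats["peak_bytes"])
```

Running every candidate against the same input and checking that the results agree keeps the timing and memory numbers comparable.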

Thoughts: set up a physical network to reduce load on the WiFi?

Event Timeline

ggellerman raised the priority of this task to Needs Triage.
ggellerman updated the task description.
ggellerman added a project: Analytics-Backlog.
ggellerman added a subscriber: ggellerman.
Restricted Application added a subscriber: Aklapper. Jun 18 2015, 5:28 PM
kevinator triaged this task as Normal priority. Jun 22 2015, 5:42 PM
kevinator updated the task description.
kevinator set Security to None.
kevinator added a subscriber: Rfarrand.
Milimetric raised the priority of this task from Normal to High. Jun 29 2015, 5:52 PM
Milimetric moved this task from Incoming to Blocked on the Analytics-Backlog board.
Milimetric moved this task from Blocked to Prioritized on the Analytics-Backlog board.
Milimetric renamed this task from "list of tasks to present to volunteers at wikimania" to "Dockerize Hadoop Cluster, Druid, and Samza + Load Test". Jul 6 2015, 6:00 PM
Milimetric updated the task description.
Qgil added a subscriber: Qgil. Jul 7 2015, 2:12 PM

Who is the owner of this #Wikimania-Hackathon-2015 project?

@Qgil the whole analytics team is the owner. We'll all be there, we'll all work on this, and we'll be in sync:

Qgil assigned this task to Milimetric. Jul 8 2015, 6:07 AM

OK, thank you. We are just aiming to have all the confirmed sessions assigned to someone, in order to make it easier for anybody to contact you if they have questions.

Oh I'm happy to be the point of contact, but if anyone is reading this, grab anyone on the list above if you can't find me.

We are at table 18
Table Name: Pageviews, Big Data, Analytics

What is the status of this task, now that Wikimania 2015 is over? Did this hacking project take place, and was it successfully finished? If yes: please provide an update and, if possible, summarize the findings or provide a link to anything relevant (and if the task is not completely finished yet, please move the project to the "Work continues after Mexico City" column on the #Wikimania-Hackathon-2015 workboard). If no: please edit this task by removing the #Wikimania-Hackathon-2015 project from it. Thanks for your help and for keeping this task updated!

Milimetric moved this task from Next Up to Done on the Analytics-Kanban board.

We did not dockerize everything we wanted to, because Docker turned out not to be as good an ecosystem as we had expected going into the hackathon. We did accomplish the intent behind this task, which was to closely analyze storage and processing technologies for the purpose of standing up our Pageview API.

kevinator closed this task as Resolved. Jul 29 2015, 2:42 PM