
Write a new Camus consumer and store the data for Event Logging {stag} [21 pts]
Closed, ResolvedPublic

Description

  1. Valid Events --> timestamp, schema+Rev, source? (client / server), event_data_json
  2. Invalid Events --> timestamp, schema+Rev, source? (client / server), event_data_raw, validation_error, event_data_json (nullable)
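The two record layouts above can be sketched as Python dataclasses. This is purely illustrative: the field names follow the description, but the actual Camus/Hive column names and types are assumptions, not the real schema.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative sketch of the two record layouts described above.
# Names and types are hypothetical, not the actual import schema.

@dataclass
class ValidEvent:
    timestamp: str          # event time
    schema: str             # schema name ("schema+Rev")
    revision: int           # schema revision
    source: str             # "client" or "server"
    event_data_json: str    # the validated event payload, as JSON

@dataclass
class InvalidEvent:
    timestamp: str
    schema: str
    revision: int
    source: str
    event_data_raw: str                     # the raw payload as received
    validation_error: str                   # why validation failed
    event_data_json: Optional[str] = None   # parsed JSON, if parseable
```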

Joal to pair with Madhu and work with Ottomata as well

Python & Kafka running in Beta Labs

Event Timeline

ggellerman assigned this task to JAllemandou.
ggellerman raised the priority of this task to Needs Triage.
ggellerman updated the task description.
Restricted Application added a subscriber: Aklapper. · May 11 2015, 9:22 PM
ggellerman updated the task description. · May 11 2015, 9:22 PM
ggellerman set Security to None.
Milimetric renamed this task from Code to write a new Camus consumer and store the data in two Hive tables [21 points] to Code to write a new Camus consumer and store the data in two Hive tables [21 points] {oryx}. · May 19 2015, 4:19 PM
kevinator renamed this task from Code to write a new Camus consumer and store the data in two Hive tables [21 points] {oryx} to Code to write a new Camus consumer and store the data in two Hive tables [21 pts] {oryx}. · May 26 2015, 11:16 PM
kevinator moved this task from Next Up to Tasked_Hidden on the Analytics-Kanban board.

As is, I'm not sure how this data could be mapped to a Hive table. We currently plan to have only one new Kafka topic for valid eventlogging data, which means that all Camus-imported data will go to a single directory. However, the schemas in the valid eventlogging data are variable, so no single static Hive table can be mapped onto them.

The imported data will still be usable by other frameworks, like Spark, but not by ones that require static schemas, like Hive or Impala.
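The schema-on-read point above can be shown with a small sketch. The records and topic below are hypothetical; the point is that each event carries its own schema, so the field set varies per record, and a schema-on-read framework (like Spark) can still parse them even though no single fixed column list (as Hive or Impala require) fits them all.

```python
import json

# Hypothetical batch of events from a single valid-events topic.
# Each record names its own schema, so the event fields differ per record.
records = [
    '{"schema": "Edit", "rev": 11, "event": {"action": "save", "page_id": 42}}',
    '{"schema": "Search", "rev": 3, "event": {"query": "kafka", "hits": 17}}',
]

# A schema-on-read framework can parse each record dynamically:
parsed = [json.loads(r) for r in records]
field_sets = [set(p["event"].keys()) for p in parsed]
# The per-record field sets differ, so no static column list covers both.
```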

kevinator triaged this task as Medium priority. · Jun 2 2015, 10:06 PM
Milimetric renamed this task from Code to write a new Camus consumer and store the data in two Hive tables [21 pts] {oryx} to Code to write a new Camus consumer and store the data in two Hive tables {oryx} [21 pts]. · Jun 3 2015, 8:57 PM
kevinator renamed this task from Code to write a new Camus consumer and store the data in two Hive tables {oryx} [21 pts] to Code to write a new Camus consumer and store the data in two Hive tables {stag} [21 pts]. · Jun 8 2015, 10:25 PM
kevinator renamed this task from Code to write a new Camus consumer and store the data in two Hive tables {stag} [21 pts] to Code to write a new Camus consumer and store the data {stag} [21 pts]. · Jun 8 2015, 11:02 PM
ggellerman moved this task from Incoming to Tasked on the Analytics-Backlog board.
JAllemandou removed JAllemandou as the assignee of this task. · Jun 15 2015, 5:39 PM
JAllemandou renamed this task from Code to write a new Camus consumer and store the data {stag} [21 pts] to Write a new Camus consumer and store the data for Event Logging {stag} [21 pts]. · Jun 19 2015, 4:22 PM
Ottomata closed this task as Resolved. · Jun 23 2015, 3:41 PM
Ottomata claimed this task.
Ottomata added a project: Analytics-Kanban.

Done!

It doesn't make sense to import events split by valid/invalid. Instead, they are imported by their schema names.
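The resolution above (importing by schema name rather than by valid/invalid) can be sketched as a simple routing step. This is a minimal illustration, not the actual Camus code; the function name, record shapes, and output layout are all assumptions.

```python
import json
from collections import defaultdict

# Hypothetical sketch: bucket imported events by schema name, so each
# schema gets its own output partition (e.g. one directory per schema)
# instead of a single valid/invalid split.
def route_by_schema(records):
    buckets = defaultdict(list)
    for raw in records:
        event = json.loads(raw)
        buckets[event["schema"]].append(event)
    return buckets

batch = [
    '{"schema": "Edit", "rev": 11, "event": {"action": "save"}}',
    '{"schema": "Search", "rev": 3, "event": {"query": "kafka"}}',
    '{"schema": "Edit", "rev": 11, "event": {"action": "preview"}}',
]
routed = route_by_schema(batch)
print(sorted(routed))  # ['Edit', 'Search']
```

Because every record in a given bucket shares one schema, each bucket can then be mapped to a static table, which the single mixed topic could not.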