Move Eventlogging Kafka writer to use pykafka's Producer instead of python-kafka {stag} [8 pts]
Closed, Declined | Public

Description

Since we rewrote the EventLogging Kafka reader handler to use pykafka (for the BalancedConsumer), move the writer to the same library for consistency.

  • do a quick performance analysis to make sure there is no huge difference

Event Timeline

madhuvishy claimed this task.
madhuvishy raised the priority of this task from to Needs Triage.
madhuvishy updated the task description.
madhuvishy subscribed.
madhuvishy renamed this task from Move Eventlogging Kafka writer to use pykafka's Producer instead of python-kafk to Move Eventlogging Kafka writer to use pykafka's Producer instead of python-kafka. Aug 17 2015, 1:43 AM
madhuvishy set Security to None.
kevinator renamed this task from Move Eventlogging Kafka writer to use pykafka's Producer instead of python-kafka to Move Eventlogging Kafka writer to use pykafka's Producer instead of python-kafka {stag}. Aug 17 2015, 4:57 PM
kevinator updated the task description.
kevinator renamed this task from Move Eventlogging Kafka writer to use pykafka's Producer instead of python-kafka {stag} to Move Eventlogging Kafka writer to use pykafka's Producer instead of python-kafka {stag} [8 pts]. Aug 17 2015, 5:01 PM
kevinator triaged this task as High priority.
kevinator moved this task from Incoming to Tasked on the Analytics-Backlog board.

Change 232408 had a related patch set uploaded (by Madhuvishy):
[WIP] Change kafka writer to use pykafka Producer

https://gerrit.wikimedia.org/r/232408

Looked into this a bit more with @Ottomata, and it looks like, at the moment, python-kafka's producer is more robust than pykafka's for our purposes.

Reasons:

  1. Pykafka forces us to create a topic instance and a producer instance for every message. The library does not cache these, and caching them on our end is hard because pykafka ties each producer to a single topic, unlike python-kafka.
  2. Pykafka's producer is synchronous, unlike python-kafka's, which slows things down a bit. They added an async one in the last few days, but since it's fairly new, we don't want to jump to using it.
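
To illustrate the first reason, here is a rough sketch of the kind of per-topic cache we would have to maintain ourselves if producers are bound to a single topic. This is not code from the abandoned patch. The class name PerTopicProducerCache and the make_producer factory are hypothetical, and the factory stands in for whatever library calls would actually look up the topic and build its producer.

```python
from typing import Callable, Dict, Generic, TypeVar

P = TypeVar("P")


class PerTopicProducerCache(Generic[P]):
    """Lazily build one producer per topic and reuse it for later messages.

    Hypothetical helper: it only shows the bookkeeping needed when a
    producer object is tied to exactly one topic.
    """

    def __init__(self, make_producer: Callable[[str], P]) -> None:
        # make_producer is an assumed, caller-supplied factory that takes a
        # topic name and returns a producer bound to that topic.
        self._make_producer = make_producer
        self._producers: Dict[str, P] = {}

    def get(self, topic_name: str) -> P:
        # Create the producer on first use of a topic, then reuse it.
        producer = self._producers.get(topic_name)
        if producer is None:
            producer = self._make_producer(topic_name)
            self._producers[topic_name] = producer
        return producer

    def __len__(self) -> int:
        # Number of topics that currently have a cached producer.
        return len(self._producers)
```

Even in this simplified form, the writer ends up holding one long-lived producer for every topic it has ever written to, and a real version would also need to handle closing producers and recovering from failed ones. A producer that accepts the topic with each message avoids this bookkeeping.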

Change 232408 abandoned by Madhuvishy:
[WIP] Change kafka writer to use pykafka Producer

Reason:
We decided not to move the writer to pykafka

https://gerrit.wikimedia.org/r/232408