
Develop a node kafka driver based on librdkafka
Closed, Resolved · Public

Description

Experience running ChangeProp in production has shown that the kafka-node driver does not meet our stability or performance requirements. The existing alternative, kafka-native, does not support Kafka 0.9 and does not seem to be actively maintained. So, after a lot of consideration, we have decided to try building our own Kafka driver for Node.js. A very early prototype can be found here.

Currently it is a wrapper over the librdkafka C++ API that supports asynchronous message consumption and fire-and-forget message production. The effort is still at a very early stage, but performance already seems promising: on my laptop I could reach a consumption rate of ~100,000 msg/s.

Obviously, it is still far too early to judge the stability of this driver, and a lot of work still needs to be done. Here's a brief outline of the plan:

  • Support providing the configuration
  • Support fire-and-forget asynchronous commits

Once that's done, it will be possible to reimplement ChangeProp on top of the new driver and test whether it fits our needs. If we find the approach good enough, some of the next steps would be:

  • Support callbacks on production
  • Support callbacks on offset commits
  • Properly handle all the error conditions
  • Review locking and thread-safety, optimise for performance, write tests etc.
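To illustrate why callbacks on production matter, here is a minimal sketch contrasting fire-and-forget production with production that reports a delivery result. `InMemoryProducer`, `DeliveryCallback`, `produce` and `poll` are hypothetical names invented for this illustration; they are not the API of the prototype, of librdkafka, or of any existing driver, and the "broker" is just an in-memory array per topic.

```typescript
// Hypothetical in-memory stand-in for a Kafka producer (illustration only).
type DeliveryCallback = (err: Error | null, offset: number | null) => void;

interface PendingMessage {
  topic: string;
  payload: string;
  onDelivery?: DeliveryCallback;
}

class InMemoryProducer {
  private queue: PendingMessage[] = [];
  private logs: { [topic: string]: string[] | undefined } = {};

  constructor(knownTopics: string[]) {
    for (const topic of knownTopics) {
      this.logs[topic] = [];
    }
  }

  // Only enqueues the message and returns immediately.
  // Without a callback this is fire-and-forget: the caller never
  // learns whether the message was stored.
  produce(topic: string, payload: string, onDelivery?: DeliveryCallback): void {
    this.queue.push({ topic, payload, onDelivery });
  }

  // Simulates the broker answering: processes everything queued so far,
  // invokes delivery callbacks, and returns the number of messages handled.
  poll(): number {
    const batch = this.queue;
    this.queue = [];
    for (const msg of batch) {
      const log = this.logs[msg.topic];
      if (log === undefined) {
        msg.onDelivery?.(new Error(`unknown topic: ${msg.topic}`), null);
        continue;
      }
      log.push(msg.payload);
      msg.onDelivery?.(null, log.length - 1);
    }
    return batch.length;
  }
}

const producer = new InMemoryProducer(["events"]);
const outcomes: string[] = [];

// Fire-and-forget: this failure is never reported to the caller.
producer.produce("missing-topic", "lost silently");

// With callbacks: both success and failure become visible.
producer.produce("events", "hello", (err, offset) => {
  outcomes.push(err ? `failed: ${err.message}` : `delivered at offset ${offset}`);
});
producer.produce("missing-topic", "reported", (err) => {
  outcomes.push(err ? `failed: ${err.message}` : "delivered");
});

producer.poll();
console.log(outcomes);
```

In this sketch the first message is dropped without any signal, while the two messages produced with a callback report their outcome. A caller that needs delivery guarantees can only retry or alert in the second case.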

I cannot yet estimate the scope of the remaining work or give a timeline, because it's still not clear how ChangeProp would perform on this new driver.

Event Timeline

GWicke renamed this task from "Develop a node kafka driver" to "Develop a node kafka driver based on librdkafka". Jul 13 2016, 6:25 PM

Hm, btw!

https://github.com/apache/kafka/pull/1678/files

I'm pretty sure the librdkafka that @Pchelolo is building on already supports 0.10, so if we get change-prop running on this, we should be cleared for an eventual upgrade to 0.10 on the main Kafka clusters. The analytics Kafka cluster will be more difficult, since many different clients consume from it.

In the end, we decided to go with the driver developed and open-sourced by Blizzard: https://github.com/Blizzard/node-rdkafka. We are currently running our own fork in production, with a small modification to avoid a bug in librdkafka, but hopefully that bug will be fixed and we will be able to switch to upstream. Resolving this ticket.