Review mjolnir kafka implementation on large featuresets
Closed, Resolved (Public)

Description

Large messages cause our Kafka implementation to fail in several places:

  • on the daemon, because kafka-python checks the message against its maximum request size before compressing it
    • Increase the kafka-python max request size to 40MB; our messages compress very well, so the data actually sent stays small.
  • on the Spark workers using KafkaUtils.createRDD: "Ran out of messages before reaching ending offset 182100 for topic mjolnir_result partition 0 start 152128. This should not happen, and indicates that messages may have been lost."
    • We probably need to pass some additional config params to KafkaUtils.createRDD.
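
As a rough illustration of the two tweaks above, the sketch below builds the settings one might pass to the kafka-python producer (the client the task refers to as python-kafka) and to KafkaUtils.createRDD. Several things here are assumptions and not taken from the task: the broker address, the use of gzip compression, the sample payload, and the guess that the consumer-side parameter to raise is `fetch.message.max.bytes`. It does not connect to Kafka; it only shows why a limit checked before compression rejects a message that would be small on the wire.

```python
import gzip
import json

# 40MB limit proposed in the task description.
MAX_REQUEST_SIZE = 40 * 1024 * 1024
# kafka-python's default max_request_size (1MB).
DEFAULT_MAX_REQUEST_SIZE = 1048576

# Keyword arguments one might pass as kafka.KafkaProducer(**PRODUCER_CONFIG).
PRODUCER_CONFIG = {
    "bootstrap_servers": "localhost:9092",  # hypothetical broker address
    "compression_type": "gzip",             # assumption: codec is not named in the task
    "max_request_size": MAX_REQUEST_SIZE,
}

# kafkaParams one might pass as KafkaUtils.createRDD(sc, KAFKA_PARAMS, offset_ranges).
# Assumption: the fetch size of the old consumer is the parameter that needs raising.
KAFKA_PARAMS = {
    "metadata.broker.list": "localhost:9092",  # hypothetical broker address
    "fetch.message.max.bytes": str(MAX_REQUEST_SIZE),
}


def sizes(payload: bytes) -> tuple:
    """Return (uncompressed size, gzip-compressed size) in bytes."""
    return len(payload), len(gzip.compress(payload))


def fits(payload: bytes, limit: int = MAX_REQUEST_SIZE) -> bool:
    """Mimic a size check applied to the uncompressed payload."""
    return len(payload) <= limit


# Hypothetical large, highly repetitive feature message.
message = json.dumps({"features": [0.0] * 500_000}).encode("utf-8")
raw_size, compressed_size = sizes(message)

print("raw bytes:", raw_size)
print("gzip bytes:", compressed_size)
print("fits default limit:", fits(message, DEFAULT_MAX_REQUEST_SIZE))
print("fits 40MB limit:", fits(message))
```

With this payload the uncompressed size exceeds the 1MB default even though the compressed form is far smaller, which matches the failure described for the daemon.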

Event Timeline

dcausse created this task. Oct 16 2017, 9:36 AM
Restricted Application added a subscriber: Aklapper. Oct 16 2017, 9:36 AM
dcausse triaged this task as Normal priority. Oct 16 2017, 9:37 AM
dcausse moved this task from needs triage to Current work on the Discovery-Search board.

Change 385391 had a related patch set uploaded (by DCausse; owner: DCausse):
[search/MjoLniR@master] More kafka tweaks

https://gerrit.wikimedia.org/r/385391

Change 385391 merged by EBernhardson:
[search/MjoLniR@master] More kafka tweaks

https://gerrit.wikimedia.org/r/385391

debt closed this task as Resolved. Oct 26 2017, 3:49 PM