Logstash dead letter queue feature does not monitor queue size
Closed, Resolved | Public

Description

When the dead letter queue reaches the size defined by dead_letter_queue.max_bytes, any additional dead letter directed to this queue causes Logstash to emit:

cannot write event to DLQ(path: /var/lib/logstash/dead_letter_queue/main): reached maxQueueSize of <N>

An upstream issue indicates that the dead letter queue feature still has some improvements to be made: https://github.com/elastic/logstash/issues/8795

The core of the issue is that currentQueueSize reflects the size of the queue on disk only at startup. After that it is never checked against the disk again and is only ever increased by the size of each message placed on the queue: https://github.com/elastic/logstash/blob/v7.10.0/logstash-core/src/main/java/org/logstash/common/io/DeadLetterQueueWriter.java#L179
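
To make the failure mode concrete, here is a minimal toy model in Python. It is not Logstash code: the class ToyDlqWriter, the helper size_on_disk, the one-file-per-message layout, and the exact comparison used for the limit are all assumptions made for illustration. It only shows how a counter that is measured once and then incremented can stay at the limit after the files on disk are gone.

```python
import os
import tempfile


def size_on_disk(path):
    """Sum the sizes of the regular files directly inside `path`."""
    return sum(e.stat().st_size for e in os.scandir(path) if e.is_file())


class ToyDlqWriter:
    """Hypothetical stand-in for a DLQ writer; not the Logstash implementation."""

    def __init__(self, path, max_bytes):
        self.path = path
        self.max_bytes = max_bytes
        # The on-disk size is measured once, at startup.
        self.current_queue_size = size_on_disk(path)
        self._next_file = 0

    def write(self, payload):
        # The limit check uses the in-memory counter, never the directory.
        if self.current_queue_size + len(payload) > self.max_bytes:
            return False  # stands in for "reached maxQueueSize"
        self._next_file += 1
        name = os.path.join(self.path, f"{self._next_file}.log")
        with open(name, "wb") as fh:
            fh.write(payload)
        # The counter only ever grows.
        self.current_queue_size += len(payload)
        return True


def demo():
    with tempfile.TemporaryDirectory() as path:
        writer = ToyDlqWriter(path, max_bytes=100)
        for _ in range(10):
            writer.write(b"x" * 10)  # fills the queue to exactly 100 bytes
        # Something else drains the queue by deleting its files.
        for entry in list(os.scandir(path)):
            os.remove(entry.path)
        accepted = writer.write(b"x" * 10)
        return accepted, writer.current_queue_size, size_on_disk(path)


if __name__ == "__main__":
    print(demo())
```

In this model the final write is refused even though the directory is empty, because the counter still holds the pre-deletion total. Constructing a fresh ToyDlqWriter on the same directory would measure the disk again and accept writes, which mirrors the "except at startup" behaviour described above.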

Event Timeline

Change 673377 had a related patch set uploaded (by Cwhite; owner: Cwhite):
[operations/puppet@production] logstash: add and enable dlq max_bytes workaround

https://gerrit.wikimedia.org/r/673377

colewhite moved this task from Inbox to In progress on the observability board.

Change 673377 merged by Cwhite:
[operations/puppet@production] logstash: add and enable dlq max_bytes workaround

https://gerrit.wikimedia.org/r/673377

colewhite claimed this task.

The DLQ max_bytes workaround script fired on March 29th and appears to have done the right thing.