
mw-page-content-change-enrich: filter out events larger than max.request.size
Closed, Resolved · Public

Description

When an enriched event results in a payload larger than max.request.size, Flink's Kafka Sink producer fails with org.apache.kafka.common.errors.RecordTooLargeException.

As a result, the message is lost and the TaskManager shuts down, potentially failing to restart.

To improve application reliability, we should filter out these messages before producing them and forward the original event to the application's error topic.
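The filtering step proposed above could look roughly like the sketch below. It is a minimal illustration in plain Python, not the code from the linked merge request: the names `serialized_size`, `is_too_large`, `route_events` and `OVERHEAD_MARGIN` are hypothetical, JSON serialization is assumed, and the limit is set to the Kafka producer default for max.request.size (1048576 bytes), which may differ from the value configured for this application.

```python
import json
from typing import Any, Callable, Dict, Iterable, List, Tuple

Event = Dict[str, Any]

# Kafka producer default for max.request.size (1 MiB). Assumption: the
# deployed application may be configured with a different value.
MAX_REQUEST_SIZE = 1_048_576
# Hypothetical safety margin for record overhead (key, headers, framing).
OVERHEAD_MARGIN = 1_024


def serialized_size(event: Event) -> int:
    """Return the size in bytes of the event serialized as UTF-8 JSON."""
    return len(json.dumps(event, ensure_ascii=False).encode("utf-8"))


def is_too_large(event: Event, limit: int = MAX_REQUEST_SIZE - OVERHEAD_MARGIN) -> bool:
    """True if the serialized event would exceed the size limit."""
    return serialized_size(event) > limit


def route_events(
    original_events: Iterable[Event],
    enrich: Callable[[Event], Event],
    limit: int = MAX_REQUEST_SIZE - OVERHEAD_MARGIN,
) -> Tuple[List[Event], List[Event]]:
    """Split events into (to_produce, to_error_topic).

    Oversized enriched events are dropped, and the original (unenriched)
    event is collected for the error topic instead.
    """
    to_produce: List[Event] = []
    to_error_topic: List[Event] = []
    for original in original_events:
        enriched = enrich(original)
        if is_too_large(enriched, limit):
            to_error_topic.append(original)
        else:
            to_produce.append(enriched)
    return to_produce, to_error_topic
```

In a Flink job the same size check would more likely sit in a process function that emits oversized events to a side output bound to the error topic; the list-based routing here only keeps the sketch self-contained.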

Related work:

Details

Title | Reference | Author | Source Branch | Dest Branch
Don't process events with large content body. | repos/data-engineering/mediawiki-event-enrichment!69 | gmodena | filter-large-records | main

Event Timeline

Restricted Application added a subscriber: Aklapper.
gmodena renamed this task from "mw-page-content-change-enrich: events larger than max.request.size should be produced" to "mw-page-content-change-enrich: filter out events larger than max.request.size". · Jul 20 2023, 9:29 PM