Avoid accepting Kafka messages with whacky timestamps
Open, Medium, Public

Description

In https://phabricator.wikimedia.org/T250133#6063641 we encountered an error where a bad Kafka timestamp caused Kafka log rolling to stop indefinitely, which filled up the disks.

Having a bad Kafka timestamp (way out of range, e.g. years in the future or past) will also hurt stream processing and Hive partition ingestion.

We could configure Kafka to reject messages with timestamps that are too old or too far in the future with log.message.timestamp.difference.max.ms. Setting this to the value of log.retention.ms seems to make the most sense, but this caused issues with compacted topics as noted here. Kafka had log.retention.ms as the default value for log.message.timestamp.difference.max.ms for a few versions, but this was reverted due to complexities with compacted topics.
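To make the effect of log.message.timestamp.difference.max.ms concrete, here is a minimal Python sketch of the kind of check the broker applies when log.message.timestamp.type is CreateTime: the message is rejected if its timestamp differs from the broker's receive time by more than the configured maximum, in either direction. The function name and the one-week value are hypothetical and only for illustration; this is not Kafka's code.

```python
import time

# Hypothetical stand-in for log.message.timestamp.difference.max.ms,
# here set to one week (an assumed value, e.g. matching log.retention.ms).
ONE_WEEK_MS = 7 * 24 * 60 * 60 * 1000


def timestamp_within_bounds(message_ts_ms, max_difference_ms, now_ms=None):
    """Return True if the message timestamp is close enough to 'now'.

    Mirrors the documented rule: reject when the absolute difference
    between receive time and message timestamp exceeds the maximum.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return abs(now_ms - message_ts_ms) <= max_difference_ms
```

With this rule, a message stamped one second ago passes, while one stamped two weeks in the future (or the past) is rejected before it can reach the log.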

This really only matters when the data produced is untrusted. eventgate-analytics-external and eventgate-logging-external accept events from external producers. Our code does the right thing, but there is nothing stopping someone from manually POSTing an event with a whacky meta.dt, which will be used for the Kafka timestamp. After we do T267648: Adopt conventions for server receive and client/event timestamps in non analytics event schemas, we should probably modify EventGate so that it always sets meta.dt itself, rather than accepting the producer's value if it is present.
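The EventGate change proposed above could look roughly like the following. This is a Python sketch for illustration only (EventGate itself is a Node.js service), and the function name stamp_receive_time is hypothetical. It assumes events are plain dicts with an optional meta object, and it overwrites meta.dt with the server's receive time regardless of what the producer sent.

```python
import copy
from datetime import datetime, timezone


def stamp_receive_time(event, now=None):
    """Return a copy of event whose meta.dt is the server receive time.

    Any producer-supplied meta.dt is discarded. 'now' should be a
    timezone-aware datetime; it defaults to the current UTC time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    now = now.astimezone(timezone.utc)

    stamped = copy.deepcopy(event)  # leave the caller's event untouched
    meta = stamped.setdefault("meta", {})
    # ISO 8601 UTC timestamp with millisecond precision, e.g. 2021-05-17T21:19:00.123Z
    meta["dt"] = now.strftime("%Y-%m-%dT%H:%M:%S.") + f"{now.microsecond // 1000:03d}Z"
    return stamped
```

Because the Kafka timestamp is derived from meta.dt, stamping it on the server side means an externally POSTed event can no longer choose its own Kafka timestamp.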

This would help mitigate the potential problem, but it doesn't stop bugs in our code from emitting bad timestamps. Setting log.message.timestamp.difference.max.ms would, but I'm not sure what to do if we start using compacted topics.

Event Timeline

odimitrijevic moved this task from Incoming to Operational Excellence on the Analytics board.

I'd say this is medium to low priority and is something that needs to be worked on in collaboration with maintainers of other Kafka clusters.

Milimetric lowered the priority of this task from High to Medium. May 17 2021, 9:19 PM