Processing page_change with v1.23.0 is resulting in deserialization errors in the eqiad/codfw deployment that are preventing the application from starting.
The same deployment started up fine in staging, but no data was processed (kafka topics are empty) and the issue was not detected.
This was the cause of the HA restore issue reported in https://phabricator.wikimedia.org/T340059#8988977.
Workaround: rolling back to v1.21.0 fixed the issue.
Error
Failed to deserialize consumer record due to","error.stack_trace":"java.io.IOException: Failed to deserialize consumer record due to\n\tat org.apache.flink.connector.kafka.source.reader.KafkaRecordEmitter.emitRecord(KafkaRecordEmitter.java:56)\n\tat org.apache.flink.connector.kafka.source.reader.KafkaRecordEmitter.emitRecord(KafkaRecordEmitter.java:33)\n\tat org.apache.flink.connector.base.source.reader.SourceReaderBase.pollNext(SourceReaderBase.java:144)\n\tat org.apache.flink.streaming.api.operators.SourceOperator.emitNext(SourceOperator.java:417)\n\tat org.apache.flink.streaming.runtime.io.StreamTaskSourceInput.emitNext(StreamTaskSourceInput.java:68)\n\tat org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)\n\tat org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:550)\n\tat org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)\n\tat org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:839)\n\tat org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:788)\n\tat org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:952)\n\tat org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:931)\n\tat org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:745)\n\tat org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)\n\tat java.base/java.lang.Thread.run(Thread.java:829)\nCaused by: org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: Could not forward element to next operator\n\tat org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:92)\n\tat org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:50)
Impact
- critical, application won't start.
Notes
- rolling back to v1.21.0 fixed the issue