We'd like to upgrade wmf-eventutilities to support flink 1.20 so that existing jobs can start upgrading to this version and new jobs can be created using this version.
AC:
- a version of wmf-eventutilities compatible with flink 1.20 is available
We'd like to upgrade wmf-eventutilities to support flink 1.20 so that existing jobs can start upgrading to this version and new jobs can be created using this version.
AC:
| Subject | Repo | Branch | Lines +/- | |
|---|---|---|---|---|
| eventutilities-flink: version bump flink to 1.20 | wikimedia-event-utilities | master | +98 -43 |
| Status | Subtype | Assigned | Task | ||
|---|---|---|---|---|---|
| Open | None | T376812 EPIC: Update flink jobs to support Flink 1.20 | |||
| Resolved | gmodena | T377130 Bump eventutilities to support flink 1.20 |
Current attempt by @gmodena uploaded at https://gerrit.wikimedia.org/r/c/wikimedia-event-utilities/+/1079506
But failing with
[ERROR] Errors: [ERROR] TestEventRowSerializer.testSchemaCompatibility:100 » NoSuchMethod org.apache.flink.api.common.typeutils.CompositeTypeSerializerUtil.constructIntermediateCompatibilityResult([Lorg/apache/flink/api/common/typeutils/TypeSerializer;[Lorg/apache/flink/api/common/typeutils/TypeSerializerSnapshot;)Lorg/apache/flink/api/common/typeutils/CompositeTypeSerializerUtil$IntermediateCompatibilityResult; [ERROR] TestJsonRowDeserializationSchema.testJsonParse(TestSpec) » NoSuchMethod org.junit.platform.commons.util.ClassLoaderUtils.getClassLoader(Ljava/lang/Class;)Ljava/lang/ClassLoader; [INFO] [ERROR] Tests run: 95, Failures: 0, Errors: 2, Skipped: 1
Change #1079506 had a related patch set uploaded (by Gmodena; author: Gmodena):
[wikimedia-event-utilities@master] WIP: flink version bump.
I've got wikimedia-event-utilities to build with Flink 1.19. I want to do a couple more checks before marking the CR ready to review, but at least all tests are passing.
But failing with
[ERROR] Errors: [ERROR] TestEventRowSerializer.testSchemaCompatibility:100 » NoSuchMethod o
I missed an API change in EventRowSerializer.
[ERROR] TestJsonRowDeserializationSchema.testJsonParse(TestSpec) » NoSuchMethod org.junit.platform.commons.util.ClassLoaderUtils.getClassLoader(Ljava/lang/Class;)Ljava/lang/ClassLoader;
This was just a runtime error due to conflicting (transient) junit deps. Fixed by version pinning junit-jupiter-params.
@dcausse flink-kafka-connector 3.3.0 just released, with support to Flink 1.20. See release notes.
I've updated (not yet pushed) the linked CR to address method migration from the serializer snapshots that existed before Flink 1.19, as you suggested.
I'll bump Flink deps to 1.20, if that's ok with you.
I've updated (not yet pushed) the linked CR to address method migration from the serializer snapshots that existed before Flink 1.19, as you suggested.
I've spent some time reading up
https://cwiki.apache.org/confluence/display/FLINK/FLIP-398%3A+Improve+Serialization+Configuration+And+Usage+In+Flink
Another change we should to address, but maybe in a different CR, is the deprecation of
org.apache.flink.api.common.typeinfo#createSerializer(ExecutionConfig config)
in favor of
org.apache.flink.api.common.typeinfo#createSerializer(SerializerConfig config)
While the change seems small, I don't fully understand (yet) the implications for EventRowTypeInfo.
Good catch! I missed that one and our tests suite as well... I think we should address it in this CR because it might break things.
I think we want to override org.apache.flink.api.common.typeinfo#createSerializer(SerializerConfig config) (just changing the ExecutionConfig into SerializerConfig should be enough).
The thing is that the parent class RowTypeInfo now overrides createSerializer(SerializerConfig config) so if this one is called (and it's likely to be the one called from flink core components we might end-up using the RowTypeInfo behaviors instead of EventRowTypeInfo)
Regarding back-compatibility calling createSerializer(ExecutionConfig config) should fallback to the parent RowTypeInfo#createSerializer(ExecutionConfig config) which is then properly calling createSerializer(SerializerConfig config) so we should be good here.
We can definitely test both behaviors in org.wikimedia.eventutilities.flink.TestEventRowSerializer#testRowSerializer calling both createSerializer(ExecutionConfig config) & createSerializer(SerializerConfig config) on EventRowTypeInfo and assert that both methods return an EventRowSerializer:
TypeSerializer<Row> serializer = rowTypeInfo.createSerializer(new ExecutionConfig()); assertThat(serializer).isInstanceOf(EventRowSerializer.class); serializer = rowTypeInfo.createSerializer(new ExecutionConfig().getSerializerConfig()); assertThat(serializer).isInstanceOf(EventRowSerializer.class);
Many thanks for the analysis and pointers! I'll address this API change in the version bump CR.
The thing is that the parent class RowTypeInfo now overrides createSerializer(SerializerConfig config) so if this one is called (and it's likely to be the one called from flink core components we might end-up using the RowTypeInfo behaviors instead of EventRowTypeInfo)
Yep. I am able to replicate this behavior, and make existing tests fail.
Change #1079506 merged by jenkins-bot:
[wikimedia-event-utilities@master] eventutilities-flink: version bump flink to 1.20