
High Availability Flink
Open, Needs Triage, Public

Description

With the latest release of Flink (1.12), High Availability is now supported for Flink running natively on Kubernetes. The documentation looks very straightforward.

AC
Upgrade our Flink setup to use HA mode

Event Timeline

The changes for HA are mostly a matter of changing a few lines in the config file. The part that is unclear is whether we can get a service account for the pods that allows them to do CRUD operations on configmaps; the pods will need to be able to update the configmap. Details here. Another thing that would have to change is that JobManager pods should be started with their IP address, instead of a Kubernetes service, as their jobmanager.rpc.address. I'm not sure how that would work.
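For concreteness, here is a minimal sketch of what this might look like, based on the Flink 1.12 native-Kubernetes HA docs; the cluster id, bucket path, and service account name are placeholders, not our actual values. The flink-conf.yaml side:

```yaml
# flink-conf.yaml additions for Kubernetes HA (Flink 1.12); values are placeholders.
kubernetes.cluster-id: wdqs-streaming-updater
high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
high-availability.storageDir: s3://example-bucket/flink-ha
```

And the service account permissions, as a namespaced Role/RoleBinding granting CRUD on configmaps:

```yaml
# Role/RoleBinding granting the pods' service account CRUD on configmaps,
# scoped to the namespace the Flink pods run in.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: flink-ha-configmap-editor
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: flink-ha-configmap-editor
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: flink-ha-configmap-editor
subjects:
- kind: ServiceAccount
  name: flink  # placeholder service account name
```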

The alternative to updating configmaps is to run a separate ZooKeeper cluster; more on that here.
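If we went the ZooKeeper route instead, the flink-conf.yaml side would look roughly like this (quorum hosts and storage path are placeholders):

```yaml
# flink-conf.yaml sketch for ZooKeeper-based HA; hosts and paths are placeholders.
high-availability: zookeeper
high-availability.zookeeper.quorum: zk1001.example.org:2181,zk1002.example.org:2181,zk1003.example.org:2181
high-availability.zookeeper.path.root: /flink
high-availability.storageDir: s3://example-bucket/flink-ha
```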

I do see that using the configmap election method is appealing, as it is built in and does not require additional software to function. Unfortunately, I was not able to tell (from briefly reading the docs) whether this uses a separate configmap or the one that is actually used for configuring Flink.
While the former would be okay-ish, I guess, the latter would potentially cause problems, as every deployment results in a re-creation of said configmap by helm, resetting it to whatever state the chart has defined.
Apart from potentially losing data in that case, I'm not 100% certain that helm would handle this properly in every case, as I have seen too many weird issues with helm and "manually" altered Kubernetes objects.

> Unfortunately, I was not able to tell (from briefly reading the docs) whether this uses a separate configmap or the one that is actually used for configuring Flink.

My understanding is that it uses a separate configmap named flink-config-${clusterId}, where clusterId is set via kubernetes.cluster-id.

Three configmaps are created by the pods: dispatcher-leader, resourcemanager-leader, and restserver-leader. There aren't any references from helm to these configmaps, so a helm upgrade shouldn't affect them.
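For illustration, one of these leader-election configmaps would look roughly like the sketch below. This is pieced together from the Flink 1.12 native-Kubernetes HA docs and may not match our cluster exactly; the cluster id, addresses, and annotation contents are placeholders:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  # Name pattern is <cluster-id>-<component>-leader; the cluster id here is a placeholder.
  name: wdqs-streaming-updater-dispatcher-leader
  labels:
    app: wdqs-streaming-updater
    type: flink-native-kubernetes
    configmap-type: high-availability
  annotations:
    # Leader-election lock record maintained by the JobManager(s); contents are illustrative.
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"...","leaseDuration":15}'
data:
  # Leader address and session id published by the current leader (illustrative values).
  address: akka.tcp://flink@10.64.0.12:6123/user/rpc/dispatcher_1
  sessionId: 9f6d8a3e-0000-0000-0000-000000000000
```

Since these objects are created and owned by the Flink pods rather than by the chart, helm has no record of them and should leave them untouched across deployments.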

Change 679519 had a related patch set uploaded (by Mstyles; author: Mstyles):

[operations/deployment-charts@master] rdf-streaming-updater: enable HA capability

https://gerrit.wikimedia.org/r/679519

So one problem with implementing HA without using the session cluster (T280166) is that Flink can't properly start the Streaming Updater job locally, since the jar needs to be stored in the storage directory. I'm not sure of a workaround if we don't end up using the session cluster for local/CI use cases.