Flink application and flink-kubernetes-operator production docker images
Closed, ResolvedPublic

Description

As an event-stream developer I want to have access to a base flink image to use with k8s.

https://gerrit.wikimedia.org/g/wikidata/query/flink-rdf-streaming-updater was created with "Application Mode" deployment in mind, but we then switched to a "Session Cluster" deployment approach. As a result, this image no longer references any job-specific information and is thus appropriate to use as a reusable image.

We should probably rename it and/or move it to a place where it is more obvious that it can be re-used across different projects.

Event Timeline

Change 858356 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/docker-images/production-images@master] WIP flink image

https://gerrit.wikimedia.org/r/858356

Writing down some ideas and thoughts from today's talk with @gmodena:

  • We will only support Application Mode Deployments, no session clusters
  • We will only support running in k8s, not regular docker / docker-compose. This allows us to keep the flink image entrypoint simpler. The upstream one mangles the flink-conf.yaml file with some defaults and settings from the FLINK_PROPERTIES env var. We would prefer not to mess with this file in the image directly, and instead provide it via the usual k8s configmap and helm templates.
  • Application images that are actually deployed will be built FROM this base Flink image using Deployment Pipeline. Since we are only going to support Application Mode, the base Flink image will not be useful on its own.
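
For illustration only (this is not taken from the patches on this task): a minimal Python sketch of the "provide flink-conf.yaml via a configmap" idea, where the config file is rendered from plain settings and carried in a ConfigMap instead of being rewritten by the image entrypoint. The ConfigMap name and the setting values are hypothetical; in practice a helm template would produce the equivalent manifest.

```python
import json


def render_flink_conf(settings: dict) -> str:
    """Render flat Flink settings as 'key: value' lines, sorted for stable output."""
    return "".join(f"{key}: {value}\n" for key, value in sorted(settings.items()))


def flink_configmap(name: str, settings: dict) -> dict:
    """Build a Kubernetes ConfigMap manifest that carries flink-conf.yaml."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": {"flink-conf.yaml": render_flink_conf(settings)},
    }


# Hypothetical name and values, chosen only to show the shape of the manifest.
example = flink_configmap(
    "flink-app-example-config",
    {"taskmanager.numberOfTaskSlots": 2, "parallelism.default": 1},
)
print(json.dumps(example, indent=2))
```

The point of the design is that the image stays generic: everything job- or cluster-specific lives in the mounted file, not in entrypoint logic.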

Status update!

flink and flink-kubernetes-operator images are ready for review.

I've made some changes to upstream's Dockerfiles and entrypoints for these. Notably:

  • The flink-kubernetes-operator webhook is not supported. IIUC, we don't use a webhook like this in production, but instead use our own mechanism to provide TLS stuff?
    • Because of this, we will need to either always set webhook.create: false in our flink-operator helm values if we are using the upstream helm chart, OR, if/when we have our own version, just remove all the webhook bits, cc @bking for T321491 (should we make a specific task for the flink helm bits?)
  • Removed unneeded docker-entrypoint.sh logic for the flink image. If we are only supporting running in k8s, upstream's flink-docker docker-entrypoint.sh is not useful.

There are still a few TODOs from me in the code, mostly around figuring out exactly what flink plugins and other dependencies to include in this default image.

  • The flink-kubernetes-operator webhook is not supported. IIUC, we don't use a webhook like this in production, but instead use our own mechanism to provide TLS stuff?
    • Because of this, we will need to either always set webhook.create: false in our flink-operator helm values if we are using the upstream helm chart, OR, if/when we have our own version, just remove all the webhook bits, cc @bking for T321491 (should we make a specific task for the flink helm bits?)

I might be missing bits, but I don't think the webhook has something to do with TLS (apart from the fact that it needs to present a certificate that the apiserver can trust). Webhooks are usually around for additional validation (or mutation) of K8s objects created by computers or humans. Although I'm not really sure what it does in the case of flink, it gets registered for flinkdeployments and flinksessionjobs (https://github.com/apache/flink-kubernetes-operator/blob/main/helm/flink-kubernetes-operator/templates/webhook.yaml). I'm not sure if the cert-manager Certificate/Issuer stuff in that file would just work in clusters with cert-manager + cfssl-issuer enabled, but it absolutely might. The cert-manager.io/inject-ca-from annotation on the webhooks tells the cert-manager cainjector to populate the CA from the given secret as caBundle in the *WebhookConfiguration, which is where the Kubernetes API loads it from to verify the connection to the webhook.
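
To make the "additional validation" role concrete, here is a minimal Python sketch of what a validating admission webhook does with an admission.k8s.io/v1 AdmissionReview: it inspects the submitted object and answers allowed or denied. The rule itself (requiring spec.image) is a hypothetical stand-in and is not claimed to be what the flink operator's webhook checks.

```python
def validate(review: dict) -> dict:
    """Answer an admission.k8s.io/v1 AdmissionReview with allow or deny."""
    request = review["request"]
    spec = request["object"].get("spec", {})

    # Hypothetical rule for illustration: the object must name an image.
    allowed = bool(spec.get("image"))

    # The response must echo the request uid so the apiserver can match it.
    response = {"uid": request["uid"], "allowed": allowed}
    if not allowed:
        response["status"] = {"message": "spec.image must be set"}

    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

The apiserver calls such an endpoint over HTTPS, which is the only reason a trusted certificate (the caBundle mentioned above) comes into play.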

I don't think the webhook has something to do with TLS

Ah okay, I am super green here and don't have much experience writing helm outside of doing so for services in our deployment-charts. Ben explained a bunch in IRC to me too. He said I could paste his messages here.

@BTullis wrote:

Re the webhook, longer term I'm definitely in favour of using it for the spark-operator. It's optional, but offers some functionality that would be really useful. Namely things like https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/user-guide.md#requesting-gpu-resources and https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/user-guide.md#mounting-a-configmap-storing-hadoop-configuration-files

However, I'm removing it for the moment because of the way support for it was implemented in the helm chart by the upstream project. The best explanation I've written for why is in this commit: https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/864770

In basic terms, it creates a keypair and an auth token in this hacky script: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/hack/gencerts.sh
It sends these to the K8S API. I was going to try to get this working while saying "I'll probably improve this later..." but in the end I decided that it would be better to switch it off for now and use the puppet secret mechanism and cert-manager or whatever for doing the TLS, when I get around to it.

For the flink webhook, it's a bit less clear to me what the benefits of it would actually be. We can see from the helm chart a little bit about what the functionality would be: https://github.com/apache/flink-kubernetes-operator/blob/main/helm/flink-kubernetes-operator/templates/webhook.yaml#L94-L105
So it can validate any create or update operation on flinkdeployments and flinksessionjobs. As I understand it, this is like an extra layer of access control, determining whether or not this object can be created or updated.
I suspect that in this case we can do without this additional level of access control, given that we're in a pretty well controlled environment.
It can also *mutate* any create operation on a flinksessionjob: https://github.com/apache/flink-kubernetes-operator/blob/main/helm/flink-kubernetes-operator/templates/webhook.yaml#L94-L105
It's not clear to me from the docs what this mutation might do, but like the spark operator it's something about adding annotations. Here's a PR showing that it can be used to add labels to sessionjobs: https://github.com/apache/flink-kubernetes-operator/pull/265

So in short, I think that you can probably get away without it for now :-)
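
As a sketch of the mutation case described above (adding labels on create), a mutating admission webhook replies to the AdmissionReview with a base64-encoded JSON Patch. The Python below is an illustration under assumptions, not the operator's implementation; the label key and value passed in are hypothetical.

```python
import base64
import json


def mutate(review: dict, label_key: str, label_value: str) -> dict:
    """Allow the request and attach a JSON Patch that adds one label."""
    request = review["request"]
    labels = request["object"].get("metadata", {}).get("labels")

    if labels is None:
        # No labels map yet: create it with the single new label.
        patch = [{"op": "add", "path": "/metadata/labels",
                  "value": {label_key: label_value}}]
    else:
        # JSON Pointer escaping: '~' becomes '~0' first, then '/' becomes '~1'.
        escaped = label_key.replace("~", "~0").replace("/", "~1")
        patch = [{"op": "add", "path": f"/metadata/labels/{escaped}",
                  "value": label_value}]

    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": request["uid"],
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": base64.b64encode(json.dumps(patch).encode()).decode(),
        },
    }
```

Without the webhook, the same labels would simply have to be set in the submitted manifests, which is easy to do from helm templates.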

Ottomata renamed this task from Create a shared flink docker image to Flink application and flink-kubernetes-operator production docker images. Dec 6 2022, 3:24 PM

Change 876249 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/deployment-charts@master] Update flink-kubernetes-operator chart with upstream changes for 1.3.0

https://gerrit.wikimedia.org/r/876249

Change 858356 merged by Ottomata:

[operations/docker-images/production-images@master] flink and flink-kubernetes-operator image

https://gerrit.wikimedia.org/r/858356

Change 876249 merged by jenkins-bot:

[operations/deployment-charts@master] Update flink-kubernetes-operator chart with upstream changes for 1.3.0

https://gerrit.wikimedia.org/r/876249

Change 877193 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/puppet@production] Add flink to profile::docker::builder::known_uid_mappings

https://gerrit.wikimedia.org/r/877193

Change 877193 merged by Ottomata:

[operations/puppet@production] Add flink to profile::docker::builder::known_uid_mappings

https://gerrit.wikimedia.org/r/877193

Change 877230 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/docker-images/production-images@master] flink-kubernetes-operator - use explicit mvn proxy settings instead of java.net.useSystemProxies

https://gerrit.wikimedia.org/r/877230

Change 877230 merged by Ottomata:

[operations/docker-images/production-images@master] flink-kubernetes-operator - use explicit mvn proxy settings instead of java.net.useSystemProxies

https://gerrit.wikimedia.org/r/877230

Change 877237 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/docker-images/production-images@master] flink-kubernetes-operator - fix command that sets MVN_HTTP(S)_PROXY_OPTION

https://gerrit.wikimedia.org/r/877237

Change 877237 merged by Ottomata:

[operations/docker-images/production-images@master] flink-kubernetes-operator - fix command that sets MVN_HTTP(S)_PROXY_OPTION

https://gerrit.wikimedia.org/r/877237

Change 877241 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/docker-images/production-images@master] flink-kubernetes-operator - add -Dmaven.antrun.skip=true to mvn package

https://gerrit.wikimedia.org/r/877241

Change 877241 merged by Ottomata:

[operations/docker-images/production-images@master] flink-kubernetes-operator - add -Dmaven.antrun.skip=true to mvn package

https://gerrit.wikimedia.org/r/877241

Change 878178 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/docker-images/production-images@master] flink - include examples in image

https://gerrit.wikimedia.org/r/878178

Change 878178 merged by Ottomata:

[operations/docker-images/production-images@master] flink - include examples in image

https://gerrit.wikimedia.org/r/878178

Change 879050 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/docker-images/production-images@master] flink - Add examples/wikimedia with simple table datagen -> print pipeline

https://gerrit.wikimedia.org/r/879050

Change 879050 merged by Ottomata:

[operations/docker-images/production-images@master] flink - Add examples/wikimedia with simple table datagen -> print pipeline

https://gerrit.wikimedia.org/r/879050

Hm, I am confused by a production-images vs. blubber user thing.

In operations/docker-images/production-images, we have a known_uid_mappings (also in puppet), which I assumed would define the run user for the container in prod.

However, blubber seems to use 'somebody' as the build user and file owner, and 'runuser' as the USER the container runs processes as.

Should we change this? Should we set the runs.as to something different when building images based off of the production-images flink image with blubber?

Change 881011 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/docker-images/production-images@master] flink 1.16.0-wmf3

https://gerrit.wikimedia.org/r/881011

Change 881011 merged by Ottomata:

[operations/docker-images/production-images@master] flink 1.16.0-wmf3

https://gerrit.wikimedia.org/r/881011

Should we change this? Should we set the runs.as to something different when building images based off of the production-images flink image with blubber?

I think this is up to you all, and I don't know enough about flink to say. In general, if the effective runtime user needs access to things that are only user or group readable or writable by a different user that's already provided by the base image, that would be a case where overwriting runs.as, runs.uid and runs.gid would make sense. If that's not the case, I would just go with the default behavior which is the most restrictive in terms of effective runtime permissions of files/directories within the container.
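
The decision rule above boils down to a classic Unix permission check: does the effective runtime user get write access to a path owned by the base image's user? A minimal Python sketch, with hypothetical uid/gid numbers and a hypothetical log directory mode (it deliberately ignores root and ACLs):

```python
import stat


def can_write(mode: int, owner_uid: int, owner_gid: int, uid: int, gids: set) -> bool:
    """Owner bits apply to the owner, else group bits to members, else other bits."""
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# Hypothetical numbers: a log directory owned by a 'flink' user with mode 0755.
FLINK_UID, FLINK_GID = 1000, 1000
RUNTIME_UID, RUNTIME_GID = 2000, 2000
LOG_DIR_MODE = 0o755

print(can_write(LOG_DIR_MODE, FLINK_UID, FLINK_GID, FLINK_UID, {FLINK_GID}))
print(can_write(LOG_DIR_MODE, FLINK_UID, FLINK_GID, RUNTIME_UID, {RUNTIME_GID}))
```

If the second check fails for a path the process must write to, that is the case where overriding runs.as, runs.uid and runs.gid (or loosening the directory's group permissions) would be needed.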

If that's not the case, I would just go with the default behavior which is the most restrictive in terms of effective runtime permissions of files/directories within the container.

@gmodena maybe we should make the build stages use runs.as flink, but keep the production (and test?) run stages as the default runuser? I guess the problem is the log/ directory. Hm.

Change 883660 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/deployment-charts@master] flink-app-example - set upgradeMode: stateless

https://gerrit.wikimedia.org/r/883660

Change 883660 merged by Ottomata:

[operations/deployment-charts@master] flink-app-example - set upgradeMode: stateless

https://gerrit.wikimedia.org/r/883660

FYI, in order to make pyflink work with this image as well, we changed our installation method to pip install apache-flink, instead of downloading a Flink distro tarball. See T327494: Flink docker image should work with pyflink for more info.
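
For reference, a minimal PyFlink sketch of a table datagen -> print pipeline, similar in spirit to the example pipeline mentioned earlier in this task. It assumes apache-flink is installed via pip; the table names, columns and option values are hypothetical and are not copied from the examples shipped in the image.

```python
# Bounded source: the datagen connector emits random rows, then stops.
SOURCE_DDL = """
CREATE TABLE datagen_source (
    id BIGINT,
    message STRING
) WITH (
    'connector' = 'datagen',
    'rows-per-second' = '1',
    'number-of-rows' = '10'
)
"""

# Sink: the print connector writes each row to the task's stdout.
SINK_DDL = """
CREATE TABLE print_sink (
    id BIGINT,
    message STRING
) WITH (
    'connector' = 'print'
)
"""

INSERT_SQL = "INSERT INTO print_sink SELECT id, message FROM datagen_source"


def run_pipeline() -> None:
    """Create both tables and run the insert job until the source is exhausted."""
    # Imported lazily so the statements above can be inspected without PyFlink.
    from pyflink.table import EnvironmentSettings, TableEnvironment

    t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())
    t_env.execute_sql(SOURCE_DDL)
    t_env.execute_sql(SINK_DDL)
    # wait() blocks until the bounded job finishes.
    t_env.execute_sql(INSERT_SQL).wait()
```

Calling run_pipeline() in an environment with PyFlink installed should start a local job that prints the generated rows.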