
Create namespaces and kubernetes users for spark-operator and for spark jobs
Closed, ResolvedPublic

Description

As part of the testing process for running Spark jobs on Kubernetes, we need to be able to deploy a spark-operator into its own namespace.

Once the operator is running it is configured to monitor one or more namespaces for SparkApplication requests.
This can also be omitted, so that it watches all namespaces.
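For context, the namespace-watching behaviour is a chart-level setting. A hypothetical values fragment for the upstream spark-operator Helm chart (the exact key name varies between chart versions, and the names below are illustrative only):

```yaml
# Hypothetical values fragment for the spark-operator Helm chart.
# The key name (sparkJobNamespace vs. sparkJobNamespaces) depends on the
# chart version; an empty value makes the operator watch all namespaces.
sparkJobNamespace: spark
serviceAccounts:
  sparkoperator:
    create: true
    name: spark-operator
```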

At this point in the process, I believe that we should initially create:

  • a spark-operator namespace where we run the operator
  • a spark namespace where we run the driver
  • a spark-operator user
  • a spark user

The spark-operator will use the standard deployment-pipeline and be managed by SREs in the Data Engineering and ML teams using helmfile.

The spark jobs will be submitted by members of analytics-privatedata-users.

Event Timeline

@BTullis What about the spark-executors? Will they run in the same way as the spark driver?

Change 849558 had a related patch set uploaded (by Btullis; author: Btullis):

[labs/private@master] Add dummy deployment users and tokens for spark-operator and spark

https://gerrit.wikimedia.org/r/849558

@BTullis What about the spark-executors? Will they run in the same way as the spark driver?

@JAllemandou - Yes, I believe that they will run in the same way, although there is additional flexibility if we wish it.

Here is the test case that I have been running so far: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/examples/spark-pi.yaml
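Trimmed to its essentials, that spark-pi example looks roughly like this (a sketch abbreviated from the upstream file, not a verbatim copy):

```yaml
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: default
spec:
  type: Scala
  mode: cluster
  image: gcr.io/spark-operator/spark:v3.1.1
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar
  sparkVersion: 3.1.1
  driver:
    cores: 1
    memory: 512m
    serviceAccount: spark   # this account needs RBAC to create executor pods
  executor:
    instances: 1
    cores: 1
    memory: 512m
```

The operator watches for SparkApplication objects and creates the driver pod, and the driver in turn launches the executor pods.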

The spark-operator has various configuration options about which namespaces and service accounts to use.
In the simple case that I've been running so far on minikube, the operator, driver, and executor pods have all been running in the same namespace.
You can see this test in action here, with one pod of each type.

(Screenshot: operator, driver, and executor pods running in minikube, one pod of each type.)

When moving this from minikube to the real kubernetes cluster, I would separate out the spark-operator, so that it runs in its own namespace and uses its own service account.

I don't currently see a need to run the driver and executors with different users or in different namespaces from each other though.
It seems that they're both effectively 'user-level' processes, so it makes more sense to me to group them together.

However, if we do decide to run them under different service accounts, I believe we can do that. See:
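For illustration, the v1beta2 SparkApplication spec carries a serviceAccount field in both the driver and executor pod specs, so a split would look something like this (the account names here are hypothetical):

```yaml
# Hypothetical fragment of a SparkApplication spec with separate
# service accounts per pod role; the account names are illustrative.
spec:
  driver:
    serviceAccount: spark-driver      # needs RBAC to create executor pods
  executor:
    serviceAccount: spark-executor    # typically needs no extra permissions
```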

The next stage in my test is to run the same example, but substitute the docker images for those that we have built ourselves. i.e.

  • gcr.io/spark-operator/spark:v3.1.1 -> docker-registry.wikimedia.org/spark:3.3.0
  • ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.7-3.1.1 -> v1beta2-1.3.7-3.1.1

I am aware of the discrepancy between the 3.1.1 and 3.3.0 version numbers.
So far I've had trouble building a Spark 3.1.1 distribution against Hadoop 2.10.2, whereas 3.3.0 is working. However, I'm watching this closely. https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/1559
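In terms of the manifest, the image swap would amount to changing the image-related fields; a sketch (the examples jar path inside the WMF-built image is an assumption):

```yaml
# Hypothetical edit to the spark-pi manifest for WMF-built images.
spec:
  image: docker-registry.wikimedia.org/spark:3.3.0
  sparkVersion: 3.3.0
  # Assumed path; depends on how the WMF image lays out the examples jar.
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.3.0.jar
```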

Change 849558 merged by Btullis:

[labs/private@master] Add dummy deployment users and tokens for spark-operator and spark

https://gerrit.wikimedia.org/r/849558

Thanks for the details @BTullis :)
I expect no big difference in running a spark-pi example with 3.3 or 3.1.
The Kubernetes integration, on the other hand, could be something that has changed between those versions.

I have now created the tokens in the private puppet repository to match those in the labs/private repository.

I'll now make a CR to add the namespaces and follow it up with the changes to configure the kubectl environment files.

Change 854498 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/deployment-charts@master] Add namespaces for spark and spark-operator

https://gerrit.wikimedia.org/r/854498

Change 854505 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Configure the kube_env file for the spark-operator namespace

https://gerrit.wikimedia.org/r/854505

Change 854498 merged by jenkins-bot:

[operations/deployment-charts@master] Add namespaces for spark and spark-operator

https://gerrit.wikimedia.org/r/854498

Change 854505 merged by Btullis:

[operations/puppet@production] Configure the kube_env file for the spark-operator namespace

https://gerrit.wikimedia.org/r/854505