Allow a shared, protected runner for the data-engineering group in GitLab
Closed, Declined
Public

Assigned To

Authored By

	BTullis
	Nov 4 2021, 2:44 PM

Description

We would like to be able to create a new GitLab runner that would only be available for the Data-Engineering team's projects: https://gitlab.wikimedia.org/data-engineering

I understand that at the moment it would take someone with Owner level access to the group to be able to access the runner settings and therefore register a new runner.
https://gitlab.wikimedia.org/groups/data-engineering/-/group_members
Would you be happy to grant me (as an SRE within the team) ownership of that group, so that I could perform this operation as a self-service task?
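
As a rough illustration of the Owner-level requirement described above, here is a minimal Python sketch that picks out the owners from a group member list. It assumes the list has the shape returned by GitLab's group members REST endpoint (GET /api/v4/groups/:id/members), where each entry carries a `username` and a numeric `access_level`, and that 50 is the Owner level. The function name and the sample data are hypothetical and do not reflect the real membership of the data-engineering group.

```python
# GitLab's documented numeric access level for the Owner role.
OWNER_ACCESS_LEVEL = 50


def group_owners(members):
    """Return the usernames of members holding Owner access.

    `members` is assumed to be the JSON-decoded list returned by
    GET /api/v4/groups/:id/members.
    """
    return [
        m["username"]
        for m in members
        if m.get("access_level", 0) >= OWNER_ACCESS_LEVEL
    ]


# Hypothetical sample payload, not real membership data.
sample = [
    {"username": "alice", "access_level": 50},  # Owner
    {"username": "bob", "access_level": 30},    # Developer
]
print(group_owners(sample))
```

Keeping the check as a pure function over already-fetched data means it can be exercised without network access or an API token.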

It might be helpful if the runner could use the docker executor, but if that isn't feasible then we would also be happy to use a runner with the shell executor.

The first project for which we would like to use this runner is data-engineering/airflow-dags.

This runner was first discussed here: T286958#7450771

Some of the tasks that this runner should be capable of achieving are:

accessing our current Airflow instances, which are in the Analytics VLAN.
accessing HDFS, which requires the use of Kerberos
uploading artefacts (Jars and python wheels) to Archiva

Therefore we would like to site a runner within the analytics VLAN, provide it with Kerberos keytabs as required, and begin to use this for both CI and automated code deployments.

There is a possibility of creating a 'test-analytics' runner first, which only has access to the test-hadoop cluster, before promoting this configuration to production.

Many thanks.

Related Objects

Mentioned In: T321736: Modify conda-analytics CI pipeline to use a custom gitlab runner that can run docker
T292094: Limit GitLab shared runners to trusted contributors
T286958: Document long-term requirements for GitLab job runners
Mentioned Here: T321736: Modify conda-analytics CI pipeline to use a custom gitlab runner that can run docker
T317210: Create a new airflow package for version 2.3.2
T292094: Limit GitLab shared runners to trusted contributors
T295481: Setup GitLab Runner in trusted environment
T286958: Document long-term requirements for GitLab job runners

Event Timeline

BTullis created this task. Nov 4 2021, 2:44 PM

Restricted Application added a subscriber: Aklapper. Nov 4 2021, 2:44 PM

BTullis updated the task description. Nov 4 2021, 2:56 PM

BTullis mentioned this in T286958: Document long-term requirements for GitLab job runners.

mforns subscribed. Nov 4 2021, 4:07 PM

BTullis added a project: Data-Engineering-Kanban. Nov 4 2021, 4:13 PM

BTullis added a project: Data Pipelines.

BTullis mentioned this in T292094: Limit GitLab shared runners to trusted contributors. Nov 4 2021, 4:20 PM

mforns moved this task from Backlog to Estimated on the Data Pipelines board. Nov 5 2021, 8:16 PM

mforns triaged this task as Medium priority. Nov 5 2021, 9:48 PM

odimitrijevic assigned this task to BTullis. Nov 8 2021, 5:17 PM

Currently there are only Runners in WMCS. We first have to adapt this setup to also have a set of Runners in a trusted environment outside of WMCS. The task T295481 is used to create such runners. Once we have secure Runners in place, we can use this to create additional, special-purpose runners (like this one for data-engineering).

So this task is a little bit blocked until we have created such secure Runners. But I think some things could happen in parallel here, like figuring out the networking requirements with networking folks.

It might be helpful if the runner could use the docker executor, but if that isn't feasible then we would also be happy to use a runner with the shell executor.

From a security perspective, shell executors should not be used, as they don't offer any real separation between jobs or from the Runner. So we should use at least the Docker executor (see).

Thanks for the reply @Jelto,

Perhaps I or one of my team could help with the setup of this particular secure runner, given that it is quite a specialized use case and that it's not going to be used for anything to do with MediaWiki deployments, or anything like that.

I note @brennen's comment here: T292094#7493614

For very specialized runners, I do suspect a bring-your-own approach is best.

So I'm more than happy to work with you on this specialized requirement, if that's feasible.

From a security perspective, shell executors should not be used, as they don't offer any real separation between jobs or from the Runner. So we should use at least the Docker executor (see).

The docker executor is fine with us too; it's the most convenient and flexible solution by far. Equally, we'd be happy to manage the security issues around the shell executor, given that we would only be using it for trusted builds.

What's the best thing that I can do to help keep this from being blocked for too long? Should I write a document with a proposal for a data engineering runner setup that your team can review?

brennen moved this task from Inbox to CI & Job Runners on the GitLab board. Nov 18 2021, 5:24 PM

brennen edited projects, added GitLab (CI & Job Runners); removed GitLab.

gmodena subscribed. Nov 19 2021, 10:02 AM

gmodena unsubscribed.

gmodena subscribed.

Pausing this task, since we are not currently working on it. I think that it would still be useful to have a meeting with the release engineering team to discuss whether or not this is a desired path forward.

EChetty moved this task from Estimated to Backlog on the Data Pipelines board. Apr 11 2022, 3:47 PM

Declining this task as we have no time to work on it at the moment.

Reopening the task since it is still something that would be of value to the data-engineering team.

To illustrate, we currently have a requirement to build an airflow .deb package (T317210) and @Antoine_Quhen has developed a GitLab-CI pipeline that is defined in the same repo as our Airflow DAGs.

Unfortunately, we cannot yet make use of this pipeline because we can't execute docker within the context of the shared or trusted runners.

As a workaround, we have copied all of the steps from the Dockerfile into the .gitlab-ci.yml file in order to create a second pipeline.
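
A minimal sketch of what that kind of workaround amounts to, assuming a simple Dockerfile where only the RUN instructions matter: the function below collects the shell commands from RUN lines (joining backslash continuations) so they could be listed as CI script steps. The function name, the base image, and the sample Dockerfile are hypothetical and are not taken from the airflow-dags repository.

```python
def run_steps(dockerfile_text):
    """Collect shell commands from RUN instructions, joining backslash continuations."""
    steps = []
    pending = ""
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # A trailing backslash means the instruction continues on the next line.
        if line.endswith("\\"):
            pending += line[:-1].rstrip() + " "
            continue
        full = (pending + line).strip()
        pending = ""
        if full.upper().startswith("RUN "):
            steps.append(full[4:].strip())
    return steps


# Hypothetical Dockerfile; a raw string keeps the trailing backslash intact.
sample_dockerfile = r"""
FROM debian:bullseye
RUN apt-get update && \
    apt-get install -y python3
COPY . /src
RUN make -C /src
"""

for step in run_steps(sample_dockerfile):
    print("- " + step)
```

Instructions such as COPY, ENV, or WORKDIR have no direct shell equivalent in this sketch and would still need translating by hand, which is part of why duplicating the Dockerfile into the pipeline is only a stopgap.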

So we were wondering whether it would be feasible for us to bring our own runner and register it with either the data-engineering group or the data-engineering/airflow-dags project, in order to allow us to use this feature.

Could we possibly discuss some of the deployment scenarios that might permit this please? I have a few questions to start with... Perhaps @Jelto would be well placed to advise us?

Is it right to say that we still can't use privileged mode for gitlab-runners in production, for security reasons?
Is the use of podman an option for us?
What about kaniko or buildah? Has any research been carried out into the use of these tools?

We're happy to deploy a machine to WMCS if that's the only way. However, it might limit some of the things we could use this runner for in future if it can't be made to run in the production realm, so I'd be keen to explore all options.

Would you be happy to grant me (as an SRE within the team) ownership of that group, so that I could perform this operation as a self-service task?

Confirming here you're an owner of the data-engineering group in gitlab

Ottomata subscribed. Oct 20 2022, 12:53 PM

In T295045#8332237, @thcipriani wrote:

Would you be happy to grant me (as an SRE within the team) ownership of that group, so that I could perform this operation as a self-service task?

Confirming here you're an owner of the data-engineering group in gitlab

Thanks for that @thcipriani - Yes, I have the necessary rights now.
When I originally wrote this ticket, the data-engineering group hadn't yet been moved under the /repos top-level group and I didn't have the privileges at the time. I'll update the description to reflect the changes since then.

EChetty moved this task from Backlog to To be discussed / To be estimated on the Data Pipelines board. Oct 20 2022, 4:13 PM

xcollazo subscribed. Oct 20 2022, 4:16 PM

EChetty moved this task from To be discussed / To be estimated to Discussed (Radar) on the Data Pipelines board. Oct 20 2022, 4:16 PM

xcollazo mentioned this in T321736: Modify conda-analytics CI pipeline to use a custom gitlab runner that can run docker. Oct 26 2022, 8:43 PM

Just passing by to +1 this idea.

@BTullis mentioned above the airflow-dags project, but it would also immediately benefit the conda-analytics project, as we are effectively doing the same workaround of copying the Dockerfile into gitlab CI steps. See T321736 for details.

Antoine_Quhen awarded a token. Oct 27 2022, 9:22 AM

JArguello-WMF edited projects, added Data Engineering and Event Platform Team; removed Data Pipelines. Jun 30 2023, 5:43 PM

JArguello-WMF moved this task from Data Eng Backlog to Radar (External Teams) on the Data Engineering and Event Platform Team board.

This is no longer necessary, due to the work undertaken by the Release-Engineering-Team on GitLab.
It is possible to gain access to trusted runners on a per-project basis.
