Maniphest T320730

Define access to external resources for GitLab CI Runners
Closed, Resolved · Public

Authored By

	Jelto
	Oct 13 2022, 2:04 PM

Description

The question of which external resources should be allowed for GitLab CI Runners has come up in multiple tasks and discussions (T312961, T291978, T295481). "External resources" is quite a generic term that covers multiple areas, which are split up in the following sections. It could make sense to spin off sub-tasks for individual resources.

General access to public resources (egress traffic)

This is about what outgoing traffic is allowed for Runners. Should Runners be allowed to access internet resources?
The options are fully unrestricted egress, egress over the webproxy only, or no egress at all.
Currently Shared Runners in WMCS have mostly unrestricted access and Trusted Runners offer egress access over the webproxy.
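
One way to picture the three egress options is as per-job proxy settings. The sketch below is illustrative only: the `EgressMode` names and the proxy URL `http://webproxy.example:8080` are assumptions, not the actual Runner configuration. It relies on the common convention that many command-line tools honor the `http_proxy`/`https_proxy` environment variables.

```python
from enum import Enum


class EgressMode(Enum):
    UNRESTRICTED = "unrestricted"  # direct access to the internet
    WEBPROXY = "webproxy"          # HTTP(S) only, via a forward proxy
    DISABLED = "disabled"          # no outgoing traffic at all


# Hypothetical proxy address, used only for illustration.
WEBPROXY_URL = "http://webproxy.example:8080"


def job_environment(mode: EgressMode) -> dict[str, str]:
    """Return the proxy-related environment variables for a CI job."""
    if mode is EgressMode.WEBPROXY:
        return {"http_proxy": WEBPROXY_URL, "https_proxy": WEBPROXY_URL}
    # Unrestricted jobs need no proxy settings. Disabled egress cannot be
    # expressed through environment variables; it needs a firewall rule.
    return {}
```

Note that environment variables are only a convenience for well-behaved tools. Actually enforcing the webproxy-only or no-egress options would still require firewall rules on the Runner hosts.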

Public package repositories

CI builds sometimes need additional packages, either for performing CI tasks or for building the artifact. So should Runners be allowed to use common package registries for CI jobs, like pip or npm? Some sources are present/mirrored in WMF infrastructure (like apt repo), some aren't.
Currently all Runners can install packages from public repositories (if available over http/https using the webproxy).

Docker images for CI purposes

Certain CI jobs use pre-built images to perform common tasks like linting, testing or code scans. Should Runners be allowed to run external images for the purpose of certain CI jobs? Please note this is not about base images (see the next section); it is only about which images can be executed during CI jobs to perform certain tasks.

Currently we restrict what images can be executed. The current list contains:

allowed_images = [
  # Everything in Wikimedia registry:
  "docker-registry.wikimedia.org/**/*",
  "docker-registry.discovery.wmnet/**/*",

  # Distributions:
  "centos/*:*",
  "debian:*",
  "fedora:*",
  "opensuse/*:*",
  "ubuntu:*",

  # Language-specific:
  "python:*",
  "ruby:*",
  "rust:*",
  "rustlang/rust:nightly",

  # GitLab upstream - includes security analyzers and terraform images:
  "registry.gitlab.com/gitlab-org/**/*",
]

See config.toml.
This list is used for both Shared and Trusted Runners. The discussion in T312961 about adding further security scanners prompted the broader discussion and this task.
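
To get a feel for which image references such a list admits, here is a rough Python approximation of the wildcard matching. It assumes doublestar-style semantics (`*` stays within one path segment, `**/` spans zero or more segments); the real matching is done by GitLab Runner itself and may differ in edge cases. The `glob_to_regex` and `is_allowed` helpers, and the example image names in the usage note, are made up for illustration.

```python
import re

# Patterns copied from the allowed_images list above.
ALLOWED_IMAGES = [
    "docker-registry.wikimedia.org/**/*",
    "docker-registry.discovery.wmnet/**/*",
    "centos/*:*",
    "debian:*",
    "fedora:*",
    "opensuse/*:*",
    "ubuntu:*",
    "python:*",
    "ruby:*",
    "rust:*",
    "rustlang/rust:nightly",
    "registry.gitlab.com/gitlab-org/**/*",
]


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a doublestar-like glob into a regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")   # zero or more whole path segments
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")         # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")      # anything within one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def is_allowed(image: str, patterns=ALLOWED_IMAGES) -> bool:
    """Return True if the image reference matches any allowed pattern."""
    return any(glob_to_regex(p).fullmatch(image) for p in patterns)
```

Under these assumptions, `is_allowed("debian:bullseye")` and `is_allowed("registry.gitlab.com/gitlab-org/release-cli:latest")` are true, while `is_allowed("golang:1.19")` and `is_allowed("registry.gitlab.com/security-products/semgrep:latest")` are false, which is the gap that T320825 and T312961 ask about.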

Docker base images for building images

Which base images are allowed for building images for the wmf/production registry? In other words, which sources should be allowed for directly building artifacts that run in production (the base field in Blubber or the FROM instruction in a Dockerfile)?
Other open questions regarding base images:
Is it possible to restrict these base images in buildkitd?
Somewhat related docs: https://wikitech.wikimedia.org/wiki/Kubernetes/Images

Difference between Shared and Trusted Runners

Furthermore, some of these resources may differ between the tiers of Runners. Shared Runners could theoretically execute a wider range of images or build non-production images with a wider range of base images.
Currently Shared and Trusted Runners have the same access to external resources and Docker images, apart from the webproxy. It should be discussed whether this is reasonable for the future or whether different allow-lists and policies are needed.

Details

	Subject	Repo	Branch	Lines +/-
	gitlab_runner: block dockerhub on Trusted Runners	operations/puppet	production	+18 -0
	gitlab_runner: make allowed_images list configurable in hiera	operations/puppet	production	+43 -21

Related Objects

Mentioned In: T333161: self-build/import registry.gitlab.com/gitlab-org/release-cli image
T321316: Self-build and publish buildkit helper images
T295481: Setup GitLab Runner in trusted environment
T320825: Consider adding "official" golang images to list of allowed images for gitlab runners
T312961: Allow registry.gitlab.com/security-products/**/* for gitlab shared runner docker images
Mentioned Here: T333161: self-build/import registry.gitlab.com/gitlab-org/release-cli image
T320825: Consider adding "official" golang images to list of allowed images for gitlab runners
T291978: Limit GitLab shared runners to images from Wikimedia Docker registry
T295481: Setup GitLab Runner in trusted environment
T312961: Allow registry.gitlab.com/security-products/**/* for gitlab shared runner docker images

Event Timeline

Jelto created this task. Oct 13 2022, 2:04 PM

Restricted Application added a subscriber: Aklapper. Oct 13 2022, 2:04 PM

Jelto mentioned this in T312961: Allow registry.gitlab.com/security-products/**/* for gitlab shared runner docker images. Oct 13 2022, 2:14 PM

thcipriani subscribed. Oct 13 2022, 2:27 PM

• ACCDCCDB closed this task as a duplicate of T263780: Move feeds from behind RESTBase to the API Gateway. Oct 13 2022, 2:44 PM

thcipriani reopened this task as Open. Oct 13 2022, 2:45 PM

sbassett subscribed. Oct 13 2022, 2:52 PM

brennen mentioned this in T320825: Consider adding "official" golang images to list of allowed images for gitlab runners. Oct 14 2022, 5:17 PM

Jelto mentioned this in T295481: Setup GitLab Runner in trusted environment. Oct 17 2022, 10:00 AM

LSobanski moved this task from Incoming to Backlog on the collaboration-services board. Oct 18 2022, 3:55 PM

Follow up from the meeting re:

Images that BuildKit uses for internal operations

Our Blubber BuildKit frontend currently uses BuildKit's dockerfile2llb package to convert its Dockerfile output to LLB instructions. This is a current dependency, but one we're planning to refactor out once Blubber becomes exclusively a BuildKit interface and not a blubberfile-to-dockerfile transpiler. Looking at the implementation of dockerfile2llb, I can see just one internal image:

docker/dockerfile-copy:v0.1.9@sha256:e8f159d3f00786604b93c675ee2783f8dc194bb565e61ca5788f6a6e9d304061

This image, which is used to dispatch copy operations, is overridable via the OverrideCopyImage field of dockerfile2llb.ConvertOpt, meaning we could potentially vendor our own version of this image and enforce its use.

However, I think that may be overkill given:

The latest version of this image is 4 years old and it's very minimal.
The image reference uses a specific digest (the sha256:), and...
Container image layers are content addressable and verifiable. Docker verifies the digests when it pulls images, so use of a digest in a reference is as good as using sha256sum to verify whatever binaries it holds.
We're planning on removing the dockerfile2llb dependency in blubber so this isn't going to be a long-term concern.
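
The content-addressing argument in the points above can be sketched in a few lines of Python. This is a simplified illustration, not Docker's actual pull logic: in practice the digest in an image reference identifies the image manifest, which in turn lists the digests of the config and layer blobs. The `verify_digest` helper and the sample bytes are made up for the example.

```python
import hashlib
import hmac


def verify_digest(blob: bytes, reference_digest: str) -> bool:
    """Check that a blob hashes to the digest named in a reference.

    reference_digest has the form "sha256:<hex>", as in the part of an
    image reference that follows the "@".
    """
    algorithm, _, expected_hex = reference_digest.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algorithm!r}")
    actual_hex = hashlib.sha256(blob).hexdigest()
    # Constant-time comparison; a plain == would also work for a sketch.
    return hmac.compare_digest(actual_hex, expected_hex)


# Stand-in for the bytes of a manifest or layer (not real image content).
blob = b"example blob contents"
pinned = "sha256:" + hashlib.sha256(blob).hexdigest()

print(verify_digest(blob, pinned))               # content matches the pin
print(verify_digest(b"tampered bytes", pinned))  # any change breaks the match
```

The pin guarantees that the content cannot change silently, but it says nothing about whether the pinned content is still free of known vulnerabilities.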

Thoughts?

Current contents of docker/dockerfile-copy:v0.1.9@sha256:e8f159d3f00786604b93c675ee2783f8dc194bb565e61ca5788f6a6e9d304061:

$ docker save docker/dockerfile-copy:v0.1.9@sha256:e8f159d3f00786604b93c675ee2783f8dc194bb565e61ca5788f6a6e9d304061 | tar Oxf - */layer.tar | tar tf -
bin/
bin/gunzip
bin/gzip
bin/tar
dev/
dev/null
etc/
lib/
lib/ld-musl-x86_64.so.1
lib/libc.musl-x86_64.so.1
proc/
tmp/
usr/
usr/bin/
usr/bin/bunzip2
usr/bin/bzcat
usr/bin/bzcmp
usr/bin/bzdiff
usr/bin/bzegrep
usr/bin/bzfgrep
usr/bin/bzgrep
usr/bin/bzip2
usr/bin/bzip2recover
usr/bin/bzless
usr/bin/bzmore
usr/bin/gunzip
usr/bin/gzexe
usr/bin/gzip
usr/bin/lzcat
usr/bin/lzcmp
usr/bin/lzdiff
usr/bin/lzegrep
usr/bin/lzfgrep
usr/bin/lzgrep
usr/bin/lzless
usr/bin/lzma
usr/bin/lzmadec
usr/bin/lzmainfo
usr/bin/lzmore
usr/bin/tar
usr/bin/uncompress
usr/bin/unlzma
usr/bin/unxz
usr/bin/xz
usr/bin/xzcat
usr/bin/xzcmp
usr/bin/xzdec
usr/bin/xzdiff
usr/bin/xzegrep
usr/bin/xzfgrep
usr/bin/xzgrep
usr/bin/xzless
usr/bin/xzmore
usr/bin/zcat
usr/bin/zcmp
usr/bin/zdiff
usr/bin/zegrep
usr/bin/zfgrep
usr/bin/zforce
usr/bin/zgrep
usr/bin/zless
usr/bin/zmore
usr/bin/znew
usr/lib/
usr/lib/liblzma.so.5
usr/lib/liblzma.so.5.2.4
usr/libexec/
usr/libexec/rmt
var/

Change 844434 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/puppet@production] gitlab_runner: make allowed_images list configurable in hiera

https://gerrit.wikimedia.org/r/844434

gerritbot added a project: Patch-For-Review. Oct 19 2022, 8:10 AM

In T320730#8326445, @dduvall wrote:

Follow up from the meeting re:

Images that BuildKit uses for internal operations

Our Blubber BuildKit frontend currently uses BuildKit's dockerfile2llb package to convert its Dockerfile output to LLB instructions. This is a current dependency, but one we're planning to refactor out once Blubber becomes exclusively a BuildKit interface and not a blubberfile-to-dockerfile transpiler. Looking at the implementation of dockerfile2llb, I can see just one internal image:

docker/dockerfile-copy:v0.1.9@sha256:e8f159d3f00786604b93c675ee2783f8dc194bb565e61ca5788f6a6e9d304061

This image, which is used to dispatch copy operations, is overridable via the OverrideCopyImage field of dockerfile2llb.ConvertOpt, meaning we could potentially vendor our own version of this image and enforce its use.

However, I think that may be overkill given:

The latest version of this image is 4 years old and it's very minimal.

The image reference uses a specific digest (the sha256:), and...

Container image layers are content addressable and verifiable. Docker verifies the digests when it pulls images, so use of a digest in a reference is as good as using sha256sum to verify whatever binaries it holds.

We're planning on removing the dockerfile2llb dependency in blubber so this isn't going to be a long-term concern.

Thoughts?

Thanks Dan for the follow-up and for looking into the BuildKit internals. My suggestion would be to create a non-blocking subtask for self-building the docker/dockerfile-copy image. I don't think we strictly need that immediately, and we can continue with the sha-referenced public one. But if it's not too complicated, I'd like to try to build that single image ourselves and set the override. I have some concerns that docker/dockerfile-copy may not get a lot of attention from Docker Inc. if it has not been updated in 4 years. We may have to update this image for security patches in the future anyway.

I can also take a look at that if you like (once I find the actual code/repo for dockerfile-copy image).

Different `allowed_images` for Trusted and Shared Runners

Another follow-up from yesterday's discussion is to use two different allowed_images lists for Shared and Trusted Runners. The Trusted Runners should only be allowed to execute images we control, so that production builds do not depend on images and code provided by a third party. Shared Runners can also execute external images, like common Dockerhub images, popular language base images and images provided by gitlab-org. This should reduce friction for initial development and non-production projects and give us more flexibility in what jobs can be executed outside of the Trusted Build process (code scans, lints, ...).

In the change above, two separate allowed_images lists are implemented. Once this is merged, T312961 and T320825 should be unblocked.
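
As a rough illustration of the two-list idea, the sketch below models one allow-list per Runner tier and reports which tiers would accept a given image. The pattern lists are illustrative assumptions, not the contents of the merged hiera data, and `fnmatchcase` is only a coarse stand-in for the Runner's own wildcard matching (its `*` also matches slashes).

```python
from fnmatch import fnmatchcase

# Images we control: the Wikimedia registries.
WMF_REGISTRIES = [
    "docker-registry.wikimedia.org/*",
    "docker-registry.discovery.wmnet/*",
]

# Illustrative subset of external images for non-production jobs.
EXTERNAL_IMAGES = [
    "debian:*",
    "python:*",
    "registry.gitlab.com/gitlab-org/*",
]

# Hypothetical per-tier lists: Trusted Runners only run internal images,
# Shared Runners additionally accept the external ones.
ALLOWED_IMAGES_BY_TIER = {
    "trusted": WMF_REGISTRIES,
    "shared": WMF_REGISTRIES + EXTERNAL_IMAGES,
}


def tiers_allowing(image: str) -> list[str]:
    """Return the sorted names of the tiers whose list admits the image."""
    return sorted(
        tier
        for tier, patterns in ALLOWED_IMAGES_BY_TIER.items()
        if any(fnmatchcase(image, pattern) for pattern in patterns)
    )
```

With these example lists, an image from the Wikimedia registry is accepted on both tiers, `python:3.11` only on the Shared Runners, and an unlisted image such as `golang:1.19` on neither.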

Restrict base images for buildkit

We also agreed to keep the policy of allowing only wmf base images for production builds. So public images, such as those from Dockerhub, should be forbidden as base images in the Trusted Runner buildkit configuration. This means buildkit/blubber(?) needs the same policy we are using for the current production builds. @dduvall I guess you can explain that better :)

Thanks again all for all the input and the discussion!

Just my 2 cents. I think that image offers few attack vectors (I am glad it appears to be the only one) and its attack surface is small, so far.

That being said, the image uses musl as its libc implementation, and if it is 4 years old there are 1 DoS and 2 overflow CVEs to consider[1]. Another library, liblzma (shipped by xz-utils), has had one this year[2], which is probably applicable to that image as well (it goes back at least to stretch in Debian terms, which matches those 4 years).

[1] https://www.cvedetails.com/product/39652/Musl-libc-Musl.html?vendor_id=16859
[2] https://security-tracker.debian.org/tracker/CVE-2022-1271

So overall, while the attack surface might not be particularly large and it would probably take some effort to exploit the above, that attack surface will become larger over time. So, while I wouldn't prioritize this as High, I +1 Jelto's prudent suggestion to rebuild that image and ship it internally.

In T320730#8329379, @akosiaris wrote:

So overall, while the attack surface might not be particularly large and it would probably take some effort to exploit the above, that attack surface will become larger over time. So, while I wouldn't prioritize this as High, I +1 Jelto's prudent suggestion to rebuild that image and ship it internally.

Thank you for pointing out those CVEs. In light of that, I heartily +1 the approach as well: building our own version of dockerfile-copy. It should be simple to enforce in the Blubber buildkit frontend, and maybe we won't even need it after refactoring Blubber to use LLB directly.

Change 844434 merged by Jelto:

[operations/puppet@production] gitlab_runner: make allowed_images list configurable in hiera

https://gerrit.wikimedia.org/r/844434

Maintenance_bot removed a project: Patch-For-Review. Oct 20 2022, 10:30 AM

Jelto mentioned this in T321316: Self-build and publish buildkit helper images. Oct 20 2022, 3:01 PM

A short update about the current state of this task. After the last meeting some changes were deployed addressing egress traffic and allowed Docker images on Shared and Trusted Runners. So the following things are defined and implemented:

General access to public resources (egress traffic) was defined, Trusted and Shared Runners have egress firewall rules
Allowed Docker images for CI purposes are defined, Trusted and Shared Runners have individual allowed_images lists
Allowed Docker base images for building images were defined (only wmf registry base images)

What's still open:

Implement/research ways to implement allowed Docker base images in buildkitd
Define how to deal with public package repositories on Shared and Trusted Runners
Define these policies for Cloud Runners. Are they similar to those for Shared Runners, or do we offer different policies here?

Jelto moved this task from Backlog to Work in Progress on the collaboration-services board. Nov 22 2022, 3:40 PM

I think policy is now that Trusted runners only use internal images; would it be possible therefore to start making an internal copy of registry.gitlab.com/gitlab-org/release-cli, please? It's really useful if wanting to make gitlab releases from CI (which would be handy as and when I want to start making .debs from CI) - e.g. currently being used for https://gitlab.wikimedia.org/repos/data_persistence/wmf-beamer-style releases.

Jelto moved this task from Work in Progress to Backlog on the collaboration-services board. Dec 19 2022, 3:55 PM

In T320730#8420142, @MatthewVernon wrote:

I think policy is now that Trusted runners only use internal images; would it be possible therefore to start making an internal copy of registry.gitlab.com/gitlab-org/release-cli, please? It's really useful if wanting to make gitlab releases from CI (which would be handy as and when I want to start making .debs from CI) - e.g. currently being used for https://gitlab.wikimedia.org/repos/data_persistence/wmf-beamer-style releases.

Yes, that's right: Trusted Runners cannot run images from external registries. I opened T333161 to address this and import release-cli into our own registry.

LSobanski moved this task from Backlog to Work in Progress on the collaboration-services board. Oct 9 2023, 3:49 PM

Change 965157 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/puppet@production] gitlab_runner: block dockerhub on Trusted Runners

https://gerrit.wikimedia.org/r/965157

gerritbot added a project: Patch-For-Review. Oct 11 2023, 1:26 PM

Change 965157 merged by Jelto:

[operations/puppet@production] gitlab_runner: block dockerhub on Trusted Runners

https://gerrit.wikimedia.org/r/965157

Maintenance_bot removed a project: Patch-For-Review. Oct 12 2023, 7:30 AM

In T320730#8413614, @Jelto wrote:

What's still open:

Implement/research ways to implement allowed Docker base images in buildkitd

This is done in https://gerrit.wikimedia.org/r/965157. Unfortunately there is no clean way to restrict docker or buildkit to a private registry (see issue). So all docker traffic to Dockerhub is rejected by a firewall rule. If an image uses a Dockerhub base image on the Trusted Runners, buildkit will get a connect: connection refused error.
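
Since the failure only shows up at pull time, it can help to check up front whether a base image reference would resolve to Dockerhub. The sketch below follows the usual Docker reference convention: the first path component is treated as a registry host only if it contains a dot or a colon, or is `localhost`; otherwise the reference defaults to Dockerhub (`docker.io`). The `registry_host` and `uses_dockerhub` helpers are hypothetical and not part of any Runner or buildkit tooling.

```python
DEFAULT_REGISTRY = "docker.io"


def registry_host(image_ref: str) -> str:
    """Return the registry host that an image reference points to."""
    first, slash, _rest = image_ref.partition("/")
    # Without a slash there is no host part at all, e.g. "debian:bullseye".
    if slash and ("." in first or ":" in first or first == "localhost"):
        return first
    return DEFAULT_REGISTRY


def uses_dockerhub(image_ref: str) -> bool:
    """True if pulling this reference would contact Dockerhub."""
    return registry_host(image_ref) == DEFAULT_REGISTRY
```

For example, `debian:bullseye` and `rustlang/rust:nightly` both resolve to Dockerhub and would hit the firewall rule on the Trusted Runners, while a reference that starts with `docker-registry.wikimedia.org/` would not.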

Define these policies for Cloud Runners. Are they similar to those for Shared Runners, or do we offer different policies here?

The Cloud Runners in Digital Ocean are configured to be open on purpose. They don't have an image restriction and accept a wider range of jobs (except the trusted build jobs for production). So this is also done.

Define how to deal with public package repositories on Shared and Trusted Runners

This is quite a complex topic. Public packages and code come from a wide variety of sources like apt, pip, npm, GitHub or Debian upstream repos. As far as I know, we don't have a technical restriction that blocks certain sources in Gerrit/Jenkins CI. All public sources can be used. There is just a common understanding not to use unreviewed packages and to prefer more trusted sources like apt.
We could try to allow-list each source individually for the Trusted Runners. But I think that creates quite some overhead and makes transitioning to GitLab harder for teams. So I'm leaning towards the same approach for the Trusted Runners: allow public packages (except for Dockerhub).

I'm happy to hear other opinions or feedback, either here in the task or as a separate task. If others don't see the need to restrict public packages on the Trusted Runners, I'll close the task soon.

I think we are mostly settled on which runners have which kind of access to wmf and external infrastructure. The permissions for these runners also seem to work as expected (default access to Cloud Runners, opt-in access to Trusted Runners).

So I compiled a matrix which shows the status quo: https://wikitech.wikimedia.org/wiki/GitLab/Gitlab_Runner#Permission_matrix. This should cover the main differences. I may add one or two more lines.

I'll close the task.
