gitlab: consider enabling docker container registry
Open, Stalled, Medium, Public, Feature

Description

GitLab can host a docker container registry per repository; see:
https://gitlab.wikimedia.org/help/user/packages/container_registry/index.md

There are capacity and storage questions to consider. Perhaps the answer is to enable the docker registry only on demand, on a per-repo basis, though I don't know whether gitlab allows that.

Some example use cases:

  • For Toolforge, for example, we already have one docker registry, which can be browsed at https://docker-registry.toolforge.org/. This registry is, however, managed by hand and poorly integrated with code repositories. Replacing it (or parts of it) with gitlab-generated registries would be really nice.
  • Also for Toolforge, we're currently experimenting with harbor (https://goharbor.io/) to allow per-tool docker registries. Our plan is to keep working with harbor for now, but if gitlab had this feature enabled, it would give us another potential solution to the kind of problems we're trying to solve.
  • I can also think of another use case: to generate gitlab CI-related docker container images from gitlab itself. Storing images that are meant to be used only by gitlab in a registry maintained by gitlab seems elegant and a legitimate use case.

These are just 3 simple examples. Perhaps others may have more ideas, but for now, I wanted to capture them in this ticket.

Event Timeline

aborrero triaged this task as Medium priority. Mar 28 2022, 2:59 PM
aborrero moved this task from Inbox to Watching on the cloud-services-team (Kanban) board.
aborrero changed the subtype of this task from "Task" to "Feature Request".

generate gitlab CI-related docker container images from gitlab itself

data engineering and platform engineering would love to be able to do this. See: T304450: Create conda .deb and docker image cc @gmodena

The idea of storing these gitlab CI-related container images in gitlab itself came from this PoC:

1-- https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-framework-api/-/blob/main/.gitlab-ci.yml

This repo has a gitlab-ci.yaml file that points to another repository:

include:
 - https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/raw/main/py3.9-buster-tox/gitlab-ci.yaml

2-- https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/blob/main/py3.9-bullseye-tox/gitlab-ci.yaml
The included gitlab-ci.yaml file contains the definition for a simple python tox test. But it uses a docker image, so where does that image live?

image: "docker-registry.tools.wmflabs.org/cloud-cicd-py39bullseye-tox:latest"

tox:
  stage: test
  script:
    - tox

3-- https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/blob/main/py3.9-bullseye-tox/Dockerfile

FROM docker-registry.tools.wmflabs.org/python:3.9-slim-bullseye
RUN pip install tox

The image can be built from this same repository and hosted on a dedicated docker registry. For this example, since we don't have a dedicated docker registry in gitlab, we're using the Toolforge docker registry.
Also, in this example the base image (python:3.9-slim-bullseye) has been cached on the Toolforge docker registry as well (given the rate limits on docker hub).
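
For comparison, if the gitlab container registry were enabled for this repository, the pipeline itself could build and publish the image. The following is only a rough sketch under assumptions that do not hold today: the registry is enabled (so gitlab populates the predefined CI_REGISTRY* variables), and the runners allow Docker-in-Docker. The job name and image name are made up for illustration.

```yaml
# Hypothetical job, not part of the PoC above.
# Assumes the container registry is enabled and the runner permits docker:dind.
build-tox-image:
  stage: build
  image: docker:latest
  services:
    - docker:dind
  variables:
    DOCKER_TLS_CERTDIR: "/certs"
  script:
    # CI_REGISTRY, CI_REGISTRY_USER, CI_REGISTRY_PASSWORD and CI_REGISTRY_IMAGE
    # are predefined by gitlab only when the registry is enabled.
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    # The build context is the directory that holds the Dockerfile shown above.
    - docker build -t "$CI_REGISTRY_IMAGE/py39bullseye-tox:latest" py3.9-bullseye-tox/
    - docker push "$CI_REGISTRY_IMAGE/py39bullseye-tox:latest"
```

The tox job could then point its image: line at the pushed image instead of the hand-managed Toolforge registry.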

This setup just works, see here for example: https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-framework-api/-/pipelines/2768

See T307537: Assess GitLab Container Registry as a default for container build processes for relevant discussion here. I think consensus in RelEng is that it would be useful to enable this, although it's not a strict blocker to an image publishing pathway.

A blocker to anything other than limited experimental use is T307142: bring new gitlab hardware servers into production, since we're currently pretty constrained on storage.

Change 790778 had a related patch set uploaded (by Brennen Bearnes; author: Brennen Bearnes):

[operations/puppet@production] GitLab: enable container registry

https://gerrit.wikimedia.org/r/790778

thcipriani changed the task status from Open to Stalled. Jun 21 2022, 6:19 PM
thcipriani subscribed.

After chatting with individual RelEngers, I wanted to add some clarification here.

From RelEng's perspective:

  1. We're not enabling the registry in the near term
  2. We'd like to use JSON Web Tokens for pushing to our production registry as part of our build process: T308501: Authenticate trusted runners for registry access against GitLab using temporary JSON Web Token (we're working with SRE to get that done)
  3. GitLab's registry has a lot of nice features, but if we decide to go that route we'll ensure we discuss with both SRE and WMCS before we do

In the interim, marking this as stalled until we are ready to have those discussions.

Aw, that is unfortunate! We were really hoping for this soon to help speed up CI times by prebuilding CI images. Several of our CI pipelines use conda environments and/or Spark, and installing all that from scratch for each pipeline run makes tests take 7 minutes or more.

To my understanding, you can prebuild CI images now too. See https://gerrit.wikimedia.org/r/plugins/gitiles/integration/config/+/master/dockerfiles/ for examples.

@hashar should be able to provide guidance if that is the best approach for what you want to do.

Hm, yeah @hashar had told me not to do this, because the intention was to eventually deprecate integration/config dockerfiles in favor of gitlab. Perhaps since gitlab won't support this soon, we should use integration/config now?

Change 790778 abandoned by Brennen Bearnes:

[operations/puppet@production] GitLab: enable container registry

Reason:

https://gerrit.wikimedia.org/r/790778

I'm exploring producing some Docker image build artifacts and saw that this isn't enabled in our install.

It would be really cool if I could use GitLab's container registry at some point, but I understand there are concerns that may need to be addressed first.

Quick update from the toolforge side: we are currently using harbor (https://goharbor.io) for tool images, and for toolforge's own images too.

We are investigating how to build images on gitlab and push them to our harbor instance for our system components (T336130: Automatically build Toolforge infrastructure container images in GitLab), but we are not blocked by needing an image repository.
Our harbor instance is not a public repository though, so you can't push whatever you want there. It would still be interesting to eventually reuse the gitlab build system if possible (we are pretty tied to buildpacks though).