
Experiment with Zuul to GitLab CI pipeline delegation
Closed, Resolved (Public)

Description

Adopting Zuul as a primary CI system for some GitLab-hosted repos would afford us valuable gating and merging functionality (see T349872: Investigate and document "Depends-On" in GitLab), but we would lose CI standardization and take on the burden of supporting (managing/documenting/implementing) an additional type of CI configuration.

If we promote the use of .zuul.yaml files in individual repos, we retain CI self service. However, it is unknown whether the Ansible-driven configuration of Zuul is sufficient for end users, and while GitLab CI has gone through community consultation, Zuul has not. If we manage the Zuul/Ansible configuration separately from end-user repos, we lose CI self service altogether. Neither of these options seems like a great outcome.

There may be an alternative, however, whereby we benefit from Zuul gating and retain CI self-service in the canonical form of .gitlab-ci.yml end-user configuration.

  1. Integrate Zuul with GitLab for gating/merging.
    1. Zuul will continue to be the authoritative system that approves/unapproves/merges/comments on GitLab MRs.
    2. Zuul will continue to provision speculative repo state for the dependent and dependency repos.
  2. Manage all (or most) Zuul configuration separately from individual repos in a Zuul trusted project, including Zuul tenants, pipelines, projects, and jobs.
  3. Define a standard Zuul job for all GitLab repos (e.g. gitlab-pipeline) that:
    1. Creates a GitLab pipeline for the change (MR) using the GitLab API, passing the Zuul build UUID as a pipeline variable.
    2. Exposes the speculative repo state, which includes all dependent repos, to the GitLab CI jobs via something like rsyncd/ssh/git. A shared volume may also be possible; however, such an implementation would have to work with both k8s-based and Docker-based GitLab runners.
    3. Waits for the completion of the GL pipeline and fails/succeeds based on the result.
  4. The repo-configured GitLab pipeline would in turn:
    1. Configure workflow rules such that only Zuul triggered pipelines are run (e.g. if: $CI_PIPELINE_SOURCE == 'api' && $ZUUL_BUILD)
    2. Sync the speculative repo state from the Zuul executor (/var/lib/zuul/builds) (or node /src) using rsync or git in either a before_script or hooks:pre_get_sources_script hook.
      • Note that if we use Git on the Zuul side and the pre_get_sources_script hook, we may be able to simply inject global Git insteadOf configuration which would save us redundant cloning operations and allow the GL jobs to function completely transparently with respect to the alternative Zuul Git source.
    3. Continue execution normally, and report ongoing and completion status to the associated MR.
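
The create-and-wait part of the standard job in step 3 could be sketched roughly as follows. This is a minimal illustration in Python, not the Ansible playbook from the experiment. It assumes GitLab's REST endpoints for creating a pipeline (POST /projects/:id/pipeline) and reading one (GET /projects/:id/pipelines/:pipeline_id). The function names, the api_url/project/ref/token parameters, and the polling interval are hypothetical, and the ref to build is left to the caller because the task does not state which ref the job should use.

```python
import json
import time
import urllib.parse
import urllib.request

# Pipeline statuses after which GitLab will not change the result on its own.
FINISHED = {"success", "failed", "canceled", "skipped"}


def http_json(method, url, token, payload=None):
    """Send one JSON request to the GitLab REST API and decode the reply."""
    data = json.dumps(payload).encode() if payload is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("PRIVATE-TOKEN", token)
    if data is not None:
        request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def run_delegated_pipeline(api_url, project, ref, zuul_build, token,
                           request=http_json, sleep=time.sleep,
                           poll_seconds=10, max_polls=360):
    """Create a pipeline carrying ZUUL_BUILD, then poll until it finishes.

    Returns True only when the final pipeline status is "success".
    """
    # A project path such as "group/repo" must be URL-encoded in the API path.
    project_id = urllib.parse.quote(str(project), safe="")
    base = f"{api_url}/projects/{project_id}"

    # Step 3.1: create the pipeline, passing the Zuul build UUID as a variable.
    created = request("POST", f"{base}/pipeline", token, {
        "ref": ref,
        "variables": [{"key": "ZUUL_BUILD", "value": zuul_build}],
    })

    # Step 3.3: wait for completion and map the result to pass/fail.
    for _ in range(max_polls):
        pipeline = request("GET", f"{base}/pipelines/{created['id']}", token)
        if pipeline["status"] in FINISHED:
            return pipeline["status"] == "success"
        sleep(poll_seconds)
    raise TimeoutError(f"pipeline {created['id']} did not finish in time")
```

A Zuul job wrapping this would exit non-zero when the function returns False or raises, so that Zuul reports the build as failed. The request and sleep parameters exist only so the logic can be exercised without a live GitLab instance.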

Let's prove or disprove this concept.

Details

Title | Reference | Author | Source Branch | Dest Branch
Implemented gitlab-pipeline GitLab delegation job in Zuul | repos/releng/gitlab-dev!1 | dduvall | review/zuul-to-gitlab-delegation | main

Event Timeline

dduvall changed the task status from Open to In Progress. Nov 1 2023, 4:56 PM
dduvall claimed this task.
dduvall triaged this task as Medium priority.
dduvall closed this task as Resolved. (Edited) Nov 2 2023, 9:05 PM

I'd say this experiment was a success.

See https://gitlab.wikimedia.org/repos/releng/gitlab-dev/-/merge_requests/1 for implementation details, but here is a sequence diagram of the Zuul/GitLab interactions in this setup.

gitlab-zuul-delegation.png (601×803 px, 34 KB)

Some discoveries made during the experiment:

  • Delegating control back to GitLab by triggering a pipeline run via the API is viable.
  • Executing the playbook via a trusted job on the Zuul executor itself introduces minimal start-up latency on Zuul's end as a nodepool isn't needed. The total start-up latency between updating the MR in GitLab and creation of the GitLab pipeline by Zuul was about 5 seconds on my local machine.
  • GitLab reports the pipeline status via the merge request UI just as it normally would when creating a pipeline from a merge_request_event. This was unexpected and a big win.
  • Rsyncing the speculative state src files from the zuul-executor is simple and efficient, and much faster than a git clone for each job. Excluding the .git directory by default could make this even faster. Optimizing further with shared volumes, in environments where Zuul and GitLab runners are colocated, also seems possible.
  • The normal git clone can be skipped in the GitLab pipeline jobs by specifying GIT_STRATEGY: none.
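
As a rough illustration of the .git exclusion mentioned above, the sketch below builds the rsync argument list with an optional --exclude pattern. It is written in Python only to make the arguments explicit. The function name and the exclude_git parameter are hypothetical, and the host and path layout are taken from the rsync URL used in this experiment. Jobs that run git commands against the synced tree would still need the .git directories, so exclusion is off by default here.

```python
def rsync_src_command(zuul_build, dest_dir, exclude_git=False,
                      host="zuul-executor.local.wmftest.net"):
    """Return the rsync argv that copies a build's speculative src tree."""
    argv = ["rsync", "-rlvt", "--delete"]
    if exclude_git:
        # A trailing slash limits the pattern to directories; with no leading
        # slash it matches a .git directory at any depth of the tree.
        argv.append("--exclude=.git/")
    source = f"rsync://{host}/builds/{zuul_build}/work/src/"
    # Trailing slashes on both sides copy the directory contents, not the dir.
    argv += [source, dest_dir.rstrip("/") + "/"]
    return argv
```

The returned list can be passed to subprocess.run(argv, check=True), or joined into the equivalent before_script line.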

In short, this setup seems totally viable. We benefit from all the important Zuul gating features, and users still get self-serve CI through the canonical .gitlab-ci.yml files. We'll probably want to package some of the common .gitlab-ci.yml configuration up in a shared project somewhere, but the amount of initialization needed on the GitLab project side is minimal.

workflow:
  # Run pipelines for:
  #  1. Zuul triggered pipelines
  rules:
    - if: $CI_PIPELINE_SOURCE == 'api' && $ZUUL_BUILD
      variables:
        GIT_STRATEGY: none
        ZUUL_SRC_DIR: /src/build/${ZUUL_BUILD}

default:
  image: this/image/needs/to/have/rsync/installed
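  before_script:
    # Fetch the speculative repo state for this Zuul build from the executor over rsync.
    - mkdir -p ${ZUUL_SRC_DIR}
    - rsync -rlvt --delete rsync://zuul-executor.local.wmftest.net/builds/${ZUUL_BUILD}/work/src/ ${ZUUL_SRC_DIR}/
    # before_script and script run in the same shell, so each job's script starts in the synced tree.
    - cd ${ZUUL_SRC_DIR}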

# everything from here on is standard config
stages:
  - test

test:
  stage: test
  script:
    - [...]