
Kokkuri build failure on airflow-dags repo
Closed, Resolved (Public)

Description

Example failed pipelines:
https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/jobs/642619
https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/jobs/642180

Note these pipelines were working ~24 hours ago.

Example failure logs copied from https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/jobs/642619:

$ kokkuri image build
2025-10-10 16:26:49,803 Valid syntax line found in docker/blubber.yaml
2025-10-10 16:26:49,803 Using build frontend docker-registry.wikimedia.org/repos/releng/blubber/buildkit:v1.4.1
2025-10-10 16:26:49,803 Configuring Docker client to use JWT auth for registry.cloud.releng.team
2025-10-10 16:26:49,804 Registry based caching available at /cache
error: failed to solve: failed to configure registry cache exporter: invalid reference format
2025-10-10 16:26:52,834 Command '['buildctl', '--timeout', '3600', '--wait', 'build', '--progress=plain', '--frontend=gateway.v0', '--opt', 'source=docker-registry.wikimedia.org/repos/releng/blubber/buildkit:v1.4.1', '--opt', 'filename=docker/blubber.yaml', '--opt', 'target=build-test-image', '--metadata-file', '/tmp/tmp1_p8lxuu', '--local', 'context=.', '--local', 'dockerfile=.', '--opt', 'platform=linux/amd64', '--import-cache', 'type=registry,ref=/cache/build-test-image:main', '--import-cache', 'type=registry,ref=/cache/build-test-image:bump-to-pickup-cast-fix', '--export-cache', 'type=registry,ref=/cache/build-test-image:bump-to-pickup-cast-fix']' returned non-zero exit status 1.

Note this blocks any airflow-dags fixes from going to production.

Details

Related Changes in GitLab:
Title | Reference | Author | Source Branch | Dest Branch
includes/kokkuri.yaml: bump KOKKURI_IMAGE to 2.10.4 | repos/releng/kokkuri!152 | dancy | main-I784b7f478e04a8601204de75faa27a4440de5f75 | main
build: peg kokkuri to 2.10.4 | repos/data-engineering/airflow-dags!1746 | xcollazo | fix-kokkuri-to-2_10_1 | main
includes/kokkuri.yaml: Add rules to kokkuri:setup-variables job | repos/releng/kokkuri!151 | dancy | main-I76324e3e28d32185ca7aea57898b18486b65ce17 | main
build: add a .pre stage to fix kokkuri. | repos/data-engineering/airflow-dags!1741 | xcollazo | fix-build-2 | main

Event Timeline

Please try adding .pre to the top of the list of stages in your .gitlab-ci.yml file.

There were changes made to kokkuri recently that make it so that the kokkuri:setup-variables job (which is defined to be in the .pre stage) must be executed.
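As a rough illustration of the suggestion above, a downstream .gitlab-ci.yml could list .pre first in its stages. This is only a sketch: the stage names test and build are hypothetical placeholders, not taken from the airflow-dags repository.

```yaml
# Sketch only: "test" and "build" are hypothetical stage names.
stages:
  - .pre    # stage in which the kokkuri:setup-variables job is defined to run
  - test
  - build
```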

@xcollazo I made a change to Kokkuri. Please restart your pipeline and lemme know how it goes.

A question: there was no bump of CI attributes from our side when this broke on a Friday. Can we add a feature so that kokkuri, from a CI standpoint, is versioned, and thus downstream users do not get new features unless we explicitly bump to a newer version?

We have done this to a set of CI utilities that we call workflow_utils, and it allows downstream users to upgrade at their own pace:

# Include conda_artifact_repo.yml to add release and conda env publishing jobs.
include:
  - project: 'repos/data-engineering/workflow_utils'
    ref: v0.20.0
    file: '/gitlab_ci_templates/pipelines/conda_artifact_repo.yml'

In the example above, the downstream project must bump ref to get any breaking changes. A similar mechanism on kokkuri would be great.

> A question: there was no bump of CI attributes from our side when this broke on a Friday. Can we add a feature so that kokkuri, from a CI standpoint, is versioned, and thus downstream users do not get new features unless we explicitly bump to a newer version?

Yes, you can do something like this:

include:
  - component: 'gitlab.wikimedia.org/repos/releng/kokkuri/images@2.9.0'

By the way, the current version of kokkuri is 2.10.1.

Great, will change our CI to adopt 2.10.1.

CC @amastilovic

> By the way, the current version of kokkuri is 2.10.1.

Looks like 2.10.1 doesn't include the kokkuri!151 fix. Can we do a new kokkuri release?

Oh TIL that we can use kokkuri as a GitLab component! Thumbs up.

> > By the way, the current version of kokkuri is 2.10.1.
>
> Looks like 2.10.1 doesn't include the kokkuri!151 fix. Can we do a new kokkuri release?

Ah yes. In progress...

Release 2.10.3 has been prepared.

> Release 2.10.3 has been prepared.

Maybe not. Still fumbling with it...

OK. Kokkuri 2.10.4 should be working now. Lemme know how it goes.

> OK. Kokkuri 2.10.4 should be working now. Lemme know how it goes.

Build succeeded. Thanks for taking care of this @dancy!
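For readers who want to pin to the fixed release, the component include shown earlier in the thread could be adapted along these lines. This is a sketch under assumptions: it presumes 2.10.4 is published as a component version in the same way as 2.9.0, and it is not necessarily how airflow-dags!1746 pegged the version.

```yaml
# Sketch only: assumes the 2.10.4 release is available as a component version.
include:
  - component: 'gitlab.wikimedia.org/repos/releng/kokkuri/images@2.10.4'
```

With a pinned version, downstream pipelines would pick up later kokkuri changes only when the version after the @ is bumped explicitly.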

xcollazo assigned this task to dancy.