Background:
As a maintainer it is desirable to be able to run continuous integration tests against code changes to verify they do not break any expectations (tests) or generally the build (compilation/assembly).
It is also desirable to track upstream releases in a timely manner, minimising exposure to security issues and taking advantage of performance improvements. To keep overhead low, this can be automated with tools such as Renovate, Dependabot or pipup, which rely on the tests to avoid breakage, either directly (auto-merging) or indirectly (human review).
When using the builds service (build pack based image), there is no pre-built base image to execute tests in. This was previously possible by using e.g. docker-registry.tools.wmflabs.org/toolforge-php82-sssd-base (a publicly accessible image) in the CI runner.
Problem:
The target image can be built using the build pack tooling and any relevant tests then executed against it. Generally this is nice, as it offers flexibility regarding versions and a guarantee that the "runtime" is almost identical (env vars etc. aside).
Unfortunately using the upstream builder image in a way that is compatible with builds-api is problematic:
- The version of the builder image in Toolforge does not track the upstream releases (T380127)
- Runtime versions available upstream are not available in Toolforge (T408108, T401875, T363854 etc)
- The builder image has additional configuration (packs) applied within the TaskRun (https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/blob/main/deployment/chart/scripts/inject_buildpacks.sh), which complicates being able to just use tools-harbor.wmcloud.org/toolforge/heroku-builder directly
Using builds-api to generate the images is also problematic:
- The limit on concurrent builds prevents any reasonable amount of scalability (https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/blob/main/deployment/chart/values.yaml?ref_type=heads#L43)
- Published assets cannot be removed and the Harbor quota is quite restrictive, which also prevents any reasonable amount of scalability (https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/blob/main/components/maintain-harbor/values/tools.yaml.gotmpl?ref_type=heads#L36)
- The builds-api has no good support for external usage (T332478, T363983)
- The builds are somewhat slow, given all the internal objects that need to be created and deleted
This leads to the question: what is the current best practice for testing build pack based images?
Today there is https://wikitech.wikimedia.org/wiki/Help:Toolforge/Building_container_images#Testing_locally_(optional); however, this does not cover the injected config/packs outlined above.
A real world example of where this became problematic:
- Python 3.14 was released
- The .python-version release pin was updated by tooling to the current stable release
- CI was happy, development was happy, deployments failed
- A human then had to spend time reviewing and downgrading 11 different repos
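As a stop-gap for this specific failure mode, a CI job could refuse a .python-version bump that the deployment target is not known to support. The following is only a minimal sketch: the SUPPORTED_PYTHON set is a hypothetical, hand-maintained placeholder (not the actual list of runtimes available in Toolforge), and the function names are made up for illustration. It does not replace testing against the real image.

```python
from pathlib import Path

# Assumption: hypothetical allowlist maintained by hand in the repo.
# These values are placeholders, NOT the real set of Toolforge runtimes.
SUPPORTED_PYTHON = {(3, 11), (3, 12), (3, 13)}


def parse_pin(text: str) -> tuple[int, int]:
    """Return (major, minor) from the first non-comment line of a pin file."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(".")
        if len(parts) < 2:
            raise ValueError(f"pin {line!r} has no minor version")
        return int(parts[0]), int(parts[1])
    raise ValueError("no version found in pin file")


def is_deployable(text: str, supported=SUPPORTED_PYTHON) -> bool:
    """True if the pinned major.minor is in the allowlist."""
    return parse_pin(text) in supported


def check_repo(repo: Path) -> int:
    """Return a process exit code: 0 if the pin is allowed, 1 otherwise."""
    pin = (repo / ".python-version").read_text(encoding="utf-8")
    if is_deployable(pin):
        return 0
    major, minor = parse_pin(pin)
    print(f"Python {major}.{minor} is not in the deployable allowlist")
    return 1
```

A CI step could call `sys.exit(check_repo(Path(".")))` so that an automated bump to an unsupported version fails review instead of failing at deployment. The obvious drawback is that the allowlist itself has to be kept in sync by hand, which is part of the overhead this task is asking how to avoid.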