User Details
- User Since
- Nov 6 2025, 12:01 PM (10 w, 2 d)
- Availability
- Available
- LDAP User
- Unknown
- MediaWiki User
- This.chris.corriere
Wed, Jan 14
there are only 3 repos left to be synced.
Tue, Jan 13
Thu, Jan 8
we're exploring the trade-offs between the gitlab/cdtools approach from my initial PoC and a python library. I tested the library this week and stopped at an authentication issue around PATs for git access.
My next steps are to document the trade-offs in the design doc Renil started, then debug the PAT auth issues to complete the alternate PoC if the team decides to A/B test them.
Mon, Dec 22
eks runners are active in dev. I'll create an additional ticket for some minor patching after the break
merged to dev
Dec 16 2025
have initial 1.0.0 tagged in the test ECR repo. Will continue testing tomorrow.
Dec 12 2025
all services have MRs into dev with passing pipeline tests from the dev eks runners. Admin is waiting on an MR into cdtools before it can be merged to dev.
all three apis have MRs into dev with passing pipeline tests from the dev eks runners.
Dec 11 2025
- dags is merged to main
- eventstream-listener and snapshots have MRs with passing pipelines for dev
- commons failed on a linting issue introduced through general/wmf
Dec 10 2025
per slack convo: The scope here is to ensure the dev and main branch builds have semantic versioning tags and that the CD environment can pull the tagged images automatically via argocd. Currently, we pull only latest tags. A stretch goal for this epic is to look into generating automated changelogs.
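To make the semver scope concrete, here is a minimal sketch of picking the highest semantic-version tag from a list of image tags instead of relying on latest. It assumes plain MAJOR.MINOR.PATCH tags with an optional leading "v" and ignores pre-release and build metadata. The function names are hypothetical, and this is not how argocd or ECR resolve tags.

```python
import re

# Assumption: tags look like "1.2.3" or "v1.2.3"; anything else (e.g. "latest") is skipped.
SEMVER_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_semver(tag):
    """Return (major, minor, patch) as ints, or None if the tag is not plain semver."""
    match = SEMVER_RE.match(tag)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def highest_semver_tag(tags):
    """Return the tag with the highest version, or None if no tag is semver."""
    versioned = [(parse_semver(tag), tag) for tag in tags]
    versioned = [(version, tag) for version, tag in versioned if version is not None]
    if not versioned:
        return None
    # Tuples compare numerically field by field, so 1.10.0 sorts above 1.9.3.
    return max(versioned)[1]
```

Comparing parsed integer tuples rather than raw strings is the point of the sketch: a string comparison would rank "1.9.3" above "1.10.0".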
tested single image build/push with content-integrity and multi image build/push with commons, per migration plan.
Permissions on ci/cd creds restricted which ECR repos could be pushed to, so created a new one for pipeline_test.
the latest_tag added for the multi-handler build should help with the semver ticket
Dec 9 2025
Are we pushing to snapshot while in non-prod?
Dec 8 2025
After syncing with Jose we're focusing on e2e-framework since it leverages KinD and should be compatible with the new runners.
Dec 6 2025
these three apis are prepped for migration as soon as cdtools has been merged.
all services are prepped for migration as soon as cdtools is merged.
tentative plan to discuss:
Trunk-based development with ephemeral envs for MRs and ci-* branches
Final State:
- main to pr (production only)
- MRs to ephemeral environments
- dv removed after migration (replaced by ephemeral)
- Naming: mr-123, ci-feature-x patterns for namespaces, URLs
- Infrastructure: Single service, namespace isolation, shared dev infrastructure with prefixes
- Lifecycle: Auto-create/destroy with grace periods
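The naming patterns in the plan above could be derived along these lines. This is a sketch only: the function names are hypothetical, and the sanitizing rules follow the general Kubernetes constraint that a namespace name is a DNS label (lowercase alphanumerics and hyphens, at most 63 characters, starting and ending with an alphanumeric), not anything the team has decided.

```python
import re

MAX_LABEL_LEN = 63  # DNS label length limit that applies to namespace names


def sanitize_label(raw):
    """Lowercase the input, collapse invalid characters to '-', and trim to a valid label."""
    label = re.sub(r"[^a-z0-9]+", "-", raw.lower())
    # Strip after truncating so the result never starts or ends with a hyphen.
    return label[:MAX_LABEL_LEN].strip("-")


def namespace_for_mr(mr_id):
    """Namespace for a merge request, e.g. mr-123."""
    return f"mr-{int(mr_id)}"


def namespace_for_branch(branch):
    """Namespace for a ci-* branch, e.g. ci-feature-x."""
    if not branch.startswith("ci-"):
        raise ValueError(f"not a ci-* branch: {branch!r}")
    return sanitize_label(branch)
```

The same value could be reused as the URL prefix so that an environment's namespace and hostname stay in sync.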
admin service is a safe proof of concept candidate for semver. It's relatively small so low risk for an initial example. I can test this in a project like pipeline-test or use pipeline-test for the PoC if it's more convenient.
Dec 4 2025
I was able to run a dags-check on both dv and pr eks runners in the pipeline-test project
dags, commons, on-demand, and content-integrity have been tested on the new runners. I can start submitting MRs to start the review processes.
Dec 3 2025
per slack conversation this will likely need to be rescoped to include internal dependencies like schema. If an enterprise image needs to be rebuilt based on a semver tag it can't be "officially validated" unless the dependencies are pinned to a version as well and included in an SBOM.
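As a rough illustration of the pinning concern, the check below flags dependency entries that are not pinned to an exact version. It assumes a simple "name==version" line format purely for illustration. The real dependency manifests for these services may look different, and the function name is hypothetical.

```python
def unpinned_requirements(lines):
    """Return the entries that are not pinned with an exact '==' version."""
    unpinned = []
    for line in lines:
        # Drop trailing comments and surrounding whitespace.
        requirement = line.split("#", 1)[0].strip()
        if not requirement:
            continue  # blank line or comment-only line
        if "==" not in requirement:
            unpinned.append(requirement)
    return unpinned
```

A pipeline job could fail the build when this list is non-empty, so that an image rebuilt from the same semver tag resolves the same dependency versions.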
Dec 2 2025
moving more config to cdtools in an effort to make switching from ECS to EKS runners as clean as possible.
initial MR for eks runner app templates has been submitted
working to get an EKS gitlab runner deployed to production.
Remaining steps to deploy the runner to prod:
I was asked to hold off on this ticket until further details on an ephemeral testing strategy have been provided.
Dec 1 2025
test-dags-import pipeline succeeded on MR in pipeline-test.
Notes from chat:
- Finalize gitlab runner configuration to handle the concurrency required by the existing pipeline builds happening on ECS fargate.
A few other checks, such as failed task cleanup and garbage collection of the pods, should work as expected. Memory spec should be pipeline specific instead of changing the runner configuration.
- Deploy the runner for production
- Start migrating pipelines for existing services first, APIs later
dags
content-integrity
commons
Nov 26 2025
realtime and on-demand docker compose tests are completing on the new runners. Will double check the tests before I close the ticket
Nov 24 2025
I've started debugging the on-demand integration tests locally.
pipeline-test is successfully building images on the new runner for main, realtime, and on-demand
Nov 18 2025
The ci pipeline in the test app repository can now schedule jobs on the newly provisioned runner. It's still a basic docker job, so I'll be moving into docker compose and building/pushing images first thing tomorrow.
Nov 17 2025
merged MR to fix the polling request warning and moved the external secret to the provided chart level resource. Moving to CI test to confirm the alpine runner pod kicks off an ubuntu job pod.
I was able to test the DinD configuration locally with docker. The MR was approved by Renil this morning after I pushed my changes. We'll need to see if the gitlab runner registers successfully before we can fully test DinD. The k8s/istio upgrade is a higher priority and these changes probably don't need to be merged until the upgrade is complete.
Nov 12 2025
Spoke with Renil about splitting this story between runner registration and docker-in-docker (DinD), given DinD has more configuration options and additional security concerns
