
Automatically build Toolforge infrastructure container images in GitLab
Closed, Resolved (Public)

Description

From #wikimedia-gitlab a few days ago:

15:31:59 <dcaro> Have we enabled/found out how to build images on gitlab-ci? I ask because I want to test https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/2 the same as the ops/alerts gerrit one, but that one uses blubber to build the image on the fly 
15:32:41 I'm interested also if it's possible to just build and push images (to our own harbor repository for toolforge)
15:36:11 <jelto> dcaro: you can use kokkuri for building images with blubber files https://gitlab.wikimedia.org/repos/releng/kokkuri/-/blob/main/README.md#examples. RelEng built some abstraction for that use case
15:37:04 <dcaro> that's pretty cool
15:39:12 that solves one of the issues :)
15:39:12 For the other, can I build not blubber-based images too? (and push them to other repos like tools-harbor.wmcloud.org?)
15:40:32 <taavi> I don't see any issues with converting our Dockerfiles to blubber :P but yes, the other registry question is more important
15:41:17 <dcaro> that'd be interesting too yes
15:41:21 (the conversion I mean)

17:45:32 <dcaro> there's still the question on how to push to other repos (and maybe use docker instead of blubber, though that's not a big issue)
17:51:10 <dancy> Try something like this:
17:51:14 https://www.irccloud.com/pastebin/PllJI3mH/
17:54:38 You can also add PUBLISH_IMAGE_NAME to set the image path within the registry (defaults to the GitLab repo path)
17:55:33 <dcaro> How do I pass the credentials?
17:55:37 (awesome btw.)
17:56:29 <dancy> hmm.. good question..  we automatically set up JWT auth but you'll presumably need something else for the other registry..  Lemme dig.
17:57:57 Looks like we'll need to do some coding to support alternate auth mechanisms.  
18:12:10 Alternative: Use the kokkuri image, but run the following in the script section:
18:12:13 https://www.irccloud.com/pastebin/3nGzEEtI/
18:12:22 (untested)
18:12:40 Before that, add stuff to populate ~/.docker/config.json with auth info
18:15:08 <taavi> can/should we do that on the shared runners or should we set up our own runners for that?
18:15:17 <dancy> That will work on shared runners.
18:16:19 btw, when running buildctl manually, you can supply any frontend (e.g, the dockerfile frontend), so you're not locked to using blubber files.
18:17:52 To do that, use `--frontend=dockerfile.v0` (and exclude --opt source=...)
18:21:56 and point `--opt filename=` to the Dockerfile
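
The alternative dancy describes (populate ~/.docker/config.json, then call buildctl with the dockerfile frontend) could look roughly like the sketch below. It is untested against the shared runners and only prepares the config body and the argument list; it does not run anything. The registry user, password, and the image path toolforge/alerts are hypothetical, and the `--local` and `--output` arguments are the usual buildctl ones rather than anything taken from the pastebins above.

```python
import base64
import json


def docker_auth_config(registry: str, username: str, password: str) -> str:
    """Return a JSON body for ~/.docker/config.json with basic auth for one registry."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return json.dumps({"auths": {registry: {"auth": token}}}, indent=2)


def buildctl_command(image: str, context: str = ".", dockerfile: str = "Dockerfile") -> list[str]:
    """Build the argument list for a Dockerfile-based build that pushes `image`."""
    return [
        "buildctl", "build",
        # Dockerfile frontend instead of blubber, so no --opt source=... is passed.
        "--frontend=dockerfile.v0",
        "--local", f"context={context}",
        "--local", f"dockerfile={context}",
        "--opt", f"filename={dockerfile}",
        "--output", f"type=image,name={image},push=true",
    ]


if __name__ == "__main__":
    # Hypothetical credentials and image name, for illustration only.
    print(docker_auth_config("tools-harbor.wmcloud.org", "robot-user", "robot-secret"))
    print(" ".join(buildctl_command("tools-harbor.wmcloud.org/toolforge/alerts:abc123")))
```

In a CI job the first output would be written to ~/.docker/config.json before the build step, with the credentials coming from protected CI variables rather than literals.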

Event Timeline

@taavi thanks a lot for pasting the chat!

Change 923547 had a related patch set uploaded (by David Caro; author: David Caro):

[operations/puppet@production] gitlab.runners: allow tools/toolsbeta harbor instances

https://gerrit.wikimedia.org/r/923547

Change 923547 merged by David Caro:

[operations/puppet@production] gitlab.runners: allow cloudvps public proxied serivces

https://gerrit.wikimedia.org/r/923547

Mentioned in SAL (#wikimedia-cloud-feed) [2023-06-01T08:57:36Z] <wm-bot2> deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-builds-api (6c6a27b) (T336130) - cookbook ran by dcaro@vulcanus

Mentioned in SAL (#wikimedia-cloud-feed) [2023-06-01T09:02:41Z] <wm-bot2> deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-builds-api (f1d94f7) (T336130) - cookbook ran by dcaro@vulcanus

Mentioned in SAL (#wikimedia-cloud-feed) [2023-06-01T09:11:32Z] <wm-bot2> deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-builds-api (0f4076a) (T336130) - cookbook ran by dcaro@vulcanus

Mentioned in SAL (#wikimedia-cloud-feed) [2023-06-01T09:21:36Z] <wm-bot2> deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-builds-api (0f4076a) (T336130) - cookbook ran by dcaro@vulcanus

I'm having some issues trying to clarify and automate the process laid out on https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/EnhancementProposals/Toolforge_Kubernetes_component_workflow_improvements

The completely manual process is:

  • Merge a merge request -> build and push image with commit hash as tag
  • Create a new commit in the same repo changing the chart/values files -> get that merged and push a chart
    • One for each environment? (tools/toolsbeta)
    • What version number? (helm enforces semantic versions for charts)
    • Should we use the AppVersion too? (set to the commit hash of the image above? Or the other way around? see the last comment below)
  • Create a commit on the deployment repository bumping the chart version to that new one and changing any values that need adapting
    • Should we keep one helmfile per environment? Or do a template like this to handle different chart versions per environment?

Some of these steps could be automated. In particular, I think the image and chart generation could be bundled together, as they depend on each other (not always, but for example the new image might need new config values set by the chart). But that can only be done if the future image tag is known beforehand, and the commit hash is only known after the merge.

Maybe we can set the version only on the chart (ex. as the AppVersion), and when generating the image use that as the tag?

After a chat in the Toolforge workgroup meeting, it was decided that the flow will be:

  • Create a merge request and bump the chart version with it -> build and push the image and the chart with a dev version
    • No AppVersion, just the version of the chart
  • Merge that merge request -> build and push the image and the chart with the given version to toolsbeta
  • Create a merge request on the deploy repo updating the chart version for the environment that you want
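
A minimal sketch of how a CI job might pick the version for the image and chart under the flow above. The "-dev.mrN" suffix is an assumption made for illustration (the meeting only settled on "a dev version"), and the function name and the simplified version check are hypothetical.

```python
import re
from typing import Optional

# Simplified check: MAJOR.MINOR.PATCH only, no pre-release or build metadata.
SEMVER = re.compile(r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$")


def publish_version(chart_version: str, merge_request_id: Optional[int] = None) -> str:
    """Return the version to use for both the image tag and the chart.

    Open merge requests get a dev pre-release version (assumed naming scheme);
    merged changes get the chart version as written in the repository.
    """
    if not SEMVER.match(chart_version):
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {chart_version!r}")
    if merge_request_id is not None:
        return f"{chart_version}-dev.mr{merge_request_id}"
    return chart_version


if __name__ == "__main__":
    print(publish_version("0.1.2", merge_request_id=7))  # dev build for an open merge request
    print(publish_version("0.1.2"))                      # build after the merge
```

Because the version lives in the chart and is reused as the image tag, it is known before the merge, which sidesteps the commit-hash ordering problem raised earlier.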

Notes from the meeting: https://docs.google.com/document/d/1MEuCtSu2AwPu_CLMV4el75gv_REEsspn4XJeeHYrNAU/edit#heading=h.7rfw586u5u38

Merge that merge request -> build and push the image and the chart with the given version to toolsbeta

If it's simpler we could use a vX.X.X git tag as the trigger.
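
As a rough illustration of the tag-trigger idea, a job could check whether the pushed git tag has the vX.X.X shape and derive the chart version from it. The function name is hypothetical and the pattern is one possible reading of "vX.X.X".

```python
import re
from typing import Optional

# Matches tags such as v1.2.3 and captures the version without the leading "v".
RELEASE_TAG = re.compile(r"^v(\d+\.\d+\.\d+)$")


def version_from_tag(tag: str) -> Optional[str]:
    """Return the version encoded in a release tag, or None if the tag should not trigger a build."""
    match = RELEASE_TAG.match(tag)
    return match.group(1) if match else None


if __name__ == "__main__":
    for tag in ("v1.2.3", "1.2.3", "v1.2"):
        print(tag, version_from_tag(tag))
```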

I think we can close this :)

We currently have the toolforge-cd flows (https://gitlab.wikimedia.org/-/ide/project/repos/cloud/cicd/gitlab-ci/tree/main/-/toolforge-cd/) that allow doing just that.

dcaro claimed this task.