Self-build and publish buildkit helper images
Open, Medium, Public

Description

buildkit uses internal helper images to execute image builds. According to the discussion in T320730, only the docker/dockerfile-copy image is in use.

This image currently comes from Docker Inc via Docker Hub, so there is a risk of not getting updates for it and of being dependent on Docker Inc's update policy. The last build/update of the image was 4 years ago, so it is quite likely not getting much attention now, nor will it in the future.

To make us less dependent on public images and to be able to update this image we should try to build dockerfile-copy ourselves and move it to our registry.

The first step is to find the actual repo/build instructions/Dockerfile for that image.

https://hub.docker.com/r/docker/dockerfile-copy

Event Timeline

Using wagoodman/dive I extracted the following info:

  • It starts from a base alpine (sigh...) image
  • RUN /bin/sh -c apk add --no-cache tar gzip bzip2 xz
  • Copies the "copy" binary in the container:

COPY /copy /bin/ # buildkit

(command line: docker run -ti --rm -v /var/run/docker.sock:/var/run/docker.sock wagoodman/dive docker/dockerfile-copy)

Now we should take a look at that "copy" file and see what it is:

$ docker run --name=copyfrom --rm -ti --entrypoint /bin/sh docker/dockerfile-copy
$ docker cp copyfrom:/bin/copy .
$ file copy 
copy: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, no section header

So it's a statically linked x86-64 binary that we can just import as-is if we want to.
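The same facts that `file` reports can be checked programmatically before importing the binary. The following is a minimal Python sketch, not part of the original analysis: the function name `inspect_elf64` is hypothetical, it only handles 64-bit little-endian ELF files, and it treats the absence of a `PT_INTERP` program header as a heuristic for static linking.

```python
import struct

EM_X86_64 = 62   # e_machine value for x86-64 in the ELF specification
PT_INTERP = 3    # program header type that names a dynamic loader

def inspect_elf64(data: bytes) -> dict:
    """Report architecture and linkage hints for a 64-bit little-endian ELF image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    # Fields that follow the 16-byte e_ident array in the ELF64 header.
    (_type, machine, _version, _entry, phoff, _shoff, _flags,
     _ehsize, phentsize, phnum, _shentsize, shnum, _shstrndx) = struct.unpack_from(
        "<HHIQQQIHHHHHH", data, 16)
    # A PT_INTERP segment means a dynamic linker has to be loaded first.
    has_interp = any(
        struct.unpack_from("<I", data, phoff + i * phentsize)[0] == PT_INTERP
        for i in range(phnum)
    )
    return {"x86_64": machine == EM_X86_64,
            "static": not has_interp,
            "section_headers": shnum}
```

It could be run against the extracted file with `inspect_elf64(open("copy", "rb").read())`.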

Jelto triaged this task as Medium priority. Nov 1 2022, 11:35 AM
Jelto added a subscriber: dduvall.

Thanks @Joe for the analysis! That pushed me in the right direction.

I also found the code/project for the copy binary here: https://github.com/tonistiigi/copy . There is also a Dockerfile in there.

I'll try to migrate that to a GitLab project, ideally with a CI job that builds the copy binary; otherwise we can import the binary as-is. First I have to verify that this is the correct code repository and what the copy tool is doing.

@dduvall adding you here for awareness :)

The copy image is used to provide semantics for ADD/COPY Dockerfile instructions (those instructions do more than simple cp, for example, remote fetching, unarchiving, chown operations, etc.). The reason it's in use for Blubber is that Blubber is not yet buildkit-native—it still transcodes to Dockerfile syntax before passing that through dockerfile2llb. Once we refactor it to use native LLB constructs following blubberoid and CLI deprecation, it will no longer have a dependency on dockerfile/copy nor other Dockerfile frontend dependencies.
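To illustrate the difference from a plain `cp`, here is a minimal Python sketch of one of the behaviours mentioned above: unpacking a local tar archive into the destination, as Dockerfile `ADD` does. It is not the code of the copy binary; the function name `add_like` is hypothetical, and remote fetching and chown handling are left out.

```python
import shutil
import tarfile
from pathlib import Path

def add_like(src: Path, dest_dir: Path) -> list:
    """Copy the file src into dest_dir, unpacking it if it is a tar archive."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    if src.is_file() and tarfile.is_tarfile(str(src)):
        with tarfile.open(str(src)) as archive:
            # Use the safer "data" extraction filter where this Python has it.
            extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
            archive.extractall(str(dest_dir), **extra)
    else:
        # Anything else is copied as-is, like a plain cp.
        shutil.copy2(str(src), str(dest_dir / src.name))
    # Return the resulting directory listing for easy inspection.
    return sorted(p.name for p in dest_dir.iterdir())
```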

Thanks! That sounds like a good plan. We can maintain the fork under repos/releng much like we're doing for buildkit. If you want, I can write the CI file and scripts to get it building and published to our registry.

Thanks a lot for the feedback and help with CI!

I created repos/releng/dockerfile-copy and opened an MR to move all of the code from https://github.com/tonistiigi/copy there. I'm not sure if that's a reasonable first step. I have some concerns about forking such an old and low-traffic project (2 stars, last commit 4 years ago). But the actual binary in the dockerfile/copy image points to this GitHub repo:

./copy 
panic: invalid args []

goroutine 1 [running]:
main.main()
        /go/src/github.com/tonistiigi/copy/cmd/copy/main.go:57 +0x2af
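The source path in the panic trace is what ties the binary to a repository. As a small sketch, the import path can be pulled out of such a trace as follows. The helper name `import_path_from_trace` is hypothetical, and it assumes the binary was built under a GOPATH-style `/go/src/<host>/<owner>/<repo>` layout.

```python
import re
from typing import Optional

# Matches /go/src/<host>/<owner>/<repo>/ and captures the three-part import path.
_GOPATH_SRC = re.compile(r"/go/src/([^/\s]+/[^/\s]+/[^/\s]+)/")

def import_path_from_trace(trace: str) -> Optional[str]:
    """Return the first GOPATH-style import path found in a Go panic trace."""
    match = _GOPATH_SRC.search(trace)
    return match.group(1) if match else None

trace = """panic: invalid args []

goroutine 1 [running]:
main.main()
        /go/src/github.com/tonistiigi/copy/cmd/copy/main.go:57 +0x2af
"""
print(import_path_from_trace(trace))  # github.com/tonistiigi/copy
```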

We can also start by vendoring the copy Go binary and go from there. What do you think?

The pipeline in the existing GitHub project uses buildkit too. So we may be able to adapt the travis-ci buildkit commands to GitLab.

In the last IC sync meeting we discussed that it makes more sense to add the self-hosted dockerfile/copy image to production-images first instead of hosting that separately in GitLab.

This may be blocked by T322453: Buildkit erroring with "cannot reuse body, request must be retried" upon multi-platform push which is preventing the publishing of multi-platform images. The dockerfile-copy image needs to be multi-platform to achieve parity with the upstream version.

See platforms listed here.
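Parity with upstream can be checked by comparing the platforms in each image's manifest list. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the JSON is a made-up, trimmed manifest list in the Docker manifest list / OCI image index shape, not the real upstream data.

```python
import json

def platforms(manifest_list_json: str) -> set:
    """Return the set of os/arch[/variant] strings in a manifest list."""
    found = set()
    for entry in json.loads(manifest_list_json).get("manifests", []):
        plat = entry.get("platform", {})
        parts = [plat.get("os", "?"), plat.get("architecture", "?")]
        if plat.get("variant"):
            parts.append(plat["variant"])
        found.add("/".join(parts))
    return found

def missing_platforms(upstream_json: str, ours_json: str) -> set:
    """Platforms published upstream that our image does not provide."""
    return platforms(upstream_json) - platforms(ours_json)

# Made-up examples, trimmed to the fields this sketch reads.
upstream = json.dumps({"manifests": [
    {"platform": {"os": "linux", "architecture": "amd64"}},
    {"platform": {"os": "linux", "architecture": "arm", "variant": "v7"}},
]})
ours = json.dumps({"manifests": [
    {"platform": {"os": "linux", "architecture": "amd64"}},
]})
print(missing_platforms(upstream, ours))  # {'linux/arm/v7'}
```

The input JSON for a real image could come from a tool such as `docker manifest inspect`; that step is not shown here.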

I'm second-guessing the plan to add the image to production-images first, after discovering that the image is multi-platform and, AFAIK, docker-pkg cannot produce multi-platform images. Blubber can, but then see my previous comment about the error when pushing a multi-platform image to our registry. We might need to check in about some of this next week. :)