Allow easily relating a published image tag to a wikibase-docker commit
Open, Needs Triage · Public

Description

The wikibase-docker docker-compose.yml uses images published on hub.docker.com, which sometimes lag behind the master branch. This makes problems quite hard to debug for newcomers: you could assume that running docker-compose -f docker-compose.yml up from a checkout of master includes all the behaviors and patches already merged into master. As this isn't the case, it would save everyone's time to make this lag explicit (so that container users don't report bugs that were already patched, for instance). Ideally, the solution should make it possible to link a given image (typically tagged latest) to a given commit hash.

cc @Addshore

Event Timeline

Maxlath created this task. Sat, Aug 24, 6:35 AM
Restricted Application added a project: Wikidata. Sat, Aug 24, 6:35 AM
Restricted Application added a subscriber: Aklapper.

I guess this mainly applies to the wdqs frontend?

So, "RUN entries cannot modify environment variables".
Otherwise we could just do something like:

FOO=$(git rev-parse HEAD)
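One possible way around that, as a sketch rather than something wikibase-docker does today: compute the hash outside the build and attach it to the image as a label. The label key below is the OCI convention for a source revision; the helper names revision_label and build_with_revision are made up for illustration, and this only records the wikibase-docker commit, not the commits of the code fetched inside the Dockerfile.

```shell
#!/bin/sh
# Sketch only: assumes the build runs from a git checkout with Docker installed.
# revision_label and build_with_revision are hypothetical helper names.

# Format a commit hash as an OCI "revision" label (key=value).
revision_label() {
    printf 'org.opencontainers.image.revision=%s' "$1"
}

# Build an image and stamp it with the current commit of the checkout.
# $1: image name and tag, $2: build context directory
build_with_revision() {
    docker build \
        --label "$(revision_label "$(git rev-parse HEAD)")" \
        -t "$1" "$2"
}

# Reading the label back from a built or pulled image:
#   docker inspect -f '{{ index .Config.Labels "org.opencontainers.image.revision" }}' IMAGE
```

A label keeps the tag list unchanged, so it would not require users to look for a different tag after each commit.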

Also most of the Dockerfiles currently grab a zip from github rather than getting anything with the git history in it.

We could switch to git cloning for grabbing the code to allow us to check the hash, but that would increase build time and resources.

Maybe we shouldn't include anything in the images and just provide an example one-liner showing how you can (with pretty much 99.999% accuracy) figure out which commit you are using.

For example, for wdqs-frontend:

git rev-list -n 1 --first-parent --before="$(docker inspect -f '{{ .Created }}' wikibase/wdqs-frontend:latest)" master
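The same lookup can be wrapped in a small helper, sketched here under a few assumptions: commit_at is a hypothetical name, the branch defaults to master, and the fractional seconds in Docker's Created timestamp are stripped defensively in case the installed git does not parse them.

```shell
#!/bin/sh
# Sketch only: commit_at is a hypothetical helper, not part of wikibase-docker.

# Print the newest first-parent commit on a branch that is not newer than a timestamp.
# $1: timestamp such as 2019-08-20T10:11:12.123456789Z
# $2: branch name (defaults to master)
commit_at() {
    # Drop fractional seconds (assumption: some git versions may not parse them).
    ts=$(printf '%s' "$1" | sed 's/\.[0-9]*Z$/Z/')
    git rev-list -n 1 --first-parent --before="$ts" "${2:-master}"
}

# Usage, from a wikibase-docker checkout with Docker available:
#   created=$(docker inspect -f '{{ .Created }}' wikibase/wdqs-frontend:latest)
#   commit_at "$created" master
```

This is still a guess based on dates: it finds the last commit on the branch before the image was created, which is not necessarily the commit the image was built from.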

For images like wikibase this is a bit more complicated, as several repos are included.

Thoughts, @Maxlath? Is this mainly an issue for images that are currently built from master, like wdqs-frontend?
Would that command do instead of including something in the image?

@Addshore it could apply to any image, not just wdqs-frontend: when pulling wikibase/wikibase:1.33-bundle, the name tells me it was built from the wikibase/1.33 folder, but that folder has been modified several times:

* 531fbda - (10 weeks ago) More cleanups for performance, use better names and passwords - Amir Sarabadani
* a294b27 - (10 weeks ago) Use a commit that works for 1.33 - Amir Sarabadani
* 7e8697d - (2 months ago) Explicitly download 1.34-wmf.8 of EntitySchema for 1.33 bundle - Amir Sarabadani
* 562e69b - (2 months ago) Add 1.33 + EntitySchema extension - Amir Sarabadani

How can I know at which commit it was built?

A solution could be to tag builds with the current commit hash:

TAG=$(git rev-parse HEAD | cut -c 1-7)
docker build -t "wikibase:$TAG" -t wikibase:latest .

As for archives fetched within Dockerfiles, using git clone --depth=1 https://repo.url instead of downloading a zip file seems perfectly fine to me. It downloads only the objects needed for the last commit, which makes it very fast (actually faster than downloading and unzipping the archive in my test, but that was on a small repo with a good connection; it may be different for bigger repos), and it makes that last commit hash easy to identify with git log.
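To make the idea concrete, here is a minimal sketch of the shallow clone plus hash lookup. shallow_clone_hash is a hypothetical helper name, and the URL and destination are placeholders, not paths used by the current Dockerfiles.

```shell
#!/bin/sh
# Sketch only: shallow_clone_hash is a hypothetical helper name.

# Clone only the latest commit of a repository and print its full hash.
# $1: repository URL, $2: destination directory
shallow_clone_hash() {
    git clone --quiet --depth=1 "$1" "$2" && git -C "$2" rev-parse HEAD
}

# Usage (placeholder URL and path):
#   shallow_clone_hash https://repo.url /tmp/src
```

Inside a Dockerfile, the printed hash could be written to a file in the image so it can be read later; whether that is worth the extra build step is a separate question.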

Ahh, so you are also talking about wanting the hash of the Dockerfile / files that the image was built with.
AFAIK that isn't really a standard way to go about tagging docker images.
It would mean that, for example, when we fix a typo in a README or change the formatting of a Dockerfile, users would have to look for a different tag. The list of tags would also be harder to navigate.

The way we have been doing this is that each version of each image will not have breaking changes made within it, so if you pull 1.33, for example, you should always be able to pull 1.33 again without things breaking; you'll just get minor changes and security fixes.

When things are added or changed in a way that users of the images may need to know about, we currently include that information pretty informally in the README.
See https://github.com/wmde/wikibase-docker/tree/master/wikibase where we introduced the last env var.

MW_WG_SECRET_KEY   "secretkey"   Used as source of entropy for persistent login/OAuth etc. (since 1.30)

If you do want to find the version of a dockerfile from the commit in wikibase-docker you can do the following:

This is the digest that you can use to pull that specific version / build of the image.