Pipeline image build cleanup
Open, Normal, Public

Description

As we start doing builds, we'll need a way to clean up Docker image layers that are no longer needed.

Event Timeline

Restricted Application added a subscriber: Aklapper. Oct 10 2017, 6:41 PM

It looks like Docker now has various prune commands that should be useful for this case:

$ docker image prune -a

WARNING! This will remove all images without at least one container associated to them.
$ docker system prune -a

WARNING! This will remove:
	- all stopped containers
	- all volumes not used by at least one container
	- all networks not used by at least one container
	- all images without at least one container associated to them

By default, these commands only delete "dangling" images, which excludes tagged images such as the ones resulting from the build phase of the pipeline. I think the -a option should include tagged images as well, but we'd want to make sure it's not too aggressive and doesn't blow away cached base images on a regular basis (or maybe we don't care, depending on how often this runs?). We can also use --filter 'until=' with a timestamp or duration (e.g. 8h) to limit prunes to older images, though I'm not sure how this interacts with -a.
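
On the -a question: as far as I can tell, the Docker documentation for docker image prune shows the two being combined (its example is docker image prune -a --force --filter "until=240h"), so -a seems to widen the candidates to all unused images while until narrows them by creation time. A minimal Ruby sketch that only builds such a command line; prune_command is a hypothetical helper for illustration, not part of any tool mentioned here:

```ruby
# Build the argv for `docker image prune` without running it.
# `prune_command` is a hypothetical helper, shown for illustration only.
def prune_command(all: false, older_than: nil, force: true)
  cmd = %w[docker image prune]
  cmd << "--all" if all       # also consider tagged images with no container
  cmd << "--force" if force   # skip the interactive confirmation prompt
  cmd += ["--filter", "until=#{older_than}"] if older_than
  cmd
end

puts prune_command(all: true, older_than: "8h").join(" ")
# prints: docker image prune --all --force --filter until=8h
```

Returning an argv array rather than a string keeps it easy to hand to system(*cmd) later, and easy to inspect before anything is actually deleted.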

One thing to be mindful of is that hosts running Blubber-built test images may not be used exclusively for those images. Depending on how we set up the test portion of the pipeline, we may be sharing a host with other Docker CI jobs, and we wouldn't want to blow away their images. Excluding images tagged wmfreleng/* would probably be enough to avoid clobbering those.
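
A rough Ruby sketch of that exclusion, assuming the references come from something like docker image ls --format "{{.Repository}}:{{.Tag}}" and that the names start with wmfreleng/ (no registry host in front). The helper name and the sample image names are made up for illustration:

```ruby
# Given "repository:tag" strings, return only the ones that may be
# considered for cleanup: everything not under the protected prefix.
# `cleanup_candidates` is a hypothetical helper.
def cleanup_candidates(refs, protected_prefix: "wmfreleng/")
  refs.reject { |ref| ref.start_with?(protected_prefix) }
end

refs = [
  "wmfreleng/tox:v1",       # hypothetical shared CI image, must survive
  "pipeline-test/app:abc",  # hypothetical pipeline test image
]
puts cleanup_candidates(refs)
# prints: pipeline-test/app:abc
```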

Definitely. Anything tagged won't be deleted by docker image prune or docker system prune unless we give it the -a option.

After our chat yesterday in IRC, it seemed reasonable to do something like:

  1. Iterate over images that were built and tagged by the pipeline script or maybe just Blubber in general—we could have Blubber add some useful labels by default
  2. Remove all tags from each image if it's older than our cleanup threshold
  3. Run docker image prune --filter 'until=[threshold]' and let it delete the newly untagged images as well as other dangling images
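
The three steps above could be sketched roughly as follows. This only plans the commands and does not run them. The image records are plain hashes whose field names are assumptions, and the built-by=blubber label is hypothetical (Blubber would have to add it). One caveat, as far as I know: docker rmi on a repo:tag reference only untags while other references remain, but removing the last reference typically deletes the image right away, so step 3 may mostly end up removing other dangling images.

```ruby
# Plan the cleanup as a list of docker argv arrays.
# Each image is a hash: { id:, refs:, created:, labels: } (assumed shape).
def cleanup_plan(images, threshold_hours:, now: Time.now,
                 label_key: "built-by", label_value: "blubber")
  cutoff = now - threshold_hours * 3600

  # Step 1: only images carrying the (hypothetical) pipeline label.
  ours = images.select { |img| img[:labels][label_key] == label_value }

  # Step 2: remove every tag of images created before the cutoff.
  old = ours.select { |img| img[:created] < cutoff }
  untag = old.flat_map { |img| img[:refs].map { |ref| ["docker", "rmi", ref] } }

  # Step 3: prune dangling images older than the same threshold.
  prune = ["docker", "image", "prune", "--force",
           "--filter", "until=#{threshold_hours}h"]

  untag + [prune]
end

now = Time.utc(2017, 11, 6)
images = [
  { id: "a1", refs: ["test/app:1"], created: now - 48 * 3600,
    labels: { "built-by" => "blubber" } },
  { id: "b2", refs: ["test/app:2"], created: now - 3600,
    labels: { "built-by" => "blubber" } },
]
cleanup_plan(images, threshold_hours: 8, now: now).each { |cmd| puts cmd.join(" ") }
```

Keeping the planning separate from execution makes it possible to log or dry-run the commands on a shared host before letting them delete anything.
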
dduvall triaged this task as Normal priority.
dduvall moved this task from Backlog to CI on the Release Pipeline board. Nov 6 2017, 6:17 PM
thcipriani removed thcipriani as the assignee of this task.

Not currently working on this, but may pick it up again in near future.

The other day I used this nasty little Ruby one-liner to clean up docker-pkg images that weren't latest. It wasn't perfect (Docker complained about trying to delete some images that were parents of others) but it's a start.

docker image ls --format "{{.ID}} {{.Tag}}" | ruby -e 'images = {}; while gets; f = $_.split; (images[f[0]] ||= []) << f[1]; end; puts images.reduce([]) { |r, (k, v)| v.include?("latest") ? r : r << k }' | xargs docker rmi
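
For readability, here is the same logic spelled out as a function. It is meant to behave like the one-liner: group tags by image ID, then keep only the IDs with no "latest" tag. The function name and the sample IDs are made up for illustration.

```ruby
# Input: lines as printed by
#   docker image ls --format "{{.ID}} {{.Tag}}"
# Output: IDs of images that have no "latest" tag.
def ids_without_latest(lines)
  tags_by_id = Hash.new { |hash, id| hash[id] = [] }
  lines.each do |line|
    id, tag = line.split
    next if id.nil?          # skip blank lines
    tags_by_id[id] << tag    # one image ID can appear once per tag
  end
  tags_by_id.reject { |_id, tags| tags.include?("latest") }.keys
end

sample = [
  "aaa111 latest",
  "aaa111 0.1.0",
  "bbb222 0.0.9",
  "ccc333 <none>",
]
puts ids_without_latest(sample)
# prints bbb222 and ccc333, one per line
```

The result can still be piped to xargs docker rmi as before. This version does nothing about the parent-image complaints, and untagged (<none>) images count as "not latest" here just as they do in the one-liner.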

This makes sense. docker-pkg should only look for the images that are present in the Docker changelog files of the integration/config repository, so any image tagged with :latest should be the most recent version of that image. Older images, while they may still be referenced in JJB and in Jenkins, can be removed from contint1001 without impacting docker-pkg runs, since they have all been pushed to the Docker registry.

Blubber is more dependent on the Docker cache, but only for test images, not for production images. For test images we could probably remove Blubber images that are older than n, where n ~= 2 weeks.

dduvall claimed this task. Feb 13 2019, 6:58 PM

Change 490505 had a related patch set uploaded (by Dduvall; owner: Dduvall):
[integration/config@master] maintenance: Cleanup old Docker images at a lower threshold

https://gerrit.wikimedia.org/r/490505

Possibly helpful? A Docker tag tool: https://github.com/gofunky/tuplip