
Failed docker build leaves dangling container
Closed, Resolved (Public)


On October 9th, I manually removed dangling Docker images (docker image prune -f). That freed 47 GB of disk space.

I suspect they are related to the service pipeline (Release Pipeline)?

Event Timeline

hashar created this task.Oct 16 2019, 5:17 PM
Restricted Application added a subscriber: Aklapper.Oct 16 2019, 5:17 PM
hashar triaged this task as Medium priority.Oct 16 2019, 5:19 PM
thcipriani renamed this task from contint1001 has lot of dangling Docker images to Release pipeline is creating/not cleaning intermediate dangling images.Apr 15 2020, 4:44 PM

The cause seems to be that the pipeline sometimes leaves containers behind. I pruned all images and containers on contint1001 a few days ago, and right now there is one stopped container:

contint1001$ docker ps -a

CONTAINER ID        IMAGE               COMMAND                   CREATED             STATUS                    PORTS               NAMES
dc74b315089f        87764dba840c        "/bin/sh -c 'npm \"ru…"   42 hours ago        Exited (1) 42 hours ago                       elastic_mayer

That one had the command /bin/sh -c 'npm \"run-script\" \"build-all-portals\"'

Once I got rid of it (docker container prune -f), we were able to remove all the dangling images (docker image prune -f).
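
The order of the two commands matters: Docker will not prune an image that a container, even a stopped one, still references. A minimal sketch of that cleanup, assuming it is acceptable to remove every stopped container on the host (the function name is made up for illustration and is not part of the pipeline):

```shell
# Hypothetical helper: remove stopped containers first, then the dangling
# images those containers were keeping alive.
prune_docker_leftovers() {
    # Stopped containers still hold a reference to the image they were
    # created from, so they have to go before the images can be pruned.
    docker container prune -f || return 1
    # With the containers gone, untagged (dangling) images can be removed.
    docker image prune -f
}
```

Calling prune_docker_leftovers on a CI host would run both steps in sequence and stop early if the container prune fails.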

So I guess the pipeline is missing a docker stop somewhere. The other jobs do it thanks to @dduvall's patch 29ce25b9736e9f0faf01c6f08a132396ec73e376 (T198517: Quibble docker instance running on CI instance for 6 hours).

dancy added a subscriber: dancy.Jul 20 2020, 3:59 PM

Looking at contint1001 today (Mon 20 Jul 2020 03:47:28 PM UTC) I see:

dancy@contint1001:~$ docker ps -a
CONTAINER ID        IMAGE               COMMAND                   CREATED             STATUS                   PORTS               NAMES
2f928c43c0bc        770544497f3e        "/bin/sh -c 'npm \"ru…"   2 weeks ago         Exited (1) 2 weeks ago                       lucid_curie
b086bcd450f6        68cdfaa397a6        "/bin/sh -c 'npm ins…"    3 weeks ago         Exited (1) 3 weeks ago                       zealous_dijkstra
30917e88e8ee        f90cf678284d        "/bin/sh -c 'npm ins…"    4 weeks ago         Exited (1) 4 weeks ago                       sad_albattani
2d4effb4ec86        c7a86cc3aa6e        "/bin/sh -c 'go \"get…"   4 weeks ago         Exited (1) 4 weeks ago                       pedantic_feistel
2631dbfd74ba        fe03896877e6        "/bin/sh -c 'go \"get…"   4 weeks ago         Exited (1) 4 weeks ago                       stoic_pasteur
e06c982e07cc        53987eaf4ff3        "/bin/sh -c 'npm \"ru…"   6 weeks ago         Exited (1) 6 weeks ago                       modest_hofstadter
739caaac7bdd        02bd3a73827a        "/bin/sh -c 'npm \"ru…"   6 weeks ago         Exited (1) 6 weeks ago                       unruffled_meitner

Inspection of the images indicates that they were each part of a (probably interrupted) docker build operation:

dancy@contint1001:~$ for image in $(docker ps -a | grep -v IMAGE | awk '{print $2}'); do echo Image $image; docker inspect $image | jq .[0].Created ; docker inspect $image | jq .[0].ContainerConfig.Cmd; done
Image 770544497f3e
  "#(nop) COPY --chown=65533:65533dir:5ea207200e087d7318f8efff05d89bb7fa313c8a3a913863fedded6d597fae20 in ./ "
Image 68cdfaa397a6
  "#(nop) COPY --chown=65533:65533multi:bae03ba76a65f1269348b8451b412614914c919f41e18aa3a29ec4dab0746ce7 in ./ "
Image f90cf678284d
  "#(nop) COPY --chown=65533:65533file:87b7e8743cda1f9e03f10d86dbb248793a507acecb1150fc429c260d08c5436d in ./ "
Image c7a86cc3aa6e
  "#(nop) WORKDIR /srv/app"
Image fe03896877e6
  "#(nop) ",
  "ENV GOPATH=/usr/share/gocode"
Image 53987eaf4ff3
  "#(nop) COPY --chown=65533:65533dir:d96ca28d76f5004e0f04b000277fb659b1b16e2540211dab613331175a87bb54 in src/ "
Image 02bd3a73827a
  "#(nop) COPY --chown=65533:65533dir:ca88c1862f6b9285c26eae2ac8492cff5fc564fa98413d7795da94cc606601b9 in src/ "
dancy added a comment.Jul 20 2020, 7:50 PM

I have confirmed that docker build will leave a container around if a build step fails.
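
One way to check this is sketched below. It assumes the classic builder (it may not reproduce where BuildKit is the default); the function name and the busybox base image are arbitrary choices for the illustration, not something taken from the pipeline.

```shell
# Hypothetical reproduction helper: builds a Dockerfile whose only RUN step
# fails, then lists exited containers.
reproduce_leftover_container() {
    dir=$(mktemp -d) || return 1
    # busybox is an arbitrary small base image chosen for this sketch.
    printf 'FROM busybox\nRUN false\n' > "$dir/Dockerfile"
    # The build is expected to fail at the RUN step.
    docker build "$dir" || echo 'build failed, as expected' >&2
    # A container left behind by the failed step shows up as exited.
    docker ps -a --filter status=exited
    rm -rf "$dir"
}
```

If the failed step left its container behind, the listing should include an entry whose command starts with /bin/sh -c, like the ones shown earlier in this task.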

dancy added a comment.Jul 20 2020, 7:54 PM

From the docker build docs, on the --force-rm option:

   Always remove intermediate containers, even after unsuccessful builds. The default is false.
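
The option described here is docker build --force-rm. A minimal sketch of a build wrapper that uses it follows; the function name and its arguments are hypothetical, and this is not necessarily how pipelinelib invokes the build.

```shell
# Hypothetical wrapper: build an image and ask Docker to remove intermediate
# containers even when a build step fails.
#   $1  image tag to apply (any tag; the example below is made up)
#   $2  build context directory
build_image() {
    docker build --force-rm -t "$1" "$2"
}
```

For example, build_image example/app:latest . would build the current directory and clean up the intermediate containers whether or not the build succeeds.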
dancy claimed this task.Jul 20 2020, 7:55 PM
dancy updated the task description.

Change 614851 had a related patch set uploaded (by Ahmon Dancy; owner: Ahmon Dancy):
[integration/pipelinelib@master] Prevent container leak if docker build fails

I ran this today:

dancy@contint1001:~$ docker container prune
WARNING! This will remove all stopped containers.
Are you sure you want to continue? [y/N] y
Deleted Containers:

Total reclaimed space: 173.8MB
dancy renamed this task from Release pipeline is creating/not cleaning intermediate dangling images to Failed docker build leaves dangling container.Jul 21 2020, 9:04 PM
dancy added a project: User-dancy.
dancy moved this task from Backlog to Awaiting review/merge on the User-dancy board.

Change 614851 merged by jenkins-bot:
[integration/pipelinelib@master] Prevent container leak if docker build fails

dancy closed this task as Resolved.Jul 22 2020, 2:42 AM
dancy removed a project: User-dancy.

The shell one-liner caught all the dangling containers. This task was originally about images, and indeed they come from builds. Pipeline builds run on both contint1001 and contint2001. I pruned all dangling containers AND images on both hosts.

Well done Dancy!

contint1001$ docker image prune -f
Deleted Images:
Total reclaimed space: 36.68GB