
[builds-cli] No obvious way to delete individual `toolforge build` generated artifacts other than `toolforge clean`
Open, Stalled, Medium | Public | BUG REPORT

Description

Steps to replicate the issue (include links if applicable):
What I did:

Become the bridgebot tool; try to kick off a build (toolforge build start https://gitlab.wikimedia.org/toolforge-repos/bridgebot); get a quota error (“DENIED: adding 199.7 MiB of storage resource, which when updated to current usage of 938.6 MiB will exceed the configured upper limit of 1.0 GiB.”). Inspect the builds:

tools.bridgebot@tools-bastion-13:~$ toolforge build list                                                                                                                                                                                                                                              
build_id                                status    start_time            end_time              source_url                                                           ref                        envvars    destination_image                                                                            
bridgebot-buildpacks-pipelinerun-ghk8n  error     2024-06-24T19:03:25Z  2024-06-24T19:05:03Z  https://gitlab.wikimedia.org/toolforge-repos/bridgebot               N/A                        N/A        tools-harbor.wmcloud.org/tool-bridgebot/tool-bridgebot:latest                                
bridgebot-buildpacks-pipelinerun-z5npk  ok        2024-06-19T08:11:53Z  2024-06-19T08:14:10Z  https://gitlab.wikimedia.org/toolforge-repos/bridgebot               N/A                        N/A        tools-harbor.wmcloud.org/tool-bridgebot/tool-bridgebot:latest                                
bridgebot-buildpacks-pipelinerun-mqkrz  ok        2024-04-27T22:42:24Z  2024-04-27T22:43:46Z  https://gitlab.wikimedia.org/toolforge-repos/wikibugs2-znc           N/A                        N/A        tools-harbor.wmcloud.org/tool-bridgebot/znc:latest                                           
bridgebot-buildpacks-pipelinerun-k4ddk  ok        2024-04-27T22:29:48Z  2024-04-27T22:31:16Z  https://gitlab.wikimedia.org/toolforge-repos/wikibugs2-znc           N/A                        N/A        tools-harbor.wmcloud.org/tool-bridgebot/znc:latest                                           
bridgebot-buildpacks-pipelinerun-lvm2d  ok        2024-04-25T21:29:00Z  2024-04-25T21:30:54Z  https://gitlab.wikimedia.org/toolforge-repos/bridgebot               N/A                        N/A        tools-harbor.wmcloud.org/tool-bridgebot/tool-bridgebot:latest                                
bridgebot-buildpacks-pipelinerun-knkch  error     2024-04-24T17:57:03Z  2024-04-24T17:57:28Z  https://gitlab.wikimedia.org/toolforge-repos/bridgebot               N/A                        N/A        tools-harbor.wmcloud.org/tool-bridgebot/tool-bridgebot:latest                                
bridgebot-buildpacks-pipelinerun-bxnml  error     2024-04-16T03:40:59Z  2024-04-16T03:42:16Z  https://gitlab.wikimedia.org/toolforge-repos/bridgebot-matterbridge  work/bd808/try-some-hacks  N/A        tools-harbor.wmcloud.org/tool-bridgebot/tool-bridgebot:latest                                

Delete several builds, with several build attempts in between, until we’re down to just two builds:

tools.bridgebot@tools-bastion-13:~$ toolforge build list                                                                                                                                                                                                                                              
build_id                                status    start_time            end_time              source_url                                                  ref    envvars    destination_image                                                                                                         
bridgebot-buildpacks-pipelinerun-z5npk  ok        2024-06-19T08:11:53Z  2024-06-19T08:14:10Z  https://gitlab.wikimedia.org/toolforge-repos/bridgebot      N/A    N/A        tools-harbor.wmcloud.org/tool-bridgebot/tool-bridgebot:latest                                                             
bridgebot-buildpacks-pipelinerun-mqkrz  ok        2024-04-27T22:42:24Z  2024-04-27T22:43:46Z  https://gitlab.wikimedia.org/toolforge-repos/wikibugs2-znc  N/A    N/A        tools-harbor.wmcloud.org/tool-bridgebot/znc:latest                                                                        

What happens?:
The quota is still reported as basically full:

tools.bridgebot@tools-bastion-13:~$ toolforge build quota                                                                                           
Registry                                                                                                                                                                                                                                                                                              
===================                                                                                                                                                                                                                                                                                   
Storage                                                                                                                                                                                                                                                                                               
-----------                                                                                                                                                                                                                                                                                           
Available    85.17Mi                                                                                                                               
Capacity     92%                                                                                                                                   
Limit        1.00Gi
Used         938.83Mi

And new builds still fail to store the resulting image.
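
The registry's DENIED message quoted above boils down to simple addition. A minimal sketch of that check, assuming the limit is enforced as used + incoming <= limit and that 1.0 GiB means 1024 MiB. The function names are illustrative and are not part of the Toolforge CLI or of Harbor:

```python
MIB_PER_GIB = 1024  # assumption: the "1.0 GiB" limit is 1024 MiB


def fits_in_quota(used_mib: float, incoming_mib: float,
                  limit_mib: float = MIB_PER_GIB) -> bool:
    """Return True if storing `incoming_mib` more keeps usage within the limit."""
    return used_mib + incoming_mib <= limit_mib


def capacity_percent(used_mib: float, limit_mib: float = MIB_PER_GIB) -> int:
    """Used share of the limit, rounded to a whole percent."""
    return round(100 * used_mib / limit_mib)


if __name__ == "__main__":
    # Figures from the error message: 938.6 MiB used, 199.7 MiB incoming.
    print(fits_in_quota(938.6, 199.7))   # False: 1138.3 MiB > 1024 MiB
    # Figure from the `toolforge build quota` output above.
    print(capacity_percent(938.83))      # 92
```

With roughly 200 MiB per image, usage has to drop below about 824 MiB before a new image under a different name can be stored.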

What should have happened instead?:
Given that I deleted most builds, I would expect the tool to have enough free space for at least one new build.

Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):
Toolforge Builds CLI, version 0.0.16

Other information (browser name/version, screenshots, etc.):
Unfortunately I didn’t check the quota before the first build, but I still have the terminal scrollback, and it seems notable that it doesn’t have a message about the quota being near the limit:

tools.bridgebot@tools-bastion-13:~$ toolforge build start https://gitlab.wikimedia.org/toolforge-repos/bridgebot                                                                                                                                                                                      
Waiting for the logs... if the build just started this might take a minute                                                                                                                                                                                                                            
[place-tools] 2024-06-24T19:03:28.582717254Z 2024/06/24 19:03:28 Copied /ko-app/entrypoint to /tekton/bin/entrypoint                                                                                                                                                                                  
[step-init] 2024-06-24T19:03:29.428446916Z 2024/06/24 19:03:29 Setup /step directories                                                                                                                                                                                                                

Whereas later builds, starting with the next one already (note that this was immediately after I deleted four builds), did have that message:

tools.bridgebot@tools-bastion-13:~$ toolforge build start https://gitlab.wikimedia.org/toolforge-repos/bridgebot                                                                                                                                                                                      
Warning: Tool bridgebot has used up 92% of it's alloted quota.To avoid the possibility of your build failing, run "toolforge build clean" to free up quota.                                                                                                                                           
Waiting for the logs... if the build just started this might take a minute                                                                                                                                                                                                                            
[place-tools] 2024-06-24T19:06:44.836099757Z 2024/06/24 19:06:44 Copied /ko-app/entrypoint to /tekton/bin/entrypoint                                                                                                                                                                                  
[step-init] 2024-06-24T19:06:45.694553003Z 2024/06/24 19:06:45 Setup /step directories                                                                                                                                                                                                                

Event Timeline

bd808 changed the task status from Open to In Progress. Jul 5 2024, 5:16 PM
bd808 claimed this task.
bd808 triaged this task as High priority.
bd808 subscribed.

Let's start by deleting the unused build of https://gitlab.wikimedia.org/toolforge-repos/wikibugs2-znc:

$ toolforge build delete "bridgebot-buildpacks-pipelinerun-mqkrz"
I'm going to delete 1 builds, continue? [y/N]: y
Deleted 1 builds
$ toolforge build quota
Registry
===================
Storage
-----------
Available    140.65Mi
Capacity     86%
Limit        1.00Gi
Used         883.35Mi

We may as well drop the failed build too while we are pruning cruft:

$ toolforge build delete "bridgebot-buildpacks-pipelinerun-9t7s9"
I'm going to delete 1 builds, continue? [y/N]: y
Deleted 1 builds
$ toolforge build quota
Registry
===================
Storage
-----------
Available    140.65Mi
Capacity     86%
Limit        1.00Gi
Used         883.35Mi

Building to an alternate image name, out of an abundance of caution, turned up a weird bug:

$ toolforge build start https://gitlab.wikimedia.org/toolforge-repos/bridgebot --image-name T368317
...

[step-detect] 2024-07-05T17:21:49.204212139Z 2 of 4 buildpacks participating
[step-detect] 2024-07-05T17:21:49.204269057Z heroku/go       0.1.13
[step-detect] 2024-07-05T17:21:49.204282595Z heroku/procfile 2.0.2

...

[step-analyze] 2024-07-05T17:21:49.394848683Z ERROR: failed to resolve inputs: could not parse reference: tools-harbor.wmcloud.org/tool-bridgebot/T368317:latest

...

[step-export] 2024-07-05T17:21:51.613114564Z 2024/07/05 17:21:51 Skipping step because a previous step failed
[step-results] 2024-07-05T17:21:51.904497795Z 2024/07/05 17:21:51 Skipping step because a previous step failed

That "could not parse reference" error seems to be coming from the heroku/go v0.1.13 buildpack?

The error seems to be triggered by the --image-name value. Using --image-name bb works with this non-fatal error output:

[step-analyze] 2024-07-05T17:26:27.567441490Z Image with name "tools-harbor.wmcloud.org/tool-bridgebot/bb:latest" not found
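
One likely explanation, not confirmed in this task, is that repository path components in container image references must be lowercase, so T368317 is rejected while bb is accepted. A minimal sketch of a pre-flight check, assuming the path-component grammar used by Docker-style image references. The helper name is hypothetical, and the builds CLI is not known to perform this check:

```python
import re

# Assumed grammar for one repository path component: lowercase alphanumerics,
# optionally separated by ".", "_", "__" or one or more "-".
_COMPONENT_RE = re.compile(r"[a-z0-9]+(?:(?:\.|_|__|-+)[a-z0-9]+)*")


def is_valid_image_name(name: str) -> bool:
    """Return True if `name` is usable as a single repository path component."""
    return _COMPONENT_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("T368317", "t368317", "bb"):
        print(candidate, is_valid_image_name(candidate))
```

Under this assumption, lowercasing the task ID (t368317) would have been enough to get past the analyze step.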

However, we still don't have enough quota space, at least when the build does not fully replace the existing image:

$ toolforge build start https://gitlab.wikimedia.org/toolforge-repos/bridgebot --image-name bb
...
[step-export] 2024-07-05T17:27:38.194576230Z       tools-harbor.wmcloud.org/tool-bridgebot/bb:latest - PUT https://tools-harbor.wmcloud.org/v2/tool-bridgebot/bb/blobs/uploads/bfa5a9bc-f983-4e0b-adcf-7386790a1a7e?_state=REDACTED&digest=sha256%3A24d6ab1bfee881c131468dac2e25654c6d469d3116727286e14e46f639fe0867: DENIED: adding 201.8 MiB of storage resource, which when updated to current usage of 938.6 MiB will exceed the configured upper limit of 1.0 GiB.
[step-export] 2024-07-05T17:27:38.218685307Z ERROR: failed to export: failed to write image to the following tags: [tools-harbor.wmcloud.org/tool-bridgebot/bb:latest: PUT https://tools-harbor.wmcloud.org/v2/tool-bridgebot/bb/blobs/uploads/bfa5a9bc-f983-4e0b-adcf-7386790a1a7e?_state=REDACTED&digest=sha256%3A24d6ab1bfee881c131468dac2e25654c6d469d3116727286e14e46f639fe0867: DENIED: adding 201.8 MiB of storage resource, which when updated to current usage of 938.6 MiB will exceed the configured upper limit of 1.0 GiB.]
[step-export] 2024-07-05T17:27:38.231931293Z |=================================================================|
[step-export] 2024-07-05T17:27:38.232126499Z | The above error is likely because the build is bigger than your |
[step-export] 2024-07-05T17:27:38.232327845Z | current available quota.                                        |
[step-export] 2024-07-05T17:27:38.232350096Z | To check your available quota run: "toolforge build quota".     |
[step-export] 2024-07-05T17:27:38.232361897Z | To free up space you can run: "toolforge build clean".          |
[step-export] 2024-07-05T17:27:38.232427780Z | If the error persists,                                          |
[step-export] 2024-07-05T17:27:38.232486870Z | please report to the Toolforge admins: https://w.wiki/6Zuu      |
[step-export] 2024-07-05T17:27:38.232498284Z |=================================================================|
[step-results] 2024-07-05T17:27:38.375300088Z 2024/07/05 17:27:38 Skipping step because a previous step failed
$ toolforge build quota
Registry
===================
Storage
-----------
Available    140.65Mi
Capacity     86%
Limit        1.00Gi
Used         883.35Mi

$ toolforge build clean
NOTE: This will remove all your built images to clean up space, you'll have to start a new build to create a new image before being able to restart any running webservice/job or start a new one, are you sure? [y/N]: y
Result of the cleanup: Deleted 1 artifacts from harbor repository tool-bridgebot/znc
Deleted 1 artifacts from harbor repository tool-bridgebot/matterbridge
Deleted 1 artifacts from harbor repository tool-bridgebot/tool-bridgebot

If you still need more space, please contact a toolforge maintainer (https://w.wiki/6Zuu).
$ toolforge build quota
Registry
===================
Storage
-----------
Available    1.00Gi
Capacity     0%
Limit        1.00Gi
Used         0B
$ toolforge build start https://gitlab.wikimedia.org/toolforge-repos/bridgebot
...
[step-results] 2024-07-05T20:40:23.873169563Z Built image tools-harbor.wmcloud.org/tool-bridgebot/tool-bridgebot:latest@sha256:161b7efa8225d46e41fca1fae7e8d91100c57364979f0f72addb33ac905f95be
$ toolforge build quota
Registry
===================
Storage
-----------
Available    765.82Mi
Capacity     25%
Limit        1.00Gi
Used         258.18Mi

Running toolforge build clean apparently gets rid of things that you can't see with toolforge build list or, by extension, delete with toolforge build delete.
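
The effect of each step can be quantified from the quota output pasted above. A minimal sketch that parses that text and reports how much space an operation freed, assuming "Mi" and "Gi" are binary units and that the output layout stays as shown. The parser is illustrative and not part of the Toolforge CLI:

```python
import re

# Conversion factors to MiB (assumption: binary units).
_TO_MIB = {"B": 1 / (1024 * 1024), "Ki": 1 / 1024, "Mi": 1.0, "Gi": 1024.0}


def parse_size_mib(text: str) -> float:
    """Convert a size such as '883.35Mi', '1.00Gi' or '0B' to MiB."""
    match = re.fullmatch(r"([0-9.]+)(B|Ki|Mi|Gi)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return float(match.group(1)) * _TO_MIB[match.group(2)]


def parse_quota(output: str) -> dict:
    """Extract the Available, Limit and Used rows (in MiB) from quota output."""
    fields = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in {"Available", "Limit", "Used"}:
            fields[parts[0].lower()] = parse_size_mib(parts[1])
    return fields


BEFORE_CLEAN = """\
Registry
===================
Storage
-----------
Available    140.65Mi
Capacity     86%
Limit        1.00Gi
Used         883.35Mi
"""

AFTER_CLEAN = BEFORE_CLEAN.replace("140.65Mi", "1.00Gi") \
                          .replace("86%", "0%") \
                          .replace("883.35Mi", "0B")

if __name__ == "__main__":
    freed = parse_quota(BEFORE_CLEAN)["used"] - parse_quota(AFTER_CLEAN)["used"]
    print(f"clean freed {freed:.2f} MiB")
```

By the same arithmetic, deleting the mqkrz build freed 938.83 - 883.35 = 55.48 MiB, and deleting the 9t7s9 build freed nothing.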

bd808 renamed this task from bridgebot tool build service quota not going down to [builds-cli] No obvious way to delete individual `toolforge build` generated artifacts other than `toolforge clean`. Jul 5 2024, 8:47 PM
bd808 removed bd808 as the assignee of this task.
bd808 lowered the priority of this task from High to Medium.
bd808 removed a project: User-bd808.

Mentioned in SAL (#wikimedia-cloud) [2024-07-05T20:55:40Z] <wmbot~bd808@tools-bastion-12> Restarted bridgebot job to pick up new container build (T368317, MR!8)

This is half-intentional, in the sense that we decided to avoid exposing the concept of 'images' to users, so there's no interface to 'manage images' as such, similar to heroku.

There is the point, though, that a build is not the same as an image: several builds might end up producing the same 'image' (same image identifier), so deleting a build should not delete an image that another build also pushed. Things get tricky there.
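
The many-builds-to-one-image relationship can be sketched with the build list from the task description. This is only an illustration of the bookkeeping problem, with shortened build IDs and a hypothetical helper. It is not how the build service is implemented:

```python
from collections import defaultdict

# (build id suffix, destination image) pairs taken from the first
# `toolforge build list` output in this task.
BUILDS = [
    ("ghk8n", "tool-bridgebot/tool-bridgebot:latest"),
    ("z5npk", "tool-bridgebot/tool-bridgebot:latest"),
    ("mqkrz", "tool-bridgebot/znc:latest"),
    ("k4ddk", "tool-bridgebot/znc:latest"),
    ("lvm2d", "tool-bridgebot/tool-bridgebot:latest"),
    ("knkch", "tool-bridgebot/tool-bridgebot:latest"),
    ("bxnml", "tool-bridgebot/tool-bridgebot:latest"),
]


def images_safe_to_delete(builds, deleted_ids):
    """Return images for which every build that targeted them has been deleted."""
    builds_by_image = defaultdict(set)
    for build_id, image in builds:
        builds_by_image[image].add(build_id)
    deleted = set(deleted_ids)
    return {image for image, ids in builds_by_image.items() if ids <= deleted}


if __name__ == "__main__":
    # Deleting one of the two znc builds leaves the image still referenced.
    print(images_safe_to_delete(BUILDS, {"mqkrz"}))
    # Only once both are gone could the znc image be dropped safely.
    print(images_safe_to_delete(BUILDS, {"mqkrz", "k4ddk"}))
```

Even this simple rule ignores failed builds that never pushed anything and tags that were overwritten by later builds, which is part of why per-build deletion of images is awkward.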

In summary, it's not that there's no obvious way to delete a single artifact; there is just no way of doing it, by design. That will improve once we introduce 'components' and such, where you can address components by name instead of --image-name, with something like toolforge build clean --component znc that will clean up anything related to that component.

In summary, for now the only way to free quota for the build service is toolforge build clean, and that will remove all the images you have. There's a mid-term plan to enable that per component, but it will take a bit. If this is blocking or critical, we can try to find a workaround before that, so let us know if so.

> This is half-intentional, in the sense that we decided to avoid exposing the concept of 'images' to users, so there's no interface to 'manage images' as such, similar to heroku.

The --image-name CLI arg does seem to expose the concept of images directly to the user, though. It seems that you chose to expose the "build" container and logs, the remnants of the "build" action, as the RESTful noun. It feels like it would be more intuitive to the end user to expose "image" as the noun instead: toolforge image build ..., toolforge image list, toolforge image delete ....

> If this is blocking or critical, we can try to find a workaround before that, so let us know if so.

At this point it seems likely that this is only a power user or experimental learner problem. The problem is most likely to manifest when --image-name has been used to store more than one image in the tool's quota-restricted registry namespace.

> > This is half-intentional, in the sense that we decided to avoid exposing the concept of 'images' to users, so there's no interface to 'manage images' as such, similar to heroku.

> The --image-name CLI arg does seem to expose the concept of images directly to the user, though. It seems that you chose to expose the "build" container and logs, the remnants of the "build" action, as the RESTful noun. It feels like it would be more intuitive to the end user to expose "image" as the noun instead: toolforge image build ..., toolforge image list, toolforge image delete ....

Yep, that option was a leak from the early phase, when we allowed passing most Tekton task options through the CLI while we were investigating. It was not removed because it enables this "temporary™" hack for people who want more than one component before we actually support components natively, though the risk of people starting to rely on it, and the possible confusion of half-exposing an interface, were considered.

> If this is blocking or critical, we can try to find a workaround before that, so let us know if so.

> At this point it seems likely that this is only a power user or experimental learner problem. The problem is most likely to manifest when --image-name has been used to store more than one image in the tool's quota-restricted registry namespace.

👍, will focus on getting the components first then.

fnegri changed the task status from In Progress to Stalled. Jan 5 2026, 6:41 PM
fnegri added a project: cloud-services-team.