[builds-service] builds not working due to access issues in tools
Closed, Resolved (Public)

Description

tools.wm-lol@tools-bastion-13:~$ toolforge build start https://gitlab.wikimedia.org/toolforge-repos/sample-static-buildpack-app
...

[step-analyze] 2025-08-26T13:04:15.916827667Z ERROR: failed to initialize analyzer: validating registry write access: ensure registry read/write access to tools-harbor.wmcloud.org/tool-wm-lol/tool-wm-lol:latest

Event Timeline

dcaro changed the task status from Open to In Progress. (Aug 26 2025, 1:06 PM)
dcaro triaged this task as High priority.

Trying to pull and then push a single image from a tool repo using the robot account ends in a 500 error:

dcaro@acme$ podman login https://tools-harbor.wmcloud.org
Username: ****************
Password: 
Login Succeeded!

dcaro@acme$ podman pull tools-harbor.wmcloud.org/tool-wm-lol/tool-wm-lol:latest
Trying to pull tools-harbor.wmcloud.org/tool-wm-lol/tool-wm-lol:latest...
Getting image source signatures
Copying blob 62aa935efb08 done   | 
Copying blob 01007420e9b0 done   | 
Copying blob df2d2178126f done   | 
Copying blob 4376f68108d7 done   | 
Copying blob aad43203853c done   | 
Copying blob 7443f334f93e done   | 
Copying blob 311f7a406a90 done   | 
Copying blob 8121ce349454 skipped: already exists  
Copying blob 27bc2b3efec7 skipped: already exists  
Copying blob 7278bf0dcce7 done   | 
Copying blob 7dd4876b9485 done   | 
Copying blob 76c7a0f0f628 done   | 
Copying blob 5a44e4f7b58d skipped: already exists  
Copying config 8d6835ed79 done   | 
Writing manifest to image destination
8d6835ed7907db950e394c1ddb30e3266db964e21736f08b7f616e7be2041c47

dcaro@acme$ podman push tools-harbor.wmcloud.org/tool-wm-lol/tool-wm-lol:latest
Getting image source signatures
Copying blob 4376f68108d7 skipped: already exists  
Copying blob aad43203853c skipped: already exists  
Copying blob df2d2178126f skipped: already exists  
Copying blob 62aa935efb08 skipped: already exists  
Copying blob 7443f334f93e skipped: already exists  
Copying blob 311f7a406a90 skipped: already exists  
Copying blob 8121ce349454 skipped: already exists  
Copying blob 7278bf0dcce7 skipped: already exists  
Copying blob 27bc2b3efec7 skipped: already exists  
Copying blob 01007420e9b0 skipped: already exists  
Copying blob 76c7a0f0f628 skipped: already exists  
Copying blob 5a44e4f7b58d skipped: already exists  
Copying blob 7dd4876b9485 skipped: already exists  
Copying config 8d6835ed79 done   | 
Writing manifest to image destination
Error: writing manifest: uploading manifest latest to tools-harbor.wmcloud.org/tool-wm-lol/tool-wm-lol: received unexpected HTTP status: 500 Internal Server Error

On the registry container there's not much more info:

root@tools-harbor-2:/srv/ops/harbor# docker logs --tail 1000 registry 2>&1 | grep 'http.response.status=500'
        status code: 403, request id: tx00000a2a1faca662a9065-0068adaf90-49b407d7-default, host id: " err.message="unknown error" go.version=go1.23.4 http.request.contenttype="application/json" http.request.host=tools-harbor.wmcloud.org http.request.id=2004574e-7ec3-449a-85f0-736f0f84b257 http.request.method=POST http.request.remoteaddr=172.16.19.232 http.request.uri="/v2/tool-wm-lol/tool-wm-lol/blobs/uploads/" http.request.useragent="go-containerregistry/v0.16.1" http.response.contenttype="application/json; charset=utf-8" http.response.duration=222.561846ms http.response.status=500 http.response.written=123 vars.name="tool-wm-lol/tool-wm-lol" 
        status code: 403, request id: tx0000060905c3448eaee52-0068adb0aa-49bbedd8-default, host id: " err.message="unknown error" go.version=go1.23.4 http.request.contenttype="application/json" http.request.host=tools-harbor.wmcloud.org http.request.id=0935e97c-69a5-4a58-9860-24001652967d http.request.method=POST http.request.remoteaddr=172.16.19.232 http.request.uri="/v2/tool-wm-lol/tool-wm-lol/blobs/uploads/" http.request.useragent="go-containerregistry/v0.16.1" http.response.contenttype="application/json; charset=utf-8" http.response.duration=218.273028ms http.response.status=500 http.response.written=123 vars.name="tool-wm-lol/tool-wm-lol" 
        status code: 403, request id: tx0000060e943ea61300b49-0068adb0cf-49b407d7-default, host id: " err.message="unknown error" go.version=go1.23.4 http.request.contenttype="application/json" http.request.host=tools-harbor.wmcloud.org http.request.id=753d249c-4a8a-4ee3-887b-7c856e88d1e1 http.request.method=POST http.request.remoteaddr=172.16.19.232 http.request.uri="/v2/tool-wm-lol/tool-wm-lol/blobs/uploads/" http.request.useragent="go-containerregistry/v0.16.1" http.response.contenttype="application/json; charset=utf-8" http.response.duration=249.815997ms http.response.status=500 http.response.written=123 vars.name="tool-wm-lol/tool-wm-lol" 
        status code: 403, request id: tx00000280e2533d120c7c4-0068adb303-49bbedd8-default, host id: " err.message="unknown error" go.version=go1.23.4 http.request.contenttype="application/vnd.docker.distribution.manifest.v2+json" http.request.host=tools-harbor.wmcloud.org http.request.id=37f62c9a-aafb-478f-b2d3-d389f0bce56b http.request.method=PUT http.request.remoteaddr=172.16.19.232 http.request.uri="/v2/tool-wm-lol/tool-wm-lol/manifests/latest" http.request.useragent="containers/5.35.0 (github.com/containers/image)" http.response.contenttype="application/json; charset=utf-8" http.response.duration=218.635995ms http.response.status=500 http.response.written=123 vars.name="tool-wm-lol/tool-wm-lol" vars.reference=latest 
        status code: 403, request id: tx00000d5b5d4465232db94-0068adb311-49bbedd8-default, host id: " err.message="unknown error" go.version=go1.23.4 http.request.contenttype="application/json" http.request.host=tools-harbor.wmcloud.org http.request.id=616b38c3-43d1-4d72-a653-2143c5de79cf http.request.method=POST http.request.remoteaddr=172.16.19.232 http.request.uri="/v2/tool-wm-lol/tool-wm-lol/blobs/uploads/" http.request.useragent="go-containerregistry/v0.16.1" http.response.contenttype="application/json; charset=utf-8" http.response.duration=222.318978ms http.response.status=500 http.response.written=123 vars.name="tool-wm-lol/tool-wm-lol" 
        status code: 403, request id: tx000005da61ba09228c632-0068adb424-49cf82e5-default, host id: " err.message="unknown error" go.version=go1.23.4 http.request.contenttype="application/json" http.request.host=tools-harbor.wmcloud.org http.request.id=a7e5ec2d-ec11-4dba-8402-39a467499001 http.request.method=POST http.request.remoteaddr=172.16.19.232 http.request.uri="/v2/tool-tools-edit-count/tool-tools-edit-count/blobs/uploads/" http.request.useragent="go-containerregistry/v0.16.1" http.response.contenttype="application/json; charset=utf-8" http.response.duration=274.031515ms http.response.status=500 http.response.written=123 vars.name="tool-tools-edit-count/tool-tools-edit-count"

Using the admin account also fails, so the error is not a permissions issue but something else.

Might be worth checking the storage quota of the harborstorage S3 bucket. The fact that it was working initially but stopped suddenly makes me think it's something to do with the storage quota. Let me check.

yeaaaa, I think I see where the problem is coming from.

raymond-ndibe@cloudcontrol1006:~$ sudo radosgw-admin user info --uid tools\$tools
{
    "user_id": "tools$tools",
...
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": true,
        "check_on_raw": false,
        "max_size": 53687091200,
        "max_size_kb": 52428800,
        "max_objects": 51107
    },
...
}

max_size_kb is 52428800, which is equivalent to 50 GB, and the storage usage shown on Horizon is 49.9 GB. I just manually ran garbage collection on Harbor. Let me see if I can build something rn.
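
For reference, the unit conversion behind that statement can be checked with plain shell arithmetic. This is only a sketch: the helper names kib_to_gib and bytes_to_gib are made up for illustration, and the two input values are copied from the user_quota block above.

```shell
# Hypothetical helpers (not part of any Ceph tooling) that convert the
# quota values reported by radosgw-admin into GiB using integer division.
kib_to_gib()   { echo $(( $1 / 1024 / 1024 )); }
bytes_to_gib() { echo $(( $1 / 1024 / 1024 / 1024 )); }

kib_to_gib 52428800        # max_size_kb from user_quota, prints 50
bytes_to_gib 53687091200   # max_size from user_quota, prints 50
```

Both fields describe the same 50 GiB limit, so a reported usage of 49.9 GB leaves almost no headroom for new uploads.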

Mentioned in SAL (#wikimedia-cloud) [2025-08-26T13:42:06Z] <dcaro> extended object storage quota to 100G (T402923)
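
The exact command used for the quota change is not recorded in this task. As a rough sketch, a user-scoped quota change with radosgw-admin might look like the following, assuming that "100G" means 100 GiB and that the quota was set on the same tools$tools user shown earlier. The gib_to_bytes helper is hypothetical, and the command is only printed here, not executed.

```shell
# Hypothetical helper: convert GiB to bytes for the --max-size flag.
gib_to_bytes() { echo $(( $1 * 1024 * 1024 * 1024 )); }

# Assumption: "100G" in the SAL entry means 100 GiB.
new_quota_bytes=$(gib_to_bytes 100)

# Dry run: print the command instead of running it against the cluster.
# The uid is single-quoted so the shell does not expand "$tools".
echo radosgw-admin quota set --quota-scope=user \
    --uid='tools$tools' --max-size="$new_quota_bytes"
```

Dropping the leading echo would apply the change, which requires admin access to the Ceph cluster.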

yeaa, exactly. This should fix it

@Raymond_Ndibe I increased the quota too. For issues like this, can you drop into IRC instead? It's way easier to coordinate.

yeaaa, had no idea you were already on it

dcaro claimed this task.
dcaro moved this task from Next Up to Done on the Toolforge (Toolforge iteration 23) board.

Created a subtask to follow up