Page MenuHomePhabricator

Buildkit erroring with "cannot reuse body, request must be retried" upon multi-platform push
Open, Needs TriagePublic

Description

(edit: was "WMF container registry does not accept a manifest list (aka OCI manifest index, or "fat" manifest" but retitled as this seems to be a condition on the client side)


When attempting to publish a multi-platform image to the WMF container registry, buildctl/BuildKit fails with: error: failed to solve: failed commit on ref "manifest-sha256:b1f76c874d0417382df17036e0f8a732281762fb85694d02378e3a175d1c103f": cannot reuse body, request must be retried

See https://gitlab.wikimedia.org/repos/releng/blubber/-/jobs/28242

...
#27 exporting to image
#27 exporting layers
#27 exporting layers 0.9s done
#27 exporting manifest sha256:b1f76c874d0417382df17036e0f8a732281762fb85694d02378e3a175d1c103f done
#27 exporting config sha256:268701062a811129e68e6d0f9a53bdd64fea1e8e083a955f4cfc98aacd591322 done
#27 exporting manifest sha256:76c748e512209fca21610a44789f4911e8ac6526a93210678a44398c9fb28041 done
#27 exporting config sha256:81c11b26f86c1d88d784580e9d7567714347712696a637d4066ab9e63bdb9ca0 done
#27 exporting manifest list sha256:9828d7ddccdf65bde8b25d05794309794c2c3b87d185fe0d00ff4824ef47f503 done
#27 pushing layers
#27 pushing layers 2.2s done
#27 pushing manifest for docker-registry.discovery.wmnet/repos/releng/blubber/buildkit:v0.14.0@sha256:9828d7ddccdf65bde8b25d05794309794c2c3b87d185fe0d00ff4824ef47f503
#27 pushing manifest for docker-registry.discovery.wmnet/repos/releng/blubber/buildkit:v0.14.0@sha256:9828d7ddccdf65bde8b25d05794309794c2c3b87d185fe0d00ff4824ef47f503 0.3s done
#27 ERROR: failed commit on ref "manifest-sha256:b1f76c874d0417382df17036e0f8a732281762fb85694d02378e3a175d1c103f": cannot reuse body, request must be retried
------
 > exporting to image:
------
error: failed to solve: failed commit on ref "manifest-sha256:b1f76c874d0417382df17036e0f8a732281762fb85694d02378e3a175d1c103f": cannot reuse body, request must be retried

This may be related to an upstream bug in containerd. Note that we're using OCI workers for building, not containerd. However, containerd is used to push.

There seems to have been a fix merged as well, so I'll see about getting this into our BuildKit build.

Event Timeline

I think this might require an update of our docker-registry (which we have not planned for currently). Is this blocking something apart from testing with multi arch images (T272500)?

I think this might require an update of our docker-registry (which we have not planned for currently). Is this blocking something apart from testing with multi arch images (T272500)?

It's possible this blocks T321316: Self-build and publish buildkit helper images as the dockerfile-copy image we're currently trying to vendor/rebuild is a multi-platform image.

Do you see anything in the registry access logs from gitlab-runner1004.eqiad.wmnet around 2022-11-04T22:43:00Z?

I took a quick look and AIUI our registry does support application/vnd.docker.distribution.manifest.list.v2+json already. That is really not a new thing and we are not that far behind upstream releases.

@hashar triggered a retry of said job for me (thanks!) with only one registry pooled to rule out strange caching behavior, these are the logs we got. Nothing jumping my eye apart from the fact that I don't see the client trying to PUT the manifest in question:

Nov  8 09:39:25 registry2003 docker-registry[7573]: time="2022-11-08T09:39:25.27595589Z" level=info msg="response completed" go.version=go1.11.4 http.request.host=docker-registry.wikimedia.org http.request.id=8e8f0f24-b179-4043-8cc9-880a4298a988 http.request.method=GET http.request.remoteaddr=10.192.16.161 http.request.uri="/v2/repos/releng/blubber/buildkit/manifests/v0.13.1" http.request.useragent="buildkit/v0.10-dev" http.response.contenttype="application/vnd.docker.distribution.manifest.v2+json" http.response.duration=160.805238ms http.response.status=200 http.response.written=739                                 
Nov  8 09:39:25 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:39:25 +0000] "GET /v2/repos/releng/blubber/buildkit/manifests/v0.13.1 HTTP/1.1" 200 739 "" "buildkit/v0.10-dev"                                                                                                                      
Nov  8 09:39:25 registry2003 docker-registry[7573]: time="2022-11-08T09:39:25.409484236Z" level=info msg="response completed" go.version=go1.11.4 http.request.host=registry http.request.id=40b2e36b-84be-4827-8685-39e0ca4422a0 http.request.method=GET http.request.remoteaddr=10.192.16.161 http.request.uri="/v2/repos/releng/blubber/buildkit/manifests/sha256:2c1df3c60855e46964059eeaa48ecb498427e8f1f277f91f2983bf609e479d67" http.request.useragent="buildkit/v0.10-dev" http.response.contenttype="application/vnd.docker.distribution.manifest.v2+json" http.response.duration=97.259281ms http.response.status=200 http.response.written=739
Nov  8 09:39:25 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:39:25 +0000] "GET /v2/repos/releng/blubber/buildkit/manifests/sha256:2c1df3c60855e46964059eeaa48ecb498427e8f1f277f91f2983bf609e479d67 HTTP/1.1" 200 739 "" "buildkit/v0.10-dev"                                                      
Nov  8 09:39:25 registry2003 docker-registry[7573]: time="2022-11-08T09:39:25.592261668Z" level=info msg="response completed" go.version=go1.11.4 http.request.host=registry http.request.id=0e9d542f-47d4-4d67-ad9a-30839f2ce64e http.request.method=GET http.request.remoteaddr=10.192.16.161 http.request.uri="/v2/repos/releng/blubber/buildkit/blobs/sha256:adada3f34e5ac246bfc1947e147a235ecd258627b20e27888ec957f39c2dcc7e" http.request.useragent="buildkit/v0.10-dev" http.response.contenttype="application/octet-stream" http.response.duration=145.370514ms http.response.status=206 http.response.written=1766                    
Nov  8 09:39:25 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:39:25 +0000] "GET /v2/repos/releng/blubber/buildkit/blobs/sha256:adada3f34e5ac246bfc1947e147a235ecd258627b20e27888ec957f39c2dcc7e HTTP/1.1" 206 1766 "" "buildkit/v0.10-dev"                                                         
Nov  8 09:39:26 registry2003 docker-registry[7573]: time="2022-11-08T09:39:26.805306103Z" level=info msg="response completed" go.version=go1.11.4 http.request.host=docker-registry.wikimedia.org http.request.id=f88c12b3-a049-438d-84ae-12d55a9a035c http.request.method=GET http.request.remoteaddr=10.192.16.161 http.request.uri="/v2/golang1.17/manifests/1.17-1" http.request.useragent="buildkit/v0.10-dev" http.response.contenttype="application/vnd.docker.distribution.manifest.v2+json" http.response.duration=143.198981ms http.response.status=200 http.response.written=1156                                                   
Nov  8 09:39:26 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:39:26 +0000] "GET /v2/golang1.17/manifests/1.17-1 HTTP/1.1" 200 1156 "" "buildkit/v0.10-dev"                                                                                                                                         
Nov  8 09:39:26 registry2003 docker-registry[7573]: time="2022-11-08T09:39:26.916429734Z" level=info msg="response completed" go.version=go1.11.4 http.request.host=registry http.request.id=19acd260-7439-46e1-9884-416adc145677 http.request.method=GET http.request.remoteaddr=10.192.16.161 http.request.uri="/v2/golang1.17/manifests/sha256:7dc97c16237f2265ed6932e50ea50da5b2a9e39c5b094efb04394ca3f0acc72d" http.request.useragent="buildkit/v0.10-dev" http.response.contenttype="application/vnd.docker.distribution.manifest.v2+json" http.response.duration=80.839548ms http.response.status=200 http.response.written=1156        
Nov  8 09:39:26 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:39:26 +0000] "GET /v2/golang1.17/manifests/sha256:7dc97c16237f2265ed6932e50ea50da5b2a9e39c5b094efb04394ca3f0acc72d HTTP/1.1" 200 1156 "" "buildkit/v0.10-dev"                                                                        
Nov  8 09:39:26 registry2003 docker-registry[7573]: time="2022-11-08T09:39:26.949655822Z" level=info msg="response completed" go.version=go1.11.4 http.request.host=docker-registry.wikimedia.org http.request.id=ccf9671f-efdc-400d-86cc-16f03bd04894 http.request.method=HEAD http.request.remoteaddr=10.192.16.161 http.request.uri="/v2/golang1.17/manifests/1.17-1" http.request.useragent="buildkit/v0.10-dev" http.response.contenttype="application/vnd.docker.distribution.manifest.v2+json" http.response.duration=137.910554ms http.response.status=200 http.response.written=1156                                                  
Nov  8 09:39:26 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:39:26 +0000] "HEAD /v2/golang1.17/manifests/1.17-1 HTTP/1.1" 200 1156 "" "buildkit/v0.10-dev"                                                                                                                                        
Nov  8 09:39:27 registry2003 docker-registry[7573]: time="2022-11-08T09:39:27.051731658Z" level=info msg="response completed" go.version=go1.11.4 http.request.host=registry http.request.id=fa8b407c-4a2f-4de3-a96c-7d0276574296 http.request.method=GET http.request.remoteaddr=10.192.16.161 http.request.uri="/v2/golang1.17/blobs/sha256:81c44e9d392aff45d5a10dd4535a0c94f66bac68c83d207925ad8dd17002cdcd" http.request.useragent="buildkit/v0.10-dev" http.response.contenttype="application/octet-stream" http.response.duration=120.188232ms http.response.status=206 http.response.written=3826                                       
Nov  8 09:39:27 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:39:26 +0000] "GET /v2/golang1.17/blobs/sha256:81c44e9d392aff45d5a10dd4535a0c94f66bac68c83d207925ad8dd17002cdcd HTTP/1.1" 206 3826 "" "buildkit/v0.10-dev"                                                                            
Nov  8 09:40:38 registry2003 docker-registry[7573]: time="2022-11-08T09:40:38.231877138Z" level=error msg="response completed with error" err.code="blob unknown" err.detail=sha256:2304af3f0bad73d77281d9ed64450410263f761b2d537ace4a1a8de8612789a3 err.message="blob unknown to registry" go.version=go1.11.4 http.request.host=registry http.request.id=cd2b75f5-b6ad-4fe2-a6ba-e6ae875dba88 http.request.method=GET http.request.remoteaddr="127.0.0.1:60074" http.request.uri="/v2/repos/releng/blubber/buildkit/blobs/sha256:2304af3f0bad73d77281d9ed64450410263f761b2d537ace4a1a8de8612789a3" http.request.useragent="buildkit/v0.10-dev" http.response.contenttype="application/json; charset=utf-8" http.response.duration=61.484864ms http.response.status=404 http.response.written=157 vars.digest="sha256:2304af3f0bad73d77281d9ed64450410263f761b2d537ace4a1a8de8612789a3" vars.name="repos/releng/blubber/buildkit"                                     
Nov  8 09:40:38 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:40:38 +0000] "GET /v2/repos/releng/blubber/buildkit/blobs/sha256:2304af3f0bad73d77281d9ed64450410263f761b2d537ace4a1a8de8612789a3 HTTP/1.1" 404 157 "" "buildkit/v0.10-dev"                                                          
Nov  8 09:40:38 registry2003 docker-registry[7573]: time="2022-11-08T09:40:38.240444594Z" level=error msg="response completed with error" err.code="blob unknown" err.detail=sha256:86d5ff669d1b2bfbdb7bd70679607a40ba519a1bb929bc4669e7d14f602932da err.message="blob unknown to registry" go.version=go1.11.4 http.request.host=registry http.request.id=eef0d217-f3cc-4612-bf5e-6e8d494ed5de http.request.method=GET http.request.remoteaddr="127.0.0.1:60072" http.request.uri="/v2/repos/releng/blubber/buildkit/blobs/sha256:86d5ff669d1b2bfbdb7bd70679607a40ba519a1bb929bc4669e7d14f602932da" http.request.useragent="buildkit/v0.10-dev" http.response.contenttype="application/json; charset=utf-8" http.response.duration=67.004066ms http.response.status=404 http.response.written=157 vars.digest="sha256:86d5ff669d1b2bfbdb7bd70679607a40ba519a1bb929bc4669e7d14f602932da" vars.name="repos/releng/blubber/buildkit"                                     
Nov  8 09:40:38 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:40:38 +0000] "GET /v2/repos/releng/blubber/buildkit/blobs/sha256:86d5ff669d1b2bfbdb7bd70679607a40ba519a1bb929bc4669e7d14f602932da HTTP/1.1" 404 157 "" "buildkit/v0.10-dev"                                                          
Nov  8 09:40:38 registry2003 docker-registry[7573]: time="2022-11-08T09:40:38.269692925Z" level=error msg="response completed with error" err.code="blob unknown" err.detail=sha256:d84b41282ac2d330e83207439b63e129bf74fd652ba7a978fbc88e449cef1347 err.message="blob unknown to registry" go.version=go1.11.4 http.request.host=registry http.request.id=73195c8d-1e3f-40e2-9c7e-896295992576 http.request.method=GET http.request.remoteaddr="127.0.0.1:60070" http.request.uri="/v2/repos/releng/blubber/buildkit/blobs/sha256:d84b41282ac2d330e83207439b63e129bf74fd652ba7a978fbc88e449cef1347" http.request.useragent="buildkit/v0.10-dev" http.response.contenttype="application/json; charset=utf-8" http.response.duration=99.538372ms http.response.status=404 http.response.written=157 vars.digest="sha256:d84b41282ac2d330e83207439b63e129bf74fd652ba7a978fbc88e449cef1347" vars.name="repos/releng/blubber/buildkit"                                     
Nov  8 09:40:38 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:40:38 +0000] "GET /v2/repos/releng/blubber/buildkit/blobs/sha256:d84b41282ac2d330e83207439b63e129bf74fd652ba7a978fbc88e449cef1347 HTTP/1.1" 404 157 "" "buildkit/v0.10-dev"                                                          
Nov  8 09:40:38 registry2003 docker-registry[7573]: time="2022-11-08T09:40:38.315219237Z" level=error msg="response completed with error" err.code="blob unknown" err.detail=sha256:57b6a04c9a60fedb52b13d1a3f14c9f794ecd1545243727feeeb72503b1d4d85 err.message="blob unknown to registry" go.version=go1.11.4 http.request.host=registry http.request.id=b43971ee-691d-4983-93c1-cfee2b705461 http.request.method=GET http.request.remoteaddr="127.0.0.1:60076" http.request.uri="/v2/repos/releng/blubber/buildkit/blobs/sha256:57b6a04c9a60fedb52b13d1a3f14c9f794ecd1545243727feeeb72503b1d4d85" http.request.useragent="buildkit/v0.10-dev" http.response.contenttype="application/json; charset=utf-8" http.response.duration=139.940917ms http.response.status=404 http.response.written=157 vars.digest="sha256:57b6a04c9a60fedb52b13d1a3f14c9f794ecd1545243727feeeb72503b1d4d85" vars.name="repos/releng/blubber/buildkit"                                    
Nov  8 09:40:38 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:40:38 +0000] "GET /v2/repos/releng/blubber/buildkit/blobs/sha256:57b6a04c9a60fedb52b13d1a3f14c9f794ecd1545243727feeeb72503b1d4d85 HTTP/1.1" 404 157 "" "buildkit/v0.10-dev"                                                          
Nov  8 09:40:38 registry2003 docker-registry[7573]: time="2022-11-08T09:40:38.612707987Z" level=info msg="response completed" go.version=go1.11.4 http.request.host=docker-registry.discovery.wmnet http.request.id=00660bd1-3abd-4752-841e-6f465c1d8ea3 http.request.method=POST http.request.remoteaddr=10.192.16.161
http.request.uri="/v2/repos/releng/blubber/buildkit/blobs/uploads/" http.request.useragent="buildkit/v0.10-dev" http.response.duration=342.488149ms http.response.status=202 http.response.written=0                                                                                                                   
Nov  8 09:40:38 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:40:38 +0000] "POST /v2/repos/releng/blubber/buildkit/blobs/uploads/ HTTP/1.1" 202 0 "" "buildkit/v0.10-dev"                                                                                                                          
Nov  8 09:40:38 registry2003 docker-registry[7573]: time="2022-11-08T09:40:38.638296026Z" level=info msg="response completed" go.version=go1.11.4 http.request.host=docker-registry.discovery.wmnet http.request.id=3926a9f5-3c75-4cb7-8eb1-3a7fd70fa3d9 http.request.method=POST http.request.remoteaddr=10.192.16.161
http.request.uri="/v2/repos/releng/blubber/buildkit/blobs/uploads/" http.request.useragent="buildkit/v0.10-dev" http.response.duration=368.471841ms http.response.status=202 http.response.written=0                                                                                                                   
Nov  8 09:40:38 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:40:38 +0000] "POST /v2/repos/releng/blubber/buildkit/blobs/uploads/ HTTP/1.1" 202 0 "" "buildkit/v0.10-dev"                                                                                                                          
Nov  8 09:40:38 registry2003 docker-registry[7573]: time="2022-11-08T09:40:38.743152787Z" level=info msg="response completed" go.version=go1.11.4 http.request.host=docker-registry.discovery.wmnet http.request.id=7268c4a0-eaef-42b6-ac18-8ba2886fe9f4 http.request.method=POST http.request.remoteaddr=10.192.16.161
http.request.uri="/v2/repos/releng/blubber/buildkit/blobs/uploads/" http.request.useragent="buildkit/v0.10-dev" http.response.duration=466.755682ms http.response.status=202 http.response.written=0                                                                                                                   
Nov  8 09:40:38 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:40:38 +0000] "POST /v2/repos/releng/blubber/buildkit/blobs/uploads/ HTTP/1.1" 202 0 "" "buildkit/v0.10-dev"                                                                                                                          
Nov  8 09:40:38 registry2003 docker-registry[7573]: time="2022-11-08T09:40:38.75960913Z" level=info msg="response completed" go.version=go1.11.4 http.request.host=docker-registry.discovery.wmnet http.request.id=0c34ffaf-e113-4da3-995c-5b75ecda14ec http.request.method=POST http.request.remoteaddr=10.192.16.161 http.request.uri="/v2/repos/releng/blubber/buildkit/blobs/uploads/" http.request.useragent="buildkit/v0.10-dev" http.response.duration=441.781431ms http.response.status=202 http.response.written=0                                                                                                                    
Nov  8 09:40:38 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:40:38 +0000] "POST /v2/repos/releng/blubber/buildkit/blobs/uploads/ HTTP/1.1" 202 0 "" "buildkit/v0.10-dev"                                                                                                                          
Nov  8 09:40:39 registry2003 docker-registry[7573]: time="2022-11-08T09:40:39.866075472Z" level=info msg="response completed" go.version=go1.11.4 http.request.contenttype="application/octet-stream" http.request.host=docker-registry.discovery.wmnet http.request.id=535c1875-0732-45af-9767-8495998604f2 http.request.method=PUT http.request.remoteaddr=10.192.16.161 http.request.uri="/v2/repos/releng/blubber/buildkit/blobs/uploads/0a3bf978-932e-4921-b998-e79ce3dc925e?_state=x2CSt0fWRDC3bXJvoWzdbFg2ol7__8uguY0GnjJDxid7Ik5hbWUiOiJyZXBvcy9yZWxlbmcvYmx1YmJlci9idWlsZGtpdCIsIlVVSUQiOiIwYTNiZjk3OC05MzJlLTQ5MjEtYjk5OC1lNzljZTNkYzkyNWUiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMTEtMDhUMDk6NDA6MzguMzIzMjIwNzk2WiJ9&digest=sha256%3A57b6a04c9a60fedb52b13d1a3f14c9f794ecd1545243727feeeb72503b1d4d85" http.request.useragent="buildkit/v0.10-dev" http.response.duration=1.096871397s http.response.status=201 http.response.written=0                   
Nov  8 09:40:39 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:40:38 +0000] "PUT /v2/repos/releng/blubber/buildkit/blobs/uploads/0a3bf978-932e-4921-b998-e79ce3dc925e?_state=x2CSt0fWRDC3bXJvoWzdbFg2ol7__8uguY0GnjJDxid7Ik5hbWUiOiJyZXBvcy9yZWxlbmcvYmx1YmJlci9idWlsZGtpdCIsIlVVSUQiOiIwYTNiZjk3OC05MzJlLTQ5MjEtYjk5OC1lNzljZTNkYzkyNWUiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMTEtMDhUMDk6NDA6MzguMzIzMjIwNzk2WiJ9&digest=sha256%3A57b6a04c9a60fedb52b13d1a3f14c9f794ecd1545243727feeeb72503b1d4d85 HTTP/1.1" 201 0 "" "buildkit/v0.10-dev"                                                                              
Nov  8 09:40:39 registry2003 docker-registry[7573]: time="2022-11-08T09:40:39.879147328Z" level=info msg="response completed" go.version=go1.11.4 http.request.contenttype="application/octet-stream" http.request.host=docker-registry.discovery.wmnet http.request.id=55b9a3d4-8311-4b04-812e-c1f853d3a944 http.request.method=PUT http.request.remoteaddr=10.192.16.161 http.request.uri="/v2/repos/releng/blubber/buildkit/blobs/uploads/d652108d-b99b-4248-93f8-d558362eff00?_state=Fs2MZ8QOf-P_HSfv0f2S5JK7Cw-D4wsNVACaeyblfYh7Ik5hbWUiOiJyZXBvcy9yZWxlbmcvYmx1YmJlci9idWlsZGtpdCIsIlVVSUQiOiJkNjUyMTA4ZC1iOTliLTQyNDgtOTNmOC1kNTU4MzYyZWZmMDAiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMTEtMDhUMDk6NDA6MzguMjc4NjYwMzg3WiJ9&digest=sha256%3Ad84b41282ac2d330e83207439b63e129bf74fd652ba7a978fbc88e449cef1347" http.request.useragent="buildkit/v0.10-dev" http.response.duration=1.129097238s http.response.status=201 http.response.written=0                   
Nov  8 09:40:39 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:40:38 +0000] "PUT /v2/repos/releng/blubber/buildkit/blobs/uploads/d652108d-b99b-4248-93f8-d558362eff00?_state=Fs2MZ8QOf-P_HSfv0f2S5JK7Cw-D4wsNVACaeyblfYh7Ik5hbWUiOiJyZXBvcy9yZWxlbmcvYmx1YmJlci9idWlsZGtpdCIsIlVVSUQiOiJkNjUyMTA4ZC1iOTliLTQyNDgtOTNmOC1kNTU4MzYyZWZmMDAiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMTEtMDhUMDk6NDA6MzguMjc4NjYwMzg3WiJ9&digest=sha256%3Ad84b41282ac2d330e83207439b63e129bf74fd652ba7a978fbc88e449cef1347 HTTP/1.1" 201 0 "" "buildkit/v0.10-dev"                                                                              
Nov  8 09:40:39 registry2003 docker-registry[7573]: time="2022-11-08T09:40:39.909519791Z" level=info msg="response completed" go.version=go1.11.4 http.request.contenttype="application/octet-stream" http.request.host=docker-registry.discovery.wmnet http.request.id=6c3efc92-0590-41e8-8e69-ace8424444ed http.request.method=PUT http.request.remoteaddr=10.192.16.161 http.request.uri="/v2/repos/releng/blubber/buildkit/blobs/uploads/2070e40f-1cf5-4f32-8c80-1415a5718504?_state=-uDMTPszplKPFdzI6h6_Xyy1MOMZHnNns7-ygQlclyx7Ik5hbWUiOiJyZXBvcy9yZWxlbmcvYmx1YmJlci9idWlsZGtpdCIsIlVVSUQiOiIyMDcwZTQwZi0xY2Y1LTRmMzItOGM4MC0xNDE1YTU3MTg1MDQiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMTEtMDhUMDk6NDA6MzguMjczMjUyMzA1WiJ9&digest=sha256%3A86d5ff669d1b2bfbdb7bd70679607a40ba519a1bb929bc4669e7d14f602932da" http.request.useragent="buildkit/v0.10-dev" http.response.duration=1.254979655s http.response.status=201 http.response.written=0                   
Nov  8 09:40:39 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:40:38 +0000] "PUT /v2/repos/releng/blubber/buildkit/blobs/uploads/2070e40f-1cf5-4f32-8c80-1415a5718504?_state=-uDMTPszplKPFdzI6h6_Xyy1MOMZHnNns7-ygQlclyx7Ik5hbWUiOiJyZXBvcy9yZWxlbmcvYmx1YmJlci9idWlsZGtpdCIsIlVVSUQiOiIyMDcwZTQwZi0xY2Y1LTRmMzItOGM4MC0xNDE1YTU3MTg1MDQiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMTEtMDhUMDk6NDA6MzguMjczMjUyMzA1WiJ9&digest=sha256%3A86d5ff669d1b2bfbdb7bd70679607a40ba519a1bb929bc4669e7d14f602932da HTTP/1.1" 201 0 "" "buildkit/v0.10-dev"                                                                              
Nov  8 09:40:39 registry2003 docker-registry[7573]: time="2022-11-08T09:40:39.941475388Z" level=info msg="response completed" go.version=go1.11.4 http.request.contenttype="application/octet-stream" http.request.host=docker-registry.discovery.wmnet http.request.id=eca13619-25df-4542-bbea-69c1a68aa399 http.request.method=PUT http.request.remoteaddr=10.192.16.161 http.request.uri="/v2/repos/releng/blubber/buildkit/blobs/uploads/693a526a-54af-4698-bec0-7fad9c076d73?_state=qo-sXaj7igi4KQ_5V6XispGka_FDNur_csKsYwUhq4h7Ik5hbWUiOiJyZXBvcy9yZWxlbmcvYmx1YmJlci9idWlsZGtpdCIsIlVVSUQiOiI2OTNhNTI2YS01NGFmLTQ2OTgtYmVjMC03ZmFkOWMwNzZkNzMiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMTEtMDhUMDk6NDA6MzguMjczMTM3MDcxWiJ9&digest=sha256%3A2304af3f0bad73d77281d9ed64450410263f761b2d537ace4a1a8de8612789a3" http.request.useragent="buildkit/v0.10-dev" http.response.duration=1.233868494s http.response.status=201 http.response.written=0                   
Nov  8 09:40:39 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:40:38 +0000] "PUT /v2/repos/releng/blubber/buildkit/blobs/uploads/693a526a-54af-4698-bec0-7fad9c076d73?_state=qo-sXaj7igi4KQ_5V6XispGka_FDNur_csKsYwUhq4h7Ik5hbWUiOiJyZXBvcy9yZWxlbmcvYmx1YmJlci9idWlsZGtpdCIsIlVVSUQiOiI2OTNhNTI2YS01NGFmLTQ2OTgtYmVjMC03ZmFkOWMwNzZkNzMiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMTEtMDhUMDk6NDA6MzguMjczMTM3MDcxWiJ9&digest=sha256%3A2304af3f0bad73d77281d9ed64450410263f761b2d537ace4a1a8de8612789a3 HTTP/1.1" 201 0 "" "buildkit/v0.10-dev"                                                                              
Nov  8 09:40:40 registry2003 docker-registry[7573]: time="2022-11-08T09:40:40.093903898Z" level=error msg="response completed with error" err.code="manifest unknown" err.detail="unknown manifest name=repos/releng/blubber/buildkit revision=sha256:24b07abcb824307bb4d16047a850319b38e3555af08377cedc51e1407cb180d5"
err.message="manifest unknown" go.version=go1.11.4 http.request.host=registry http.request.id=ef843c3e-afcb-4723-8a5c-aedf6571a1ad http.request.method=GET http.request.remoteaddr="127.0.0.1:60104" http.request.uri="/v2/repos/releng/blubber/buildkit/manifests/sha256:24b07abcb824307bb4d16047a850319b38e3555af08377cedc51e1407cb180d5" http.request.useragent="buildkit/v0.10-dev" http.response.contenttype="application/json; charset=utf-8" http.response.duration=143.583916ms http.response.status=404 http.response.written=205 vars.name="repos/releng/blubber/buildkit" vars.reference="sha256:24b07abcb824307bb4d16047a850319b38e3555af08377cedc51e1407cb180d5"
Nov  8 09:40:40 registry2003 docker-registry[7573]: 127.0.0.1 - - [08/Nov/2022:09:40:39 +0000] "GET /v2/repos/releng/blubber/buildkit/manifests/sha256:24b07abcb824307bb4d16047a850319b38e3555af08377cedc51e1407cb180d5 HTTP/1.1" 404 205 "" "buildkit/v0.10-dev"

(linking to T320811)

Thanks for debugging this further, @JMeybohm and @hashar. In your recent re-run of the job it seems the manifest list is accepted by the registry, and it's the subsequent manifest push that fails, so you're right this doesn't seem to be related to a lack of manifest list support for perhaps some errant behavior on buildkit's side that manifests under this multi-platform push condition. I'll re-word and triage accordingly.

27 exporting manifest sha256:24b07abcb824307bb4d16047a850319b38e3555af08377cedc51e1407cb180d5 done
#27 exporting config sha256:d84b41282ac2d330e83207439b63e129bf74fd652ba7a978fbc88e449cef1347 done
#27 exporting manifest sha256:339e830b18379f74bf67d6257a0c346327daa931e0c5a752d2998de3a4df6ba6 done
#27 exporting config sha256:57b6a04c9a60fedb52b13d1a3f14c9f794ecd1545243727feeeb72503b1d4d85 done
#27 exporting manifest list sha256:1134f868ab106022c275b5d7c0034c7e2d7c6bbdea292f726d2b673655483be1 done
#27 pushing layers
#27 pushing layers 1.8s done
#27 pushing manifest for docker-registry.discovery.wmnet/repos/releng/blubber/buildkit:v0.14.0@sha256:1134f868ab106022c275b5d7c0034c7e2d7c6bbdea292f726d2b673655483be1
time="2022-11-08T09:40:40Z" level=debug msg="stopping session" spanID=3f66ceb23362436e traceID=c2a85367fe1b636fd5258357d4255842
#27 pushing manifest for docker-registry.discovery.wmnet/repos/releng/blubber/buildkit:v0.14.0@sha256:1134f868ab106022c275b5d7c0034c7e2d7c6bbdea292f726d2b673655483be1 0.2s done
#27 ERROR: failed commit on ref "manifest-sha256:24b07abcb824307bb4d16047a850319b38e3555af08377cedc51e1407cb180d5": cannot reuse body, request must be retried

It looks like is/was a bug in containerd that may be the culprit. I'll have to debug further.

dduvall renamed this task from WMF container registry does not accept a manifest list (aka OCI manifest index, or "fat" manifest) to Buildkit erroring with "cannot reuse body, request must be retried" upon multi-platform push.Tue, Nov 8, 5:27 PM
dduvall updated the task description. (Show Details)

@JMeybohm can you provide the nginx access log entries from that time period as well? I'm trying to rule out auth failure as a factor and docker-registry log entries do not include the subrequests between nginx and jwt-authorizer.

Mentioned in SAL (#wikimedia-releng) [2022-11-08T20:01:22Z] <dduvall> temporarily enabling buildkitd debug logging on gitlab-runner hosts (T322453)

I enabled debug logging for buildkitd on the gitlab-runner hosts and re-ran the failed job. It appears that auth failure may be the root cause—the client is definitely misbehaving but the there should not be auth failure here. It's unclear why.

Nov 08 20:06:28 gitlab-runner1002 docker[1075791]: time="2022-11-08T20:06:28Z" level=debug msg="fetch response received" digest="sha256:86d64a41a479ea8910e7d5b7dc6d2a89311079e408d7e0007ba1b580a29dbb75" mediatype=application/vnd.docker.distribution.manifest.v2+json response.header.connection=keep-alive response.header.content-length=205 response.header.content-type="application/json; charset=utf-8" response.header.date="Tue, 08 Nov 2022 20:06:28 GMT" response.header.docker-distribution-api-version=registry/2.0 response.header.server=nginx/1.14.2 response.header.x-cache-status=MISS response.header.x-content-type-options=nosniff response.status="404 Not Found" size=739 url="https://docker-registry.discovery.wmnet/v2/repos/releng/blubber/buildkit/manifests/sha256:86d64a41a479ea8910e7d5b7dc6d2a89311079e408d7e0007ba1b580a29dbb75"
Nov 08 20:06:28 gitlab-runner1002 docker[1075791]: time="2022-11-08T20:06:28Z" level=debug msg="do request" digest="sha256:86d64a41a479ea8910e7d5b7dc6d2a89311079e408d7e0007ba1b580a29dbb75" mediatype=application/vnd.docker.distribution.manifest.v2+json request.header.content-type=application/vnd.docker.distribution.manifest.v2+json request.header.user-agent=buildkit/v0.10-dev request.method=PUT size=739 url="https://docker-registry.discovery.wmnet/v2/repos/releng/blubber/buildkit/manifests/sha256:86d64a41a479ea8910e7d5b7dc6d2a89311079e408d7e0007ba1b580a29dbb75"
Nov 08 20:06:28 gitlab-runner1002 docker[1075791]: time="2022-11-08T20:06:28Z" level=debug msg="fetch response received" digest="sha256:86d64a41a479ea8910e7d5b7dc6d2a89311079e408d7e0007ba1b580a29dbb75" mediatype=application/vnd.docker.distribution.manifest.v2+json response.header.connection=keep-alive response.header.content-length=195 response.header.content-type=text/html response.header.date="Tue, 08 Nov 2022 20:06:28 GMT" response.header.server=nginx/1.14.2 response.header.www-authenticate="Bearer realm=\"https://gitlab.wikimedia.org/jwt/auth\", service=\"jwt_authorizer\", scope=\"repository::push,pull\"" response.status="401 Unauthorized" size=739 url="https://docker-registry.discovery.wmnet/v2/repos/releng/blubber/buildkit/manifests/sha256:86d64a41a479ea8910e7d5b7dc6d2a89311079e408d7e0007ba1b580a29dbb75"
Nov 08 20:06:28 gitlab-runner1002 docker[1075791]: time="2022-11-08T20:06:28Z" level=debug msg=Unauthorized digest="sha256:86d64a41a479ea8910e7d5b7dc6d2a89311079e408d7e0007ba1b580a29dbb75" header="Bearer realm=\"https://gitlab.wikimedia.org/jwt/auth\", service=\"jwt_authorizer\", scope=\"repository::push,pull\"" mediatype=application/vnd.docker.distribution.manifest.v2+json size=739
Nov 08 20:06:28 gitlab-runner1002 docker[1075791]: time="2022-11-08T20:06:28Z" level=error msg="/moby.buildkit.v1.Control/Solve returned error: rpc error: code = Unknown desc = failed commit on ref \"manifest-sha256:86d64a41a479ea8910e7d5b7dc6d2a89311079e408d7e0007ba1b580a29dbb75\": cannot reuse body, request must be retried\n"

I recently modified to improve the validation error reporting, specifically to include the validation error response in the response. I'll see if I can get an updated version of that deployed to the registry hosts.

Mentioned in SAL (#wikimedia-releng) [2022-11-08T20:17:01Z] <dduvall> puppet re-enabled on gitlab-runner hosts (T322453) normal log level will be restored on next puppet run

@JMeybohm can you provide the nginx access log entries from that time period as well? I'm trying to rule out auth failure as a factor and docker-registry log entries do not include the subrequests between nginx and jwt-authorizer.

It looks as if nginx logs confirm your theory:

10.192.16.161 - - [08/Nov/2022:09:40:38 +0000] "HEAD /v2/repos/releng/blubber/buildkit/blobs/sha256:2304af3f0bad73d77281d9ed64450410263f761b2d537ace4a1a8de8612789a3 HTTP/1.1" 404 0 "-" "buildkit/v0.10-dev"
10.192.16.161 - - [08/Nov/2022:09:40:38 +0000] "HEAD /v2/repos/releng/blubber/buildkit/blobs/sha256:86d5ff669d1b2bfbdb7bd70679607a40ba519a1bb929bc4669e7d14f602932da HTTP/1.1" 404 0 "-" "buildkit/v0.10-dev"
10.192.16.161 - - [08/Nov/2022:09:40:38 +0000] "POST /v2/repos/releng/blubber/buildkit/blobs/uploads/ HTTP/1.1" 401 195 "-" "buildkit/v0.10-dev"
10.192.16.161 - - [08/Nov/2022:09:40:38 +0000] "POST /v2/repos/releng/blubber/buildkit/blobs/uploads/ HTTP/1.1" 401 195 "-" "buildkit/v0.10-dev"
10.192.16.161 - - [08/Nov/2022:09:40:38 +0000] "HEAD /v2/repos/releng/blubber/buildkit/blobs/sha256:d84b41282ac2d330e83207439b63e129bf74fd652ba7a978fbc88e449cef1347 HTTP/1.1" 404 0 "-" "buildkit/v0.10-dev"
10.192.16.161 - - [08/Nov/2022:09:40:38 +0000] "HEAD /v2/repos/releng/blubber/buildkit/blobs/sha256:57b6a04c9a60fedb52b13d1a3f14c9f794ecd1545243727feeeb72503b1d4d85 HTTP/1.1" 404 0 "-" "buildkit/v0.10-dev"
10.192.16.161 - - [08/Nov/2022:09:40:38 +0000] "POST /v2/repos/releng/blubber/buildkit/blobs/uploads/ HTTP/1.1" 202 0 "-" "buildkit/v0.10-dev"
10.192.16.161 - - [08/Nov/2022:09:40:38 +0000] "POST /v2/repos/releng/blubber/buildkit/blobs/uploads/ HTTP/1.1" 202 0 "-" "buildkit/v0.10-dev"
10.192.16.161 - - [08/Nov/2022:09:40:38 +0000] "POST /v2/repos/releng/blubber/buildkit/blobs/uploads/ HTTP/1.1" 202 0 "-" "buildkit/v0.10-dev"
10.192.16.161 - - [08/Nov/2022:09:40:38 +0000] "POST /v2/repos/releng/blubber/buildkit/blobs/uploads/ HTTP/1.1" 202 0 "-" "buildkit/v0.10-dev"
10.192.16.161 - - [08/Nov/2022:09:40:39 +0000] "PUT /v2/repos/releng/blubber/buildkit/blobs/uploads/0a3bf978-932e-4921-b998-e79ce3dc925e?_state=x2CSt0fWRDC3bXJvoWzdbFg2ol7__8uguY0GnjJDxid7Ik5hbWUiOiJyZXBvcy9yZWxlbmcvYmx1YmJlci9idWlsZGtpdCIsIlVVSUQiOiIwYTNiZjk3OC05MzJlLTQ5MjEtYjk5OC1lNzljZTNkYzkyNWUiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMTEtMDhUMDk6NDA6MzguMzIzMjIwNzk2WiJ9&digest=sha256%3A57b6a04c9a60fedb52b13d1a3f14c9f794ecd1545243727feeeb72503b1d4d85 HTTP/1.1" 201 0 "-" "buildkit/v0.10-dev"
10.192.16.161 - - [08/Nov/2022:09:40:39 +0000] "PUT /v2/repos/releng/blubber/buildkit/blobs/uploads/d652108d-b99b-4248-93f8-d558362eff00?_state=Fs2MZ8QOf-P_HSfv0f2S5JK7Cw-D4wsNVACaeyblfYh7Ik5hbWUiOiJyZXBvcy9yZWxlbmcvYmx1YmJlci9idWlsZGtpdCIsIlVVSUQiOiJkNjUyMTA4ZC1iOTliLTQyNDgtOTNmOC1kNTU4MzYyZWZmMDAiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMTEtMDhUMDk6NDA6MzguMjc4NjYwMzg3WiJ9&digest=sha256%3Ad84b41282ac2d330e83207439b63e129bf74fd652ba7a978fbc88e449cef1347 HTTP/1.1" 201 0 "-" "buildkit/v0.10-dev"
10.192.16.161 - - [08/Nov/2022:09:40:39 +0000] "PUT /v2/repos/releng/blubber/buildkit/blobs/uploads/2070e40f-1cf5-4f32-8c80-1415a5718504?_state=-uDMTPszplKPFdzI6h6_Xyy1MOMZHnNns7-ygQlclyx7Ik5hbWUiOiJyZXBvcy9yZWxlbmcvYmx1YmJlci9idWlsZGtpdCIsIlVVSUQiOiIyMDcwZTQwZi0xY2Y1LTRmMzItOGM4MC0xNDE1YTU3MTg1MDQiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMTEtMDhUMDk6NDA6MzguMjczMjUyMzA1WiJ9&digest=sha256%3A86d5ff669d1b2bfbdb7bd70679607a40ba519a1bb929bc4669e7d14f602932da HTTP/1.1" 201 0 "-" "buildkit/v0.10-dev"
10.192.16.161 - - [08/Nov/2022:09:40:39 +0000] "PUT /v2/repos/releng/blubber/buildkit/blobs/uploads/693a526a-54af-4698-bec0-7fad9c076d73?_state=qo-sXaj7igi4KQ_5V6XispGka_FDNur_csKsYwUhq4h7Ik5hbWUiOiJyZXBvcy9yZWxlbmcvYmx1YmJlci9idWlsZGtpdCIsIlVVSUQiOiI2OTNhNTI2YS01NGFmLTQ2OTgtYmVjMC03ZmFkOWMwNzZkNzMiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMTEtMDhUMDk6NDA6MzguMjczMTM3MDcxWiJ9&digest=sha256%3A2304af3f0bad73d77281d9ed64450410263f761b2d537ace4a1a8de8612789a3 HTTP/1.1" 201 0 "-" "buildkit/v0.10-dev"
10.192.16.161 - - [08/Nov/2022:09:40:40 +0000] "HEAD /v2/repos/releng/blubber/buildkit/manifests/sha256:24b07abcb824307bb4d16047a850319b38e3555af08377cedc51e1407cb180d5 HTTP/1.1" 404 0 "-" "buildkit/v0.10-dev"
10.192.16.161 - - [08/Nov/2022:09:40:40 +0000] "PUT /v2/repos/releng/blubber/buildkit/manifests/sha256:24b07abcb824307bb4d16047a850319b38e3555af08377cedc51e1407cb180d5 HTTP/1.1" 401 195 "-" "buildkit/v0.10-dev"