
Magnum images need to be mirrored
Closed, ResolvedPublic

Description

https://docs.openstack.org/magnum/latest/user/#container-infra-prefix
gives information on images that likely need to be mirrored to avoid problems like the following:

Normal   Scheduled  18m                   default-scheduler  Successfully assigned kube-system/k8s-keystone-auth-w7stz to paws-five-fqacglfoyhhq-master-0
Normal   Pulling    17m (x4 over 18m)     kubelet            Pulling image "docker.io/k8scloudprovider/k8s-keystone-auth:v1.18.0"
Warning  Failed     17m (x4 over 18m)     kubelet            Failed to pull image "docker.io/k8scloudprovider/k8s-keystone-auth:v1.18.0": rpc error: code = Unknown desc = Error response from daemon: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
Warning  Failed     17m (x4 over 18m)     kubelet            Error: ErrImagePull
Warning  Failed     16m (x6 over 18m)     kubelet            Error: ImagePullBackOff
Normal   BackOff    3m22s (x64 over 18m)  kubelet            Back-off pulling image "docker.io/k8scloudprovider/k8s-keystone-auth:v1.18.0"

This would involve making copies of, at least:

docker.io/coredns/coredns:1.3.1
quay.io/coreos/etcd:v3.4.6
docker.io/k8scloudprovider/k8s-keystone-auth:v1.18.0
docker.io/k8scloudprovider/openstack-cloud-controller-manager:v1.18.0
gcr.io/google_containers/pause:3.1
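
The list above could be copied registry-to-registry with skopeo, which needs no local Docker daemon. A sketch, assuming a hypothetical mirror hostname (`registry.cloudinfra.example.org`); the commands are only printed here since a real run needs network access and push credentials:

```shell
# Hypothetical mirror hostname — substitute the real target registry.
MIRROR=registry.cloudinfra.example.org

for img in \
    docker.io/coredns/coredns:1.3.1 \
    quay.io/coreos/etcd:v3.4.6 \
    docker.io/k8scloudprovider/k8s-keystone-auth:v1.18.0 \
    docker.io/k8scloudprovider/openstack-cloud-controller-manager:v1.18.0 \
    gcr.io/google_containers/pause:3.1
do
    # Drop the source registry component, keep repository:tag for the mirror path.
    dest="$MIRROR/${img#*/}"
    # Printed rather than executed, as a sketch:
    echo skopeo copy "docker://$img" "docker://$dest"
done
```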

Someone should review and approve that this approach is acceptable before we proceed.

Event Timeline

If we can avoid mirroring, that would be ideal. I'm concerned as to why we would still be hitting this limit however, as in theory every 6 hours we should get a new block of 100 requests. How many pulls do you expect an average magnum deploy to do?

> If we can avoid mirroring, that would be ideal. I'm concerned as to why we would still be hitting this limit however, as in theory every 6 hours we should get a new block of 100 requests. How many pulls do you expect an average magnum deploy to do?

Due to NAT that limit is shared across all of Cloud VPS except the few instances that have floating IPs. That also includes the GitLab runners (T329216).

Adding authentication for pulling the containers from the upstream registry may be possible. Typically this would involve something like docker login -u $USER --password $TOKEN to create a $HOME/.docker/config.json file if docker is in use. I would expect that alternate container runtimes would have a similar functionality. Each named Docker account gets a distinct 200 pulls/6 hours quota. I would guess that all of Magnum could get away with reusing a single account.
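
For reference, the file that docker login writes is simple enough to template directly: the auth value is just base64 of user:token keyed by the registry URL. A sketch with a hypothetical account name and token (not real credentials):

```shell
# Hypothetical shared account and token — placeholders only.
USER=wmcs-magnum-pull
TOKEN=dckr_pat_example

# `docker login` stores base64("$USER:$TOKEN") under the registry key:
AUTH=$(printf '%s:%s' "$USER" "$TOKEN" | base64 -w0)
cat > config.json <<EOF
{
  "auths": {
    "https://index.docker.io/v1/": { "auth": "$AUTH" }
  }
}
EOF
```

Alternate runtimes (containerd, cri-o) read the same auths structure from their own credential config, so a single shared account could be distributed the same way.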

> If we can avoid mirroring, that would be ideal. I'm concerned as to why we would still be hitting this limit however, as in theory every 6 hours we should get a new block of 100 requests. How many pulls do you expect an average magnum deploy to do?

A new magnum deploy pulls about 5 containers.

> Due to NAT that limit is shared across all of Cloud VPS except the few instances that have floating IPs. That also includes the GitLab runners (T329216).

Yes, the fact that everything is shared is the main issue.

> Adding authentication for pulling the containers from the upstream registry may be possible. Typically this would involve something like docker login -u $USER --password $TOKEN to create a $HOME/.docker/config.json file if docker is in use. I would expect that alternate container runtimes would have a similar functionality. Each named Docker account gets a distinct 200 pulls/6 hours quota. I would guess that all of Magnum could get away with reusing a single account.

I kind of did that, in that I added it manually to the ReplicaSets and Deployments pulling from Docker Hub, and that got things working. We could put in a patch to have this happen upstream; I haven't found the same feature in the documentation. The only questionable detail I see is that anyone deploying a cluster would have the user/pass for whatever user we make, opening up a potential vector to break things (they could DoS Magnum deploys by making a bunch of pulls as that user), though I don't immediately see a way this could be damaging beyond that. Perhaps it is worth doing until we see it become a problem?
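
What "adding it manually" looks like in kubectl terms, sketched with a hypothetical secret name (dockerhub-pull) and placeholder credentials; the commands are printed rather than run, since they need a live cluster:

```shell
# Hypothetical secret name and credentials — placeholders only.
SECRET=dockerhub-pull

# 1) Create a registry secret in the namespace doing the pulls:
echo "kubectl -n kube-system create secret docker-registry $SECRET \
    --docker-server=https://index.docker.io/v1/ \
    --docker-username=wmcs-magnum-pull --docker-password=\$TOKEN"

# 2) A strategic-merge patch attaching it to a workload's pod template,
#    e.g. whichever workload owns the k8s-keystone-auth pods above:
PATCH="{\"spec\":{\"template\":{\"spec\":{\"imagePullSecrets\":[{\"name\":\"$SECRET\"}]}}}}"
echo "kubectl -n kube-system patch deployment <name> -p '$PATCH'"
```

Setting the secret on the namespace's default ServiceAccount instead would cover all pods without patching each workload.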

Could we mirror these in a registry? I think this would fall in the domain of 'admin', and would be limited to the specific images magnum needs to pull.
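
If we did mirror, the container_infra_prefix label linked in the description is how clusters would be pointed at the registry: Magnum prepends the prefix to every infra image it pulls, so the mirror must hold the same repository paths and tags. A sketch with a hypothetical registry and template name (command printed only):

```shell
# Hypothetical mirror hostname and template name.
MIRROR=registry.cloudinfra.example.org

CMD="openstack coe cluster template create k8s-mirrored \
    --coe kubernetes \
    --labels container_infra_prefix=$MIRROR/"
echo "$CMD"
```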

For clarity: Docker Hub credentials will be used to mitigate hitting the anonymous limit on the shared egress IP. No mirroring is required. See https://docs.docker.com/docker-hub/download-rate-limit/.