
Container Image policy for non-k8s uses
Closed, Resolved · Public

Description

We have a policy on images for use in the production k8s realm. Could I ask for some similar policy for other sorts of images that we might want to use, please?

To give some context: I'm building a PoC Ceph cluster, and would like to try out upstream's preferred deployment approach (cephadm), which requires container images (which are run on the storage nodes using docker or podman) rather than installing .debs like our current puppetry does.
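For concreteness: the container image is just a parameter that cephadm is pointed at when bootstrapping the first node, so the decision here boils down to which image reference goes into something like the command below (the registry path and tag are purely illustrative placeholders, not existing artifacts):

```
# Illustrative only: bootstrap with an image we build and host ourselves
# instead of upstream's quay.io/ceph/ceph image. The registry path, tag
# and MON_IP are placeholders.
cephadm --image docker-registry.wikimedia.org/ceph:quincy bootstrap --mon-ip MON_IP
```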

Ceph upstream publishes container images, which would be the easiest way for me to proceed. But these upstream images are based on centos, and I understand there is considerable reluctance to use upstream images based on a non-Debian base.

The upstream container build process is complicated, but largely involves installing upstream-built binary packages. Upstream does build and release .debs, so I could use those as the basis for Debian-based container images (I expect this still won't be an entirely trivial process). I expect this is the approach I am going to end up taking here. But obviously this is more work and more deviation from upstream than just using upstream's images, both of which are costs.
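To give an idea of what that could look like, here is a rough sketch of a Debian-based image built from upstream .debs, assuming they had been mirrored into our apt repo under some thirdparty component. Every name here (base image, distribution, component, packages) is a placeholder, and the real production-images repo wraps Dockerfiles in its own templating, so treat this as the shape of the build rather than a working recipe.

```
# Sketch only, with assumed/placeholder names throughout: a Debian base
# image with Ceph installed from upstream-built .debs mirrored into our
# own apt repository.
FROM docker-registry.wikimedia.org/bullseye:latest

RUN echo 'deb http://apt.wikimedia.org/wikimedia bullseye-wikimedia thirdparty/ceph' \
        > /etc/apt/sources.list.d/ceph.list && \
    apt-get update && \
    apt-get install -y --no-install-recommends ceph ceph-common && \
    rm -rf /var/lib/apt/lists/*
```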

Or we could go further and build our own .debs from source and make images from those packages. Ceph is quite a pain to build (the build process is complex, involves a lot of git submodules, and takes a lot of time and compute resources), and Debian itself doesn't yet have packages for the latest stable upstream release in sid. I gather that we already have some software (k8s, envoy) for which we don't insist on local compilation.

I've already spoken to @JMeybohm about this, and they suggested that opening a task to ask for policy to be clarified would be useful here.

Event Timeline

I'd argue that the policy already covers this, even if it isn't (on purpose) scoped outside of the Kubernetes production realm.

The biggest issue isn't the non-Debian base, but rather the fact that we wouldn't be in control of those images. The linked document in fact states this early on:

1. Make it easy to do security updates when necessary (just rebuild all the containers & redeploy)
As further explanation, we want to not be dependent on a multitude of authors/maintainers of upstream images for upgrades/updates as this is not sustainable long term

Put very simply, if we use upstream images, we are at the mercy of the upstream image creator, not just for the software that they create (which would be reasonable) but for everything else contained in the image as well. To make matters worse, if we lack the tooling (and we would in this case), we don't even know what's in those images. What do we do then when a new vulnerability in a very low-level component like libc or runc shows up and the upstream creator isn't, for whatever reason, responsive? There are more arguments to be made of course, aside from the responsiveness of the upstream: things like visibility inside the images, the ability to use the rest of the already-built tooling, dependencies on upstream's registry infrastructure, etc.

The above point doesn't require Kubernetes. It makes sense in any kind of distribution channel. It's the production part that matters here, which is why I am arguing the policy already covers this. I see that we are talking about a PoC, but I am assuming that PoC will see some end user traffic eventually to prove itself, hence production (i.e. if this is going to be entirely outside of production, none of this applies).

The compilation part you mention is covered in Kubernetes_Infrastructure_upgrade_policy#Using_existing_upstream_binaries and I think that indeed you can utilize it and avoid the pointless compilation process. The process to build images out of those isn't trivial, but it isn't difficult either. Have reprepro fetch from upstream (there are tons of examples in the puppet repo where we fetch debs from upstream) and then submit a patch at https://gerrit.wikimedia.org/g/operations/docker-images/production-images to add an image. The structure should feel familiar (it's Debian-inspired) and there are multiple other images in the repo to copy from for the syntax.
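For the reprepro side, it's usually just an entry in conf/updates plus referencing it from the Update: field of the matching distribution in conf/distributions; roughly along these lines (the names, components and key fingerprint below are illustrative, not our actual config):

```
# Illustrative conf/updates stanza pulling upstream-built Ceph .debs into
# a local thirdparty component; names and key are placeholders.
Name: ceph-upstream
Method: https://download.ceph.com/debian-quincy/
Suite: bullseye
Components: main>thirdparty/ceph
Architectures: amd64
VerifyRelease: <upstream release key fingerprint>
```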

Thanks for your comment.

The process to build images out of those isn't trivial, but it isn't difficult either.

I was obviously unclear in what I wrote - I meant that the upstream image build isn't simply "install some packages, away we go". But it should be doable.

akosiaris claimed this task.

Thanks for posting the question and I hope I managed to help.

I'll resolve this in the interest of not having it linger open (we have enough lingering open tasks already). Feel free to reopen, though.

I won't reopen this ticket, but I would like to draw your collective attention, if I may, to T363558: Switch the DPE Ceph cluster to use cephadm management
The use case is very similar to the one discussed here, but the question there is whether it's permissible to use Docker Engine on the cephosd100[1-5] and cephadm1001 hosts.