
Gather requirements for build manifest specification
Closed, Resolved · Public

Description

The Container Cabal has agreed that consuming a bare Dockerfile from project repos would lead to non-deterministically built and unmaintainable (from a security standpoint) container images. There nevertheless needs to be a simplified, specialized specification for a build manifest that determines how the application container is eventually built. See meeting notes from 2017-02-14 and 2017-02-21.

Requirements (WIP)

  1. Base image from which the application container is derived (limited to those in a WMF-hosted registry, maintained by ops).
  2. A project's system-level dependencies (packages from limited WMF or verified third-party sources), perhaps categorized for specific purposes. For example:
    • "core" or "all" as in required for any build of the image
    • "build" as in only necessary as build-time app dependencies which can be removed before the final image is registered
    • "test" as in dependencies only used by test suites
  3. Application-level dependencies/libraries (npm, gem, pip, etc.)
    • this should probably just delegate to the package manager's own specification so as not to duplicate it
    • the compiler should know how to turn this into the corresponding Dockerfile instructions, ordered so as to produce efficiently cached intermediate FS layers
  4. Application entrypoint
  5. Test suite entrypoints?
    • defines low-level suites used to verify the built image, which can be run either serially or in parallel
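
Taken together, a manifest satisfying these requirements might look something like the following sketch. This is purely hypothetical YAML; the field names, the registry, and the nodejs-slim base image are all illustrative, not a settled schema.

```yaml
# Hypothetical build manifest (illustrative field names only)
base: docker-registry.wikimedia.org/nodejs-slim   # 1. base image from a WMF-hosted registry

packages:                     # 2. system-level dependencies, by purpose
  all: [libicu57]             #    required for any build of the image
  build: [build-essential]    #    build-time only; removable before the final image is registered
  test: [chromium]            #    only used by test suites

dependencies:                 # 3. delegate to the package manager's own spec
  npm: package.json

entrypoint: ["node", "server.js"]   # 4. application entrypoint

test:                         # 5. test suite entrypoints
  unit: ["npm", "run", "test:unit"]
  lint: ["npm", "run", "lint"]
```

For item 3, the delegation also gives the compiler what it needs for layer caching: it can emit a COPY of package.json followed by the install command before copying the rest of the source, so the dependency-install layer is only invalidated when package.json itself changes.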

Open questions

  1. Does this specification need to cover anything k8s-related, or should it be concerned only with building images (not running them)? Unlike the Dockerfile specification, the k8s spec is well thought out and declarative.

Examples in the wild

  • The Services team already has its own specification used by service-runner to generate Dockerfiles. This might make a good starting point for a more generalized spec.
  • A YAML formatted specification by Pearson that's part of their internally developed Jenkins plugin for a container based pipeline
  • A container build-/run-time tool called dgr has its own specification. It seems to be rather complicated and still relies on ad-hoc scripts (via hook entrypoints) for constructing images, but some aspects of it might be worth a look.

Event Timeline

Some related reading material on container standards (both de facto and burgeoning) that illustrates their separation of concerns and overlap.

https://coreos.com/blog/making-sense-of-standards.html

Also, it's probably worth pointing out at this point that 'manifest' is not really the right word for what we're talking about here. A manifest, as defined in all of the existing specifications, is actually a bundled filesystem image (agnostic to how it was built) plus runtime metadata about how to set up the container and how to invoke some application binary. What we've been discussing is a slightly higher-level abstraction that covers the metadata but also system-level package dependencies. The manifest + image will be a product of our build step.

Kubernetes supports parametrizing various runtime variables per container instance (sketched below): https://kubernetes.io/docs/resources-reference/v1.5/#container-v1

  • entrypoint and args
  • probes (readiness & liveness)
  • image & pull policy
  • volume mounts

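A minimal sketch of how those parameters appear in a k8s v1 container spec; names like example-service, the image tag, and the probe path are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-service
spec:
  containers:
    - name: example-service
      # image & pull policy
      image: docker-registry.wikimedia.org/example-service:latest
      imagePullPolicy: IfNotPresent
      # entrypoint and args
      command: ["node", "server.js"]
      args: ["--port", "8080"]
      # probes (readiness & liveness)
      readinessProbe:
        httpGet: { path: /_info, port: 8080 }
      livenessProbe:
        httpGet: { path: /_info, port: 8080 }
      # volume mounts
      volumeMounts:
        - name: data
          mountPath: /srv/data
  volumes:
    - name: data
      emptyDir: {}
```
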
Many of these depend on the deployment environment, and can't necessarily be statically determined at build time. It would, however, be great if the standardized build process discussed here could produce containers that provide a uniform interface to work against, by enforcing uniform conventions for:

  • entrypoints (run, test)
  • argument handling and config management (env var conventions, config templating)
  • logging
  • in-container paths for source and data (important for dev environments, bind-mounted data volumes)

dduvall lowered the priority of this task from Medium to Low. (Mar 29 2017, 7:42 PM)

Over the past few days I've been experimenting with a general build configuration format and wrote a build tool in Go, codenamed Blubber, that spits out a Dockerfile. (I wrote it in Go because, well, I unabashedly wanted to learn something new while experimenting; please don't judge my awful Go code :)

It doesn't cover all of the current requirements, but it does showcase a couple of constructs I think might be useful, related to handling different environments (image variants) and producing optimized production images. I would love feedback if anyone has a moment to take a look. Again, it's just one possible starting point and a thought experiment more than anything.
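
To give a flavor of the variants construct, here is a rough conceptual sketch. This is not Blubber's literal schema; the field names and base image are illustrative only.

```yaml
# Conceptual sketch of image variants (not Blubber's literal schema)
base: docker-registry.wikimedia.org/nodejs-slim
variants:
  build:
    packages: [build-essential]     # build-time-only system packages
    npm: { install: all }           # dev and prod dependencies
  test:
    includes: [build]               # extends the build variant
    entrypoint: ["npm", "test"]
  production:
    copies: build                   # copy built artifacts out of the build variant
    npm: { install: production }    # prod dependencies only
    entrypoint: ["node", "server.js"]
```

The point of the production variant copying artifacts out of the build variant is that the final registered image can omit compilers and dev dependencies entirely.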

> Many of these depend on the deployment environment, and can't necessarily be statically determined at build time. It would however be great if the standardized build process discussed here could produce containers that provide a uniform interface to work against, by enforcing uniform conventions for
>
>   • argument handling and config management (env var conventions, config templating)

Support for argument and env var conventions should be simple enough, but I'm not sure whether config templates can be interpolated at build time, given that they contain secrets. So this bears much more discussion and more developed use cases.
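
One way this could go (purely a hypothetical sketch, not a settled design) is to interpolate config templates at container start rather than at build time, so secrets live only in the runtime environment and never land in an image layer:

```yaml
# config.yaml.tpl: shipped in the image and rendered by the entrypoint at
# container start (e.g. with envsubst); DB_PASSWORD is injected by the
# deployment environment and is never present during the image build
db:
  host: "${DB_HOST}"
  port: 5432
  password: "${DB_PASSWORD}"
```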

>   • logging

Are you talking about a standard set of exposed ports or something more elaborate?

thcipriani assigned this task to dduvall.

Calling this one done. Blubber has an example YAML format from which it can generate Dockerfiles: https://phabricator.wikimedia.org/source/blubber/browse/master/blubber.example.yaml

We may need to revisit this at some point, but preliminarily done.