
Decision request - What buildpacks to allow and include for toolforge build service beta
Closed, ResolvedPublic

Description

Problem

For the toolforge build service beta (T267374: [tbs.beta] Create a toolforge build service beta release) we have to decide which buildpacks to allow (that means allow and test).

Some definitions first:
Note: Build and builder images are different

  • Application image -> This one is the product: it's built by the builder image by running the buildpacks on top of the build image and copying the result over to the run image.
  • Builder image -> This one defines which run and build images to use, and which buildpacks to include (it ties the run+build images to the buildpacks; it's the one that builds the final application image).
  • Run + build images -> These define the images the code will be built with and run in (e.g. they decide which OS will be running); this pair is sometimes called the stack.
  • Buildpacks -> Scripts that build and prepare the app image (they run at build time, define the build tooling and how to run the app, and are tied to the build and run images through the OS).
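To make these relationships concrete, they can be sketched as a pack builder configuration. This is a hypothetical builder.toml for `pack builder create`; the buildpack ids and image names are illustrative placeholders, not our actual choices:

```toml
# Hypothetical builder.toml -- all ids and image names are placeholders.

# Buildpacks to include in the builder.
[[buildpacks]]
uri = "docker://docker.io/heroku/example-python-buildpack"  # placeholder

# Detection order: which buildpack groups to try at build time.
[[order]]
  [[order.group]]
  id = "heroku/python"  # placeholder id

# The "stack" ties the builder to a specific pair of build and run images.
[stack]
id = "heroku-22"                            # placeholder stack id
build-image = "heroku/heroku:22-cnb-build"  # the code is built on this image
run-image = "heroku/heroku:22-cnb"          # the app image layers end up on this
```

This shows why the three decisions are bound together: the builder names the buildpacks and the stack in one place, so changing one usually means changing the others.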

There are three things to decide here, and they are bound to each other (so they are not easy to decide separately):

  • The buildpacks to use
  • The build+run image to use
  • The builder image to use

There are some upstream-built builders, run and build images, and buildpacks from different organizations:

  • Paketo
  • Heroku
  • Google
  • Us (Wikimedia Cloud Services)

Some details:

Current house-made buildpacks:

  • Languages:
    • Python
  • Build tools:
    • pip + uwsgi
  • Http Servers:
    • uwsgi
  • Base OS: Debian
  • Community buildpacks: Our own community! :)
  • Allows installing packages: limited (not tested), through our own script

Upstream officially supported buildpacks:

  • Paketo:
    • Languages:
      • Go, Java, Node.js, PHP, Python, Ruby, Static files, .Net
    • Build tools:
    • Http servers:
      • Nginx
      • Httpd
      • Your own (through a Procfile, e.g. gunicorn, uwsgi, node, ...)
    • Base OS: Ubuntu
    • Community buildpacks: they all are; Paketo is an open-source project
    • Allows installing packages: no, we would have to build our own buildpack (we might be able to cross-use the Heroku one)
  • Google Cloud Buildpacks
    • Languages:
      • Go, Java, Node.js, PHP, Python, Ruby, .Net
    • Http Servers:
      • Your own (through a Procfile)
    • Base OS: Ubuntu
    • Community buildpacks: None (you have to find your own)
    • Allows installing packages: no, we would have to build our own buildpack (we might be able to cross-use the Heroku one)
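Both upstream providers fall back to "your own server through a Procfile" for HTTP serving. As a sketch, assuming a Python tool served with gunicorn (the module path and port below are placeholders), such a Procfile would look like:

```
# Hypothetical Procfile -- "app:app" and the port are placeholders.
web: gunicorn --bind 0.0.0.0:8000 app:app
```

The buildpack only needs to install the dependencies; the Procfile tells the resulting application image what process to run.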

Constraints and risks

  • A first subset of them have to be decided before the beta, with the possibility of expanding them after.
  • We have to be able to restrict to some extent the buildpacks allowed so it's not easy to break the terms of usage of toolforge (only open-source software).
  • We have to be able to support a set of common language build flows, with best practices and how-to's
  • We will probably have to maintain the selected choice for a long time
  • We might want to change direction after/during the Beta/Alpha/RC stages
  • For the multistack buildpacks, we might have to maintain our own builder image

Decision record

https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/EnhancementProposals/Decision_record_T330102_What_buildpacks_to_allow_and_include_for_toolforge_build_service_beta

Options

Option 1

Use our own builder with a subset of the selected upstream buildpacks, starting with the Heroku Python one, and decide afterwards which ones to continue supporting.

Pros:

  • Upstream maintained buildpacks
  • Upstream maintained run and build image
  • Shared best practices with upstream platforms
  • Users might be familiar with other providers, making it easy to move between them when needed
  • Wide support of languages, build tools, and web servers
  • We can limit which buildpacks to put in the builder
  • Possibility to add selected non-official buildpacks

Cons:

  • Not Debian
  • We have to maintain our own builder image

Option 2

Use upstream builder with included official buildpacks.

Pros:

  • Upstream maintained buildpacks
  • Upstream maintained run and build image
  • Shared best practices with upstream platforms
  • Users might be familiar with other providers, making it easy to move between them when needed
  • Wide support of languages, build tools, and web servers

Cons:

  • Not Debian
  • No control over which buildpacks are included
  • No possibility to add selected non-official buildpacks

Option 3

Support our builder + buildpacks, and upstream builders too

Pros:

  • Debian based buildpacks if wanted
  • Minimal difference with current code setup (no need to change anything)
  • Upstream maintained buildpacks and stacks
  • Shared best practices with upstream platforms
  • Users might be familiar with other providers, making it easy to move between them when needed (for those that did not use our buildpacks)
  • Wide support of languages, build tools, and web servers

Cons:

  • Maintain our own buildpacks for each of the languages we want to support
  • Maintain our own base run and build images (Debian)
  • Maintain our own builder image
  • No control over which buildpacks are included (due to the upstream builders)
  • No possibility to add selected non-official buildpacks (due to the incompatibility of our build+run images with upstream buildpacks)

Option 4

Support our builder + buildpacks, and limited upstream buildpacks

Pros:

  • Debian based buildpacks if wanted
  • Minimal difference with current code setup (no need to change anything)
  • Upstream maintained buildpacks and stacks
  • Shared best practices with upstream platforms
  • Users might be familiar with other providers, making it easy to move between them when needed (for those that did not use our buildpacks)
  • Wide support of languages, build tools, and web servers
  • Control over which buildpacks are included
  • Possibility to add selected non-official buildpacks

Cons:

  • Maintain our own buildpacks for each of the languages we want to support
  • Maintain our own base run and build images (Debian)
  • Maintain our own builder image

Event Timeline

dcaro renamed this task from Decision request template - What buildpacks to allow and include for toolforge build service beta to Decision request - What buildpacks to allow and include for toolforge build service beta.Feb 20 2023, 6:02 PM
dcaro updated the task description. (Show Details)

On a quick read perhaps Option 4 provides the most value/flexibility?

> On a quick read perhaps Option 4 provides the most value/flexibility?

It's also the one that requires the most maintenance (our own buildpacks + run image + build image + our builder, plus an upstream builder with selected buildpacks).

I guess that what will make the difference is the ratio of features to maintenance (support + keeping things running).

First of all, thank you for defining terms at the top. I can tell you tried to be clear and helpful in giving everyone context for making this decision. Ideally, I would pick option 2.

My default answer would be to use as many upstream and community buildpacks and build images as possible. This means the run image is likely to be non-Debian (Ubuntu). I don't believe this is an issue, as the runtime OS shouldn't matter. That said, Ubuntu being a Debian derivative also makes that easier.

So, answering the questions:
Which buildpacks to use?
Upstream community buildpacks. Specifically, Paketo would be my choice, as it appears to be run in a much more open and community-friendly way. I would be concerned about the maintenance of any buildpack created by WMCS or the wider Wikimedia community. In my opinion, the ability to reuse shared buildpacks is one of the primary benefits of adoption.

Which build+run image to use?
I don't think the runtime image should matter, so we should definitely default to the community standard.

Which builder image to use?
This is the one place where we may wish to create and maintain our own images. I would still definitely prefer not to do this, especially for base images. However, it seems some evolving upstream options might provide enough flexibility that we don't need our own image.

@dcaro, can extensions (https://github.com/buildpacks/spec/blob/main/image_extension.md) support enough use cases to ensure we don't need to build our own builder image? If not, what about using them in conjunction with the proposed https://github.com/buildpacks/rfcs/blob/main/text/0105-dockerfiles.md ? If I understand things correctly, this would allow us to modify the builder image 'on the fly'. This should allow us to, for example, apt-get dependencies, etc. Perhaps we could even modify the run image in a similar manner.

> @dcaro, can extensions https://github.com/buildpacks/spec/blob/main/image_extension.md support enough use cases to ensure we don't need to build our own builder image?

I think you mean the build+run images, right? (the ones the application will be built on top of, and run on top of)

It has to be implemented by the platform (and Tekton does not implement it yet), so it's not yet available for us. It would be a nicer workaround than:
https://elements.heroku.com/buildpacks/heroku/heroku-buildpack-apt (only for Heroku's Ubuntu-based build+run images; note also that they use it with Heroku's multi-buildpack feature, something they implemented on their own that is not available on Tekton, though we can "fake" it by customizing the builder image, more details in T325799)
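For reference, heroku-buildpack-apt reads an Aptfile in the application repository listing the apt packages to install at build time. A minimal sketch (the package names below are examples only, not a recommendation):

```
# Hypothetical Aptfile for heroku-buildpack-apt.
# One apt package per line; these names are examples.
libpq-dev
ffmpeg
```

This is how a tool could pull in extra OS dependencies without us modifying the build+run images themselves.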

> If not, what about in conjunction with the proposed https://github.com/buildpacks/rfcs/blob/main/text/0105-dockerfiles.md ? If I understand things correctly, this would allow us to modify the builder image 'on the fly'. This should allow us to for example apt-get dependencies, etc. Perhaps we could even modify the run image in a similar manner.

This one builds on top of the spec defined in the previous one (see https://github.com/buildpacks/rfcs/issues/224). It's an implementation of that spec on a "platform", pack in this case ("platform" is buildpacks lingo for the software that drives the buildpacks+lifecycle images; pack is one, Tekton is another, Heroku has its own, ...).

So yes, these will allow us to customize the build+run images in ways that buildpacks can't, but it might take some time for them to get implemented in Tekton, so we can't rely on them right now.
As an alternative, we can use the above buildpack + the Heroku build+run images, but there might not be an alternative for other build+run images (e.g. Paketo's or our own, though we partially cloned it here https://gerrit.wikimedia.org/r/plugins/gitiles/operations/docker-images/toollabs-images/+/refs/heads/master/bullseye0/build/install-packages and adapted it to Debian).
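As a sketch of what the Dockerfiles RFC proposes (the syntax follows the RFC draft and may change; the installed package is a placeholder), a run-image extension would look roughly like:

```dockerfile
# Hypothetical run.Dockerfile per RFC 0105 (draft; syntax may change).
# The platform passes the current run image in as a build argument.
ARG base_image
FROM ${base_image}
USER root
# Example only: install an extra OS package into the run image on the fly.
RUN apt-get update && apt-get install -y --no-install-recommends ffmpeg \
    && rm -rf /var/lib/apt/lists/*
```

The appeal is that each tool could extend the stock build and run images per-build, instead of us maintaining forked images; the blocker, as noted above, is platform support in Tekton.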

dcaro updated the task description. (Show Details)
dcaro triaged this task as High priority.
dcaro updated the task description. (Show Details)

Was a custom builder actually ever implemented? If not, do we want to re-visit that decision or should I just file a task for implementing it?

We discovered how to add custom buildpacks to the upstream builder, so we switched to using the upstream builder (as that was the only reason for us to have our own).

Yes, but that seems to come with the cost of increased complexity and risk (having to update the version in T353566#9410605) and making local testing more difficult?

> Yes, but that seems to come with the cost of increased complexity and risk (having to update the version in T353566#9410605) and making local testing more difficult?

That issue would still be there: upgrading our builder to use the latest supported upstream would still require us to do everything in that upgrade procedure (including layering the buildpacks to work as cloud-native, updating their versions, ...).
It would also require us to package the buildpacks in one of the currently supported formats (a container image, or a tarfile with the image layers), which would mean creating a new image for each extra buildpack too.

It would, however, allow using the injected buildpacks locally with pack, as they would already be bundled with the builder image.
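Packaging an extra buildpack in one of the supported formats can be sketched with pack's package configuration. This is a hypothetical package.toml; the path and image reference are placeholders:

```toml
# Hypothetical package.toml for `pack buildpack package`.
[buildpack]
uri = "."  # path to the buildpack directory (placeholder)

# Optional: bundle dependent buildpacks as well.
# [[dependencies]]
# uri = "docker://docker.io/example/other-buildpack:1.0"  # placeholder
```

The result can then be published as a container image (`pack buildpack package my-bp --config package.toml --publish`) or saved as a file with `--format file`, which is the per-buildpack image overhead mentioned above.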