[toolforge, toolforge-cli] Experiment with PyInstaller to package CLI tools for buildpack images
Closed, Declined. Public

Description

As a potential solution, or at least a temporary workaround, to make our CLI tools available in buildservice containers, we're tentatively exploring the use of PyInstaller (docs) to create standalone executables.

  • Suggested by @dcaro as a way to package existing Python CLIs for injection into buildpack images
  • PyInstaller bundles the Python runtime, potentially allowing the same binary to run on different Linux systems regardless of which Python (if any) is installed there
  • Initial tests show some challenges with glibc version mismatches between build and target environments:
slavina@pyre:~$ pyenv global 3.12
slavina@pyre:~$ python --version
slavina@pyre:~$ ./test
[3238867] Failed to load Python shared library '/tmp/_MEIFXJYLl/libpython3.12.so.1.0': dlopen: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.38' not found (required by /tmp/_MEIFXJYLl/libpython3.12.so.1.0)
slavina@pyre:~$ ldd --version
ldd (Debian GLIBC 2.36-9+deb12u7) 2.36

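The binary was built against glibc 2.39 on Fedora, and the test ran on Debian bookworm, which ships glibc 2.36. The bundled libpython3.12.so.1.0 requires symbols versioned GLIBC_2.38, which the older libm.so.6 on the target does not provide. The Python version on the system executing the binary (or whether Python is installed at all) shouldn't matter, as that is the whole point of PyInstaller: it bundles a Python interpreter with all the libs.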
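To make the mismatch concrete, here is a minimal sketch that extracts the required glibc version from a loader error like the one above and compares it with a target version. The helper names (`required_glibc`, `host_glibc`, `is_compatible`) are made up for illustration and are not part of PyInstaller or any Toolforge tool.

```python
import platform
import re


def parse_version(text):
    """Turn '2.38' into (2, 38) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def required_glibc(error_message):
    """Return the highest GLIBC_x.y version named in a loader error, or None."""
    found = re.findall(r"GLIBC_(\d+(?:\.\d+)+)", error_message)
    return max((parse_version(v) for v in found), default=None)


def host_glibc():
    """Return the glibc version of the running system, or None if it is not glibc."""
    name, version = platform.libc_ver()
    if name == "glibc" and version:
        return parse_version(version)
    return None


def is_compatible(required, available):
    """A binary needing `required` can load only if the target glibc is at least that new."""
    return available >= required


if __name__ == "__main__":
    message = "version `GLIBC_2.38' not found (required by libpython3.12.so.1.0)"
    needed = required_glibc(message)
    print("required:", needed)
    print("this host:", host_glibc())
    print("runs on glibc 2.36:", is_compatible(needed, parse_version("2.36")))
```

Comparing tuples rather than strings matters here, because a plain string comparison would order "2.4" after "2.36".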

Next steps

  1. Investigate building the binary in a container matching our buildpack base image (Ubuntu 22.04 LTS)
  2. Attempt to package one of our existing CLI tools
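As a rough sketch of step 1, the build could be run inside a container so that the binary links against the same glibc as the target image. Everything specific here is an assumption: the image name is a placeholder for an Ubuntu 22.04 based image that already has Python and PyInstaller installed, and the entry script path is hypothetical. Only the `docker run` and `pyinstaller` flags themselves are real.

```python
import os
import shlex


def pyinstaller_build_command(entry_script, name, image, source_dir="."):
    """Build the argv for running PyInstaller inside a container.

    `image` is assumed to be based on the buildpack base image and to
    ship Python plus PyInstaller; no such image is defined in this task.
    """
    return [
        "docker", "run", "--rm",
        # Mount the project so the resulting binary lands on the host.
        "--volume", f"{os.path.abspath(source_dir)}:/src",
        "--workdir", "/src",
        image,
        # --onefile produces a single self-extracting executable.
        "pyinstaller", "--onefile", "--name", name,
        "--distpath", "dist",
        entry_script,
    ]


if __name__ == "__main__":
    command = pyinstaller_build_command(
        entry_script="cli.py",                        # hypothetical entry point
        name="toolforge",
        image="example/pyinstaller-builder:22.04",    # placeholder image name
    )
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(command))
```

Building on the oldest glibc that needs to be supported is the usual approach, since glibc is backwards compatible but not forwards compatible.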

Considerations

  • Requires rebuilding images when CLI tools are updated
  • The webservice CLI might need additional testing
  • Long-term, we may want to continue efforts on OpenAPI CLI generation for a more stable solution

Event Timeline

Slst2020 removed dcaro as the assignee of this task.
  • Requires rebuilding images when CLI tools are updated

This reminded me that this was precisely one of the main problems with embedding the toolforge binaries (notably tools-webservices) in container images, and what triggered the current CLI/API architecture in the first place.

I'm not saying this initiative is wrong, but it definitely triggered a reflection.

What alternatives are there? Is there any option right now other than NFS for mounting packages/binaries into the containers? And raw API calls are not user friendly.

I don't have any :-(

I think previously the narrative was: use the raw API directly.

If the API changes, then the in-container code needs to change to keep up. Whether I am deploying updated locally maintained code or centrally maintained client or SDK code, I still need to rebuild the container. Strict API deprecation policies and backwards compatibility shims are potential partial solutions.

Yes, I agree, that's why since day 1 we have versioned API endpoints, with things like /v1/ in the path.

More specifically T356974: [builds-api,jobs-api,envvars-api,api-gateway] Figure out and document how to do non-backwards compatible changes
We are almost there on that one: next Thursday we deploy the last big API path change, and then we can start the deprecation protocol specified there and announce a stable API that people can use (following that protocol for any backwards-incompatible changes).

With that and some stats, we can at least track usage of the deprecated APIs (and the CLIs using them) and, hopefully, the tools doing so, and then communicate those changes better, help people migrate, create alternatives, etc.
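A minimal sketch of what such tracking could look like, assuming request logs give us a path and a user agent per call. The set of deprecated versions, the example paths, and the user agent strings are all invented for illustration; the real protocol is whatever T356974 specifies.

```python
import re
from collections import Counter

# Assumption for illustration only: treat v1 as the deprecated version.
DEPRECATED_VERSIONS = {"v1"}

# Matches a version segment such as /v1/ anywhere in the path.
VERSION_IN_PATH = re.compile(r"/(v\d+)(?:/|$)")


def api_version(path):
    """Return the version segment of a request path, or None if there is none."""
    match = VERSION_IN_PATH.search(path)
    return match.group(1) if match else None


def deprecated_usage(requests):
    """Count calls to deprecated API versions per user agent.

    `requests` is an iterable of (path, user_agent) pairs.
    """
    counts = Counter()
    for path, user_agent in requests:
        if api_version(path) in DEPRECATED_VERSIONS:
            counts[user_agent] += 1
    return counts


if __name__ == "__main__":
    log = [
        ("/jobs/v1/tool/example/jobs/", "old-cli/1.0"),   # hypothetical entries
        ("/jobs/v2/tool/example/jobs/", "new-cli/2.0"),
        ("/builds/v1/tool/example/builds", "old-cli/1.0"),
    ]
    for agent, hits in deprecated_usage(log).most_common():
        print(agent, hits)
```

Grouping by user agent is one way to identify which CLI versions still call deprecated endpoints, provided the clients send an identifiable user agent.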

The need to rebuild containers might be eased if we do something like auto-updating CLIs (e.g. checking for the latest version and installing it on first run). That might break some people's flows, but no more than changing the API and leaving the old CLIs running would.
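The auto-update idea could be sketched roughly as below. How the latest version is discovered and how it gets installed are left as injected callables, since neither mechanism is defined in this task; the function names are hypothetical.

```python
def parse_version(text):
    """Turn '0.10.0' into (0, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def needs_update(installed, latest):
    """True when the published version is newer than the installed one."""
    return parse_version(latest) > parse_version(installed)


def ensure_latest(installed, fetch_latest, install):
    """Check for a newer CLI on first run and install it if there is one.

    `fetch_latest` returns the latest version string and `install` takes a
    version string; both are placeholders for whatever mechanism is chosen.
    Returns the version that is in place afterwards.
    """
    latest = fetch_latest()
    if needs_update(installed, latest):
        install(latest)
        return latest
    return installed


if __name__ == "__main__":
    installed_log = []
    current = ensure_latest(
        "0.9.0",
        fetch_latest=lambda: "0.10.0",      # stand-in for a real lookup
        install=installed_log.append,       # stand-in for a real installer
    )
    print(current, installed_log)
```

Keeping the check separate from the install step makes it possible to offer an opt-out for people whose flows would break on an unexpected upgrade.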

That becomes much easier if the CLIs can be deployed as standalone binaries, which avoids having to support multiple package manager repositories (e.g. Ubuntu/Debian) and minimizes the dependencies (hence this task to explore it).
It would also eventually help with distributing the clients to users (not addressed here yet).

dcaro triaged this task as High priority. Jul 10 2024, 3:42 PM

We are moving ahead with Go generated code instead.