Add zstd package to tf-bullseye-std toolforge-jobs image
Closed, Resolved · Public · Feature

Description

Zstandard is a useful compression algorithm/program, available on Toolforge since 2019 (T225380). I think it would be nice to have it in the toolforge-bullseye-standalone container image for the toolforge-jobs framework as well.

(Currently, the image has gzip, but not bzip2, xz or zstd.)

Event Timeline

One use case I had in mind was to extract and compress older logs from uwsgi.log, using a jobs framework to reduce NFS load on the bastion. I did this a while ago for the lexeme-forms tool, running the necessary jobs on the grid, and it worked fine; for the wd-image-positions tool, I now wanted to do it with toolforge-jobs instead, and found that zstd wasn’t available in the container image I’d chosen. (Unfortunately I only noticed that the compressed files were empty after I deleted the original log, so all the pre-2021 data is now gone, but that’s my own fault for not checking the compressed files before overwriting the full log.)
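The check-before-delete step that was skipped above could be sketched roughly as follows. This is a minimal illustration, not part of any Toolforge tooling: it assumes Python 3 and the `zstd` command-line program on `PATH`, the function name `compress_then_remove` is hypothetical, and it compresses a whole file rather than extracting only the older log lines.

```python
import shutil
import subprocess
from pathlib import Path


def compress_then_remove(source: Path, tool: str = "zstd") -> Path:
    """Compress `source` with the zstd CLI; delete it only after verification.

    Hypothetical helper for illustration. Raises before touching `source`
    if the compressor is missing or the compressed output looks wrong.
    """
    # Fail early when the compressor is not installed in the image.
    if shutil.which(tool) is None:
        raise RuntimeError(f"{tool} not found on PATH; leaving {source} untouched")

    target = source.with_name(source.name + ".zst")

    # -q: quiet, -f: overwrite an existing output file, -o: output path.
    # zstd keeps the input file by default, so nothing is lost at this point.
    subprocess.run([tool, "-q", "-f", str(source), "-o", str(target)], check=True)

    # -t: test the integrity of the compressed file without writing output.
    subprocess.run([tool, "-q", "-t", str(target)], check=True)

    # Even an empty input produces a non-empty zstd frame, so a zero-byte
    # result always indicates a failed compression.
    if target.stat().st_size == 0:
        raise RuntimeError(f"{target} is empty; leaving {source} untouched")

    source.unlink()
    return target
```

With a missing compressor, as in the container image described here, the call raises and the original log stays in place instead of being replaced by an empty archive.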

This seems very reasonable to me; zstd plus libzstd is only about 3 MB. It also seems likely that apt support for zstd will land in Debian soon (see https://balintreczey.hu/blog/hello-zstd-compressed-debs-in-ubuntu/), so zstd is going to end up in the container eventually.

bd808 changed the subtype of this task from "Task" to "Feature Request". Apr 6 2022, 10:01 PM

Change 842992 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[operations/docker-images/toollabs-images@master] bullseye: add bzip2 and zstd compression programs

https://gerrit.wikimedia.org/r/842992

bd808 changed the task status from Open to In Progress. Oct 17 2022, 12:56 AM
bd808 claimed this task.

My patch actually adds zstd to bullseye-base, which has a distinct lineage from bullseye-standalone. See T321612: Make tf-bullseye-std point to toolforge-bullseye-sssd rather than toolforge-bullseye-standalone for my argument that we should change which image is actually accessed when using --image tf-bullseye-std. I don't see much value at all in adding this particular compression program to an image ("toolforge-bullseye-standalone") that only exists to provide less attack surface for statically compiled tools (T277749: [Toolforge] Generic webservice not working on Kubernetes).

Change 842992 merged by jenkins-bot:

[operations/docker-images/toollabs-images@master] bullseye: add bzip2 and zstd compression programs

https://gerrit.wikimedia.org/r/842992