
Install zstd package on Toolforge
Closed, Resolved · Public

Description

Zstandard/zstd is a compression algorithm/program that achieves roughly the same compression ratios as DEFLATE/gzip with significantly less CPU time. I think it would be useful as an option for compressing e.g. log files or backups. It’s available in stretch/main (zstd 1.1.2-1) and stretch-backports/main (zstd 1.3.8+dfsg-3~bpo9+1).

Event Timeline

@RomaAmorRoma: Please do not assign a task to folks who cannot fix that task.

@LucasWerkmeister do you have a specific use case that requires this or is it more of a "this is a neat thing to play with" sort of request?

I was thinking of a pipeline like this:

ssh toolforge 'mysqldump | zstd' | zstd -d | xz > dump.xz

zstd’s job in this case would be to reduce network traffic without adding too much load on the bastion host; afterwards, on my system, I can trade more of my own CPU time for better compression.

Here are some timings of compressing an example SQL dump:

$ time zstd < 2019-06-08.sql > 2019-06-08.sql.zstd
real    0m0,225s
user    0m0,199s
sys     0m0,027s
$ time gzip < 2019-06-08.sql > 2019-06-08.sql.gz
real    0m1,024s
user    0m0,996s
sys     0m0,027s
$ time bzip2 < 2019-06-08.sql > 2019-06-08.sql.bz2
real    0m2,629s
user    0m2,568s
sys     0m0,017s
$ time xz < 2019-06-08.sql > 2019-06-08.sql.xz
real    0m20,911s
user    0m20,751s
sys     0m0,060s
$ du -sh 2019-06-08.sql*
34M     2019-06-08.sql
6,4M    2019-06-08.sql.bz2
8,4M    2019-06-08.sql.gz
4,5M    2019-06-08.sql.xz
8,3M    2019-06-08.sql.zstd

This doesn’t require zstd, of course; I just think it would be a good choice here (gzip needs ca. 5× the CPU time for the same compression ratio).
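The comparison can be checked with a little arithmetic on the figures quoted above. The following is only a sketch: the function name `summarize` is made up for illustration, CPU time is taken as user + sys, the decimal commas from the timings are rewritten as points, and the sizes are the rounded values printed by du, so the ratios are approximate.

```shell
#!/bin/sh
# Hypothetical helper: summarize the timings and sizes quoted in this task.
# Input columns: tool, user seconds, sys seconds, compressed size in MB (du -sh).
summarize() {
    # LC_ALL=C so awk parses "0.199" the same way regardless of the user's locale.
    LC_ALL=C awk -v orig=34 '
        NR == 1 { base = $2 + $3 }   # first row (zstd) is the CPU-time baseline
        {
            cpu = $2 + $3            # CPU time = user + sys
            printf "%-5s cpu=%.3fs (%.1fx zstd) ratio=%.1f\n", $1, cpu, cpu / base, orig / $4
        }
    ' <<'EOF'
zstd  0.199  0.027 8.3
gzip  0.996  0.027 8.4
bzip2 2.568  0.017 6.4
xz    20.751 0.060 4.5
EOF
}

summarize
```

With sys time included, gzip comes out at roughly 4.5× the CPU time of zstd; on user time alone it is about 5×. Both reach a compression ratio of about 4 on this dump.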

Change 530547 had a related patch set uploaded (by BryanDavis; owner: Bryan Davis):
[operations/puppet@production] toolforge: provision zstd

https://gerrit.wikimedia.org/r/530547

Change 530547 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: provision zstd

https://gerrit.wikimedia.org/r/530547

Change 530551 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: grid_environ: zstd is only available starting with startch

https://gerrit.wikimedia.org/r/530551

Change 530551 abandoned by Arturo Borrero Gonzalez:
toolforge: grid_environ: zstd is only available starting with startch

Reason:
We don't need this per https://gerrit.wikimedia.org/r/c/operations/puppet/+/530547#message-c68ae7e721eac942acccc1569c15d62d6a07f4a1

https://gerrit.wikimedia.org/r/530551

bd808 claimed this task.
$ sudo -i puppet agent -tv
...snip...
Notice: /Stage[main]/Packages::Zstd/Package[zstd]/ensure: created