mysqldump is not present in Kubernetes container images
Closed, Resolved / Public / Feature

Description

Hi, I'm working on porting Citation Hunt batch jobs to Kubernetes, which basically means running this script as a cron job every few days to update the tool's database.

The first thing the script does is use mysqldump to create a backup of the existing database, but it seems that, when run on Kubernetes, /usr/bin/mysqldump doesn't exist. If it helps, the ID of the container image is:

docker-pullable://docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base@sha256:39273508641cfe0eda6edbd8eb76852d05cc7ab58f025c71f514a7bee1f7ed2a

Would it be reasonable to have mysqldump in the images? Or should I continue to run backup jobs on the grid?
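
A script that has to run both on the grid and in a container could check for its tools up front, so a missing mysqldump produces a clear message instead of a failure partway through. A minimal sketch; require_tools is a hypothetical helper name, not something shipped in the images:

```shell
# Return 0 if every named tool is on PATH; name the missing ones on stderr.
require_tools() {
  rt_missing=0
  for rt_tool in "$@"; do
    if ! command -v "$rt_tool" >/dev/null 2>&1; then
      echo "missing required tool: $rt_tool" >&2
      rt_missing=1
    fi
  done
  return "$rt_missing"
}

# Possible guard at the top of a backup script:
# require_tools mysqldump gzip || exit 1
```

The guard only reports the problem; it does not make the binary available in the image.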

Event Timeline

We could create a "mysql" container that has mysql, mysqldump, etc. on top of the standalone image.

@MusikAnimal, would a solution like this, basically making an image mostly for doing mysqldump actions, work for the set of tools you have indicated are blocked on this task?

Yes, I think so. The remaining grid engine jobs for musikbot and grantmetrics are only mysql backup scripts written in bash. commtech-commons has one script that needs Python 3, but on a quick review it seems I should be able to migrate that one (I'm not sure why I didn't already). The other cron job is a cleanup script that uses the sql local command, but if that shortcut is not available we can use the longer-form mysql equivalent. That script also uses logrotate, if that matters.

The other cron job is a cleanup script that uses the sql local command, but if that shortcut is not available we can use the longer-form mysql equivalent.

sql comes from our local "misctools" deb package. That package also includes the become, take, and oge-crontab scripts. become and oge-crontab don't make a lot of sense from inside a Kubernetes container, but they are tiny and wouldn't really hurt anything.

That script also uses logrotate, if that matters.

That seems like a reasonable package to add to the same image too.
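
For reference, the "longer-form mysql equivalent" of the sql local shortcut mentioned above might look roughly like the sketch below. It assumes the shortcut amounts to running the mysql client against ToolsDB with the tool's replica.my.cnf credentials. The hostname is an assumption and should be checked against the Toolforge documentation, and toolsdb_mysql is a hypothetical helper name:

```shell
# Assumption: ToolsDB hostname; confirm the current value in the Toolforge docs.
TOOLSDB_HOST="${TOOLSDB_HOST:-tools.db.svc.wikimedia.cloud}"

# Long-form stand-in for the `sql local` shortcut: run the mysql client
# against the remote server using the tool's credentials file.
# --defaults-file must come first on the mysql command line.
toolsdb_mysql() {
  mysql --defaults-file="$HOME/replica.my.cnf" --host="$TOOLSDB_HOST" "$@"
}

# Possible use in a cleanup script (database name taken from this task):
# toolsdb_mysql s53550__grantmetrics < cleanup.sql
```

Passing the host explicitly matters in a container, because there is no local database server for the client to fall back to.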

Change 842993 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[operations/docker-images/toollabs-images@master] mysql: new image for mysql backups

https://gerrit.wikimedia.org/r/842993

bd808 changed the task status from Open to In Progress. (Oct 17 2022, 12:56 AM)
bd808 claimed this task.
bd808 changed the subtype of this task from "Task" to "Feature Request".

Change 842993 merged by jenkins-bot:

[operations/docker-images/toollabs-images@master] mariadb: new image for mariadb/mysql backups

https://gerrit.wikimedia.org/r/842993

Mentioned in SAL (#wikimedia-cloud-feed) [2023-02-17T11:17:06Z] <wm-bot2> deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/image-config (7729b18) (T254636) - cookbook ran by arturo@endurance

Mentioned in SAL (#wikimedia-cloud-feed) [2023-02-17T11:31:37Z] <wm-bot2> deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/image-config (7729b18) (T254636) - cookbook ran by arturo@endurance

aborrero subscribed.

Done:

tools.arturo-test-tool@tools-sgebastion-11:~$ toolforge-jobs images
Short name    Container image URL
------------  ----------------------------------------------------------------------
bullseye      docker-registry.tools.wmflabs.org/toolforge-bullseye-sssd:latest
golang1.11    docker-registry.tools.wmflabs.org/toolforge-golang111-sssd-base:latest
jdk17         docker-registry.tools.wmflabs.org/toolforge-jdk17-sssd-base:latest
mariadb       docker-registry.tools.wmflabs.org/toolforge-mariadb-sssd-base:latest
mono6.8       docker-registry.tools.wmflabs.org/toolforge-mono68-sssd-base:latest
node16        docker-registry.tools.wmflabs.org/toolforge-node16-sssd-base:latest
perl5.32      docker-registry.tools.wmflabs.org/toolforge-perl532-sssd-base:latest
php7.4        docker-registry.tools.wmflabs.org/toolforge-php74-sssd-base:latest
python3.9     docker-registry.tools.wmflabs.org/toolforge-python39-sssd-base:latest
ruby2.1       docker-registry.tools.wmflabs.org/toolforge-ruby21-sssd-base:latest
ruby2.7       docker-registry.tools.wmflabs.org/toolforge-ruby27-sssd-base:latest
tcl8.6        docker-registry.tools.wmflabs.org/toolforge-tcl86-sssd-base:latest
tools.arturo-test-tool@tools-sgebastion-11:~$ toolforge-jobs run mariadb --command 'sleep 3600' --image mariadb
tools.arturo-test-tool@tools-sgebastion-11:~$ kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE
mariadb-49nnz           1/1     Running   0          8s
test-6d76568d94-mbzrh   1/1     Running   1          23d
test2-bcb6c74d9-6ncd7   1/1     Running   0          3d16h
tools.arturo-test-tool@tools-sgebastion-11:~$ kubectl exec -it mariadb-49nnz -- bash
tools.arturo-test-tool@mariadb-49nnz:~$ sql -h
usage: sql [-h] [-v] [-N] [--cluster {analytics,web}] DATABASE ...
[..]
tools.arturo-test-tool@mariadb-49nnz:~$ mysql --help
mysql  Ver 15.1 Distrib 10.5.18-MariaDB, for debian-linux-gnu (x86_64) using  EditLine wrapper
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Usage: mysql [OPTIONS] [database]
[..]

thanks everyone involved!

Hi! I tried using this but ran into some problems (T319779#9278315):

tools.grantmetrics@tools-sgebastion-10:~$ toolforge-jobs run backup --command ./var/backups/backup.sh --image mariadb
tools.grantmetrics@tools-sgebastion-10:~$ cat backup.err 
mysqldump: Got error: 2002: "Can't connect to local server through socket '/run/mysqld/mysqld.sock' (2)" when trying to connect
ERROR 2002 (HY000): Can't connect to local server through socket '/run/mysqld/mysqld.sock' (2)

The relevant part of the script:

GMDIR=/data/project/grantmetrics
DATE=$(date +%Y-%m-%d)

# Dump and compress whole database.
mysqldump --defaults-extra-file=$GMDIR/replica.my.cnf \
  --single-transaction s53550__grantmetrics \
  | gzip -c > $GMDIR/var/backups/grantmetrics_$DATE.sql.gz

I thought maybe the directory was simply wrong in the context of the k8s container, but I printed $PWD and saw it's still the right path. So something else is preventing it from connecting to the tools-db.

@MusikAnimal Surely the mysqldump command needs a --host argument? The error message indicates it's trying to connect to the local mysql socket, but there's no mariadb/mysql running on the container itself, so it errors out.

Thanks! I had tried that the other day and got the same error. Your comment, however, prompted me to look again, only to find there was simply a second mysql command that was missing the host :face_palm:

Sorry for the noise!
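
To summarize the fix discussed above: every mysql and mysqldump call in the script needs an explicit host, since the container has no local server socket. A hedged sketch of the dump step with the host added; the hostname is an assumption to verify against the Toolforge docs, and dump_database is a hypothetical wrapper, not the script from this task:

```shell
# Assumption: ToolsDB hostname; confirm against the Toolforge documentation.
DB_HOST="${DB_HOST:-tools.db.svc.wikimedia.cloud}"

# Dump one database from the remote server and gzip the result.
# $1 = credentials file, $2 = database name, $3 = output path
dump_database() {
  mysqldump --defaults-extra-file="$1" \
    --host="$DB_HOST" \
    --single-transaction "$2" \
    | gzip -c > "$3"
}

# Possible call, reusing the paths from the script quoted earlier:
# GMDIR=/data/project/grantmetrics
# dump_database "$GMDIR/replica.my.cnf" s53550__grantmetrics \
#   "$GMDIR/var/backups/grantmetrics_$(date +%Y-%m-%d).sql.gz"
```

Without --host, the client tries the local socket and fails with the 2002 error shown in backup.err.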

For anyone who lands here via search, the docs have a cleaner recommendation for backups than the script at T254636#9385420.