We run wmf-pt-kill on the labsdb hosts to kill long queries.
Right now the service runs as a daemon with the following options:
wmf-pt-+ 28800 0.0 0.0 145144 66488 ? Ss Aug16 0:17 perl /usr/bin/wmf-pt-kill --daemon --print --kill --victims all --interval 10 --busy-time 3600 --match-command Query|Execute --match-user ^[spu][0-9] --log /var/log/wmf-pt-kill/wmf-pt-kill.log -S /run/mysqld/mysqld.sock F=/dev/null
This was originally packaged in T203674.
The most recent package was built a few months ago in T248843, for Buster and the new pt-kill version.
The new clouddb hosts will run multi-instance, so the default socket (-S /run/mysqld/mysqld.sock) won't work: each mysql instance will have its own socket location, as on our normal multi-instance hosts:
root@db1099:/run/mysqld# ls
mysqld.s1.sock  mysqld.s8.sock
We should change the wmf-pt-kill puppet code to accept a socket location, perhaps derived from the hiera files. For instance, this is the hiera file of a multi-instance host:
cat hieradata/hosts/db1099.yaml
# db1099
# Buffer pool sizes/instance enabled
profile::mariadb::core::multiinstance::num_instances: 2
profile::mariadb::core::multiinstance::s1: '185G'
profile::mariadb::core::multiinstance::s8: '185G'
Maybe wmf-pt-kill can use those s1 and s8 keys and attach itself to the matching sockets, since they are named mysqld.sX.sock. The new clouddb hosts will probably need this sort of hiera file anyway to run multi-instance.
If there is no hiera file and/or no multi-instance configuration in it, wmf-pt-kill should just assume the default socket (like we do now), as both systems (the old single-instance and the new multi-instance) will live in parallel for some time.
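To make the proposal concrete, here is a rough sketch of the socket selection logic, written in Python rather than puppet purely for illustration. The function name `sockets_for_host`, the constants, and the assumption that every section key looks like `s<number>` are hypothetical; the real change would live in the wmf-pt-kill puppet code and may end up looking quite different.

```python
import re

# Default single-instance socket, as used by the current daemon.
DEFAULT_SOCKET = "/run/mysqld/mysqld.sock"

# Hiera key prefix taken from the db1099 example above.
PREFIX = "profile::mariadb::core::multiinstance::"

# Assumption: section keys look like s1, s8, ... (mysqld.sX.sock).
SECTION_RE = re.compile(r"^s[0-9]+$")


def sockets_for_host(hiera):
    """Return the sockets wmf-pt-kill should attach to on one host.

    `hiera` is the host's hiera data as a dict, or None/empty when the
    host has no hiera file. Keys such as num_instances are ignored
    because they do not match SECTION_RE.
    """
    sections = sorted(
        key[len(PREFIX):]
        for key in (hiera or {})
        if key.startswith(PREFIX) and SECTION_RE.match(key[len(PREFIX):])
    )
    if not sections:
        # No multi-instance configuration: fall back to the default socket.
        return [DEFAULT_SOCKET]
    return ["/run/mysqld/mysqld.%s.sock" % section for section in sections]
```

With the db1099 data this would yield one socket per section (s1 and s8), so one wmf-pt-kill instance could be started per socket; a host without multi-instance keys would keep the single default socket.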
Maybe @Kormat can take the lead on this and discuss some different approaches with cloud-services-team?