Our kafka clusters currently run with different uids/gids, which is not great when dealing with OS upgrades.
I added some code to support a standardized uid/gid for kafka, 916, that we should roll out to our clusters.
The idea, for each cluster, is to do the following:
- Disable puppet on the target cluster
- File a change like the following and merge it: https://gerrit.wikimedia.org/r/c/operations/puppet/+/743163
- For every node:
  - Stop kafka and kafka mirror
  - Execute the script below
  - Re-enable puppet and run it (to bring back the kafka daemons and make sure that the new code works fine)
#!/bin/bash
set -x

change_uid() {
    # $1 new uid
    # $2 username
    if id "$2" &>/dev/null
    then
        OLD_UID=$(id -u "$2")
        usermod -u "$1" "$2"
        # Re-own files that still carry the old uid. -r skips chown when find
        # returns nothing; -h changes symlinks themselves, not their targets.
        find / \( -path /proc -o -path /mnt -o -path /sys -o -path /dev -o -path /media \) -prune -false -o -user "$OLD_UID" -print0 | xargs -0 -r chown -h "$1"
    fi
}

change_gid() {
    # $1 new gid
    # $2 group name
    if getent group "$2" &>/dev/null
    then
        OLD_GID=$(getent group "$2" | cut -d ":" -f 3)
        groupmod -g "$1" "$2"
        # Same as above, for files that still carry the old gid.
        find / \( -path /proc -o -path /mnt -o -path /sys -o -path /dev -o -path /media \) -prune -false -o -group "$OLD_GID" -print0 | xargs -0 -r chgrp -h "$1"
    fi
}

## kafka
change_uid 916 kafka
change_gid 916 kafka
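The per-node steps could be wrapped roughly as follows. This is only a sketch: the unit names (`kafka`, `kafka-mirror`) and the path where the script above is saved (`/root/kafka-change-uid.sh`) are assumptions, not taken from the actual hosts, and the plain `puppet agent` commands may need to be swapped for whatever wrappers are normally used. It defaults to a dry run that only prints the commands.

```shell
#!/bin/bash
# Sketch of the per-node steps. The unit names and the script path below are
# assumptions; adjust them to what is actually deployed on the host.
KAFKA_UNITS=${KAFKA_UNITS:-"kafka kafka-mirror"}      # hypothetical unit names
UID_SCRIPT=${UID_SCRIPT:-/root/kafka-change-uid.sh}   # hypothetical path

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

migrate_node() {
    local unit
    # Stop the daemons first so nothing runs under the old uid/gid.
    for unit in $KAFKA_UNITS; do
        run systemctl stop "$unit"
    done
    # Change the uid/gid and re-own the files.
    run bash "$UID_SCRIPT"
    # Re-enable puppet and run it; puppet brings the daemons back.
    run puppet agent --enable
    run puppet agent --test
}
```

Calling `migrate_node` as-is prints the five commands without touching the system; `DRY_RUN=0 migrate_node` (as root) would execute them.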
I have tested the procedure on the Kafka test cluster and it worked fine :)
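A quick check along these lines could confirm the result on a node after the script has run. It is a sketch rather than part of the tested procedure; the function names are made up for illustration.

```shell
#!/bin/bash
# Post-migration checks (sketch; function names are illustrative).

# Return 0 when the user's uid and the group's gid both equal the expected id.
ids_match() {
    # $1 expected id, $2 user name, $3 group name (defaults to the user name)
    local want=$1 user=$2 group=${3:-$2}
    local uid gid
    uid=$(id -u "$user" 2>/dev/null) || return 1
    gid=$(getent group "$group" | cut -d ":" -f 3)
    [ "$uid" = "$want" ] && [ "$gid" = "$want" ]
}

# Count entries under a directory that still belong to a given numeric uid.
# Stays on one filesystem (-xdev); unreadable paths are silently skipped.
count_owned_by_uid() {
    # $1 directory, $2 numeric uid
    find "$1" -xdev -user "$2" -print 2>/dev/null | wc -l
}
```

For example, `ids_match 916 kafka && echo ok` should print `ok` on a migrated node, and `count_owned_by_uid / "$OLD_UID"` should report 0 leftovers, where `OLD_UID` is the uid the node had before the change.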
Clusters to move:
- Jumbo (Data Engineering)
- Test
- Main eqiad (ServiceOps)
- Main codfw (ServiceOps)
- Logging eqiad (Observability)
- Logging codfw (Observability)