We currently run our Kafka clusters with different uid/gids, which is not great when dealing with OS upgrades.
I added some code to facilitate a standardized uid/gid for kafka, 916, that we should roll out to our clusters.
The idea, for each cluster, is to do the following:
# Disable puppet on the target cluster
# File a change like the following and merge it: https://gerrit.wikimedia.org/r/c/operations/puppet/+/743163
# For every node,
## stop kafka and kafka mirror
## execute the script below
## re-enable puppet and run it (to bring back kafka daemons and make sure that the new code works fine).
```
#!/bin/bash
set -x
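change_uid() {
    # $1 new uid
    # $2 username
    if id "$2" &>/dev/null
    then
        OLD_UID=$(id -u "$2")
        usermod -u "$1" "$2"
        # Re-own everything still owned by the old numeric uid.
        # xargs -r skips chown entirely when find matches nothing.
        find / \( -path /proc -o -path /mnt -o -path /sys -o -path /dev -o -path /media \) -prune -false -o -user "$OLD_UID" -print0 | xargs -0 -r chown "$1"
    fi
}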
change_gid() {
    # $1 new gid
    # $2 group name
    if getent group "$2" &>/dev/null
    then
        OLD_GID=$(getent group "$2" | cut -d ":" -f 3)
        groupmod -g "$1" "$2"
        # Re-group everything still owned by the old numeric gid.
        # xargs -r skips chgrp entirely when find matches nothing.
        find / \( -path /proc -o -path /mnt -o -path /sys -o -path /dev -o -path /media \) -prune -false -o -group "$OLD_GID" -print0 | xargs -0 -r chgrp "$1"
    fi
}
## kafka
change_uid 916 kafka
change_gid 916 kafka
```
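Before running the script on a node, it may be worth confirming that 916 is not already assigned to a different user or group. The following is only a sketch: `id_is_free` is a hypothetical helper, not part of the script above, and it assumes `getent` is available on the host.

```shell
#!/bin/bash
# Hypothetical pre-flight helper (an assumption, not part of the original script).
# Succeeds when the numeric id is unused, or is already held by the expected name.
id_is_free() {
    # $1 database to query: passwd or group
    # $2 numeric id to check
    # $3 name that is allowed to hold the id already
    local entry
    # getent exits non-zero when no entry has this id, so the id is free.
    entry=$(getent "$1" "$2") || return 0
    # The first colon-separated field of the entry is the user/group name.
    [ "${entry%%:*}" = "$3" ]
}

# Example invocation:
# id_is_free passwd 916 kafka && id_is_free group 916 kafka || echo "916 is taken"
```

If either check fails, the conflicting account would need to be sorted out before `usermod` / `groupmod` can assign 916.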
I have tested the procedure with Kafka test and it worked fine :)
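A check along these lines could confirm the result on a node after the script has run. It is a minimal sketch: `list_mismatched` is a hypothetical helper, and the directory in the example invocation is an assumption, not a path taken from our configuration.

```shell
#!/bin/bash
# Hypothetical post-migration helper (an assumption, not part of the original script).
# Prints every path under a directory whose owner or group differs from the expected ids.
list_mismatched() {
    # $1 directory to scan
    # $2 expected numeric uid
    # $3 expected numeric gid
    find "$1" \( ! -uid "$2" -o ! -gid "$3" \) -print
}

# Example invocation (the data directory is an assumption):
# list_mismatched /srv/kafka 916 916
```

Empty output means everything under the directory carries the expected uid and gid.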
Clusters to move:
[ ] Jumbo (Data Engineering)
[x] Test
[x] Main eqiad (ServiceOps)
[x] Main codfw (ServiceOps)
[ ] Logging eqiad (Observability)
[ ] Logging codfw (Observability)