
Rename gerrit2 unix user to gerrit and assign a fixed uid
Open, Stalled, Low, Public

Description

When switching over Gerrit (e.g. T326368) we have to rsync the LFS data, search indices and cache, which requires a follow-up chown -R gerrit2 on each affected base path. Using a central fixed uid for the user would save us from having to do the chown.

While changing the uid, I'd like to also rename the user from the confusing gerrit2 to gerrit.
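Whether the follow-up chown is needed comes down to whether the account resolves to the same numeric uid on both hosts. A minimal sketch of that check, assuming you have a copy of each host's passwd file at hand (the function names and file paths are hypothetical, not part of any existing tooling):

```shell
# Print the numeric UID (third passwd field) of user $1 in passwd-format file $2.
uid_of() {
    awk -F: -v user="$1" '$1 == user { print $3 }' "$2"
}

# Succeed only if user $1 exists in both files with the same UID.
same_uid() {
    local uid_a uid_b
    uid_a="$(uid_of "$1" "$2")"
    uid_b="$(uid_of "$1" "$3")"
    [ -n "$uid_a" ] && [ "$uid_a" = "$uid_b" ]
}

# Example (hypothetical copies of /etc/passwd from the old and new host):
# same_uid gerrit2 ./passwd.old-host ./passwd.new-host || echo "uid differs, chown needed"
```

With a reserved uid the check would pass on every host, which is the point of this task.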

Event Timeline

Change 928580 had a related patch set uploaded (by Hashar; author: Hashar):

[operations/puppet@production] admin: reserve gerrit uid/gid

https://gerrit.wikimedia.org/r/928580

Change 928580 merged by Jbond:

[operations/puppet@production] admin: reserve gerrit uid/gid

https://gerrit.wikimedia.org/r/928580

This will most likely wait until (if ever) we migrate to Bookworm.

LSobanski moved this task from Incoming to Backlog on the collaboration-services board.


We now have gerrit2003 which is on bookworm but is not in use yet.

I'll upload a change to do this but only "if on bookworm" without touching existing prod servers on bullseye.

Change #1082264 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] gerrit: use 'gerrit' instead of 'gerrit2' as system user on bookworm

https://gerrit.wikimedia.org/r/1082264

There is more to this since the whole gerrit site dir is based on the system user, as in:

profile::gerrit::gerrit_site: "/var/lib/gerrit2/review_site"
[gerrit2003:~] $ grep gerrit2 /etc/passwd
gerrit2:x:498:1001::/var/lib/gerrit2:/bin/bash

This task would be simplified quite a bit if we limited it to reserving the global UID/GID and dropped the part about renaming the user to just "gerrit". The rename is still doable, but I'm wondering whether that part is worth the effort.

also Bacula backup sets:

modules/profile/manifests/backup/filesets.pp:        includes => [ '/srv/gerrit', '/var/lib/gerrit2' ]

and tests: profile_gerrit_spec.rb, profile_gerrit_proxy_spec.rb, gerrit/spec/classes/gerrit_spec.rb contain the gerrit2 daemon user name.

Either way, this is definitely still a valid TODO:

# TODO convert to systemd::sysuser
user { $daemon_user:
    ensure     => present,

which then lets us reserve the UID/GID.

While doing that I found we already reserved UID/GID 925 for "gerrit" in https://gerrit.wikimedia.org/r/c/operations/puppet/+/928580

That change only made the reservation, though. So I'm now going to use the reserved ID with systemd::sysuser, but only on bookworm.
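For reference, pinning a system user to a fixed ID with sysusers.d comes down to a one-line fragment. The sketch below writes such a fragment into a directory of your choice. The GECOS text and the shell are assumptions, and it is not necessarily what Puppet's systemd::sysuser generates:

```shell
# Write a sysusers.d fragment for user $1 with fixed ID $2 and home $3 into directory $4.
write_sysusers_conf() {
    local name="$1" id="$2" home="$3" outdir="$4"
    # Fields: type, name, ID, GECOS, home directory, shell.
    # With a bare numeric ID on a "u" line, systemd-sysusers normally uses
    # the same number for the user and its same-named group.
    printf 'u %s %s "%s daemon" %s /bin/bash\n' "$name" "$id" "$name" "$home" \
        > "$outdir/$name.conf"
}

# Example (hypothetical scratch directory; the real location is /etc/sysusers.d):
# write_sysusers_conf gerrit 925 /var/lib/gerrit /tmp/sysusers.d
```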

Change #1082264 merged by Dzahn:

[operations/puppet@production] gerrit: use systemd::sysuser, reserved UID/GID, new name for daemon user

https://gerrit.wikimedia.org/r/1082264

On gerrit2003 (not in production yet):

- Notice: /Stage[main]/Ssh::Server/File[/etc/ssh/userkeys/gerrit2]/ensure: removed
..
Info: /Stage[main]/Gerrit/Systemd::Sysuser[gerrit]/File[/etc/sysusers.d/gerrit.conf]: Scheduling refresh of Exec[Refresh sysusers]
..
Notice: /Stage[main]/Gerrit/Systemd::Sysuser[gerrit]/Group[gerrit]/ensure: created
..
Notice: /Stage[main]/Gerrit/Systemd::Sysuser[gerrit]/User[gerrit]/password: changed [redacted] to [redacted]
..
Notice: /Stage[main]/Gerrit/File[/var/lib/gerrit]/ensure: created
..
Notice: /Stage[main]/Gerrit/File[/var/lib/gerrit/.ssh]/ensure: created
..
Notice: /Stage[main]/Gerrit/File[/var/lib/gerrit2/review_site]/owner: owner changed 'gerrit2' to 'gerrit'
..
Notice: /Stage[main]/Gerrit/File[/srv/gerrit]/owner: owner changed 'gerrit2' to 'gerrit'
..
--- /var/lib/gerrit2/review_site/etc/gerrit.config	2024-10-08 20:58:38.542802153 +0000
-    user = gerrit2
+    user = gerrit
..
/lib/systemd/system/gerrit.service

-User=gerrit2
-Group=gerrit2
+User=gerrit
+Group=gerrit
...

and many more lines like these, which I skipped here for readability.

[gerrit2003:~] $ id gerrit
uid=925(gerrit) gid=925(gerrit) groups=925(gerrit)

[gerrit2003:~] $ id gerrit2
uid=498(gerrit2) gid=1001(gerrit2) groups=1001(gerrit2)
[gerrit2003:/var/lib] $ file gerrit*
gerrit:        directory
gerrit2:       directory
gerrit-deploy: directory
root@gerrit2003:/# find / -xdev -uid 925 | wc -l
36

root@gerrit2003:/# find / -xdev -uid 498 | wc -l
97

The real totals are higher: -xdev excludes not only /proc but also /srv, since that is a separate filesystem too.
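To get a count that includes /srv, one option is to drop -xdev and prune the pseudo-filesystems by path instead. A rough sketch (the function name is made up, and the list of pruned paths is an assumption about what is worth skipping):

```shell
# Count files owned by numeric UID $1 below the given start directories.
count_owned_by_uid() {
    local uid="$1"
    shift
    # Prune pseudo-filesystems by path rather than using -xdev, so that real
    # mounts such as /srv are still searched.
    find "$@" \( -path /proc -o -path /sys -o -path /run \) -prune \
        -o -uid "$uid" -print 2>/dev/null | wc -l
}

# Example:
# count_owned_by_uid 498 /
```

File names containing newlines would be counted more than once, which is acceptable for a sanity check like this one.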

Regardless:

Fixed ownership, like we did in the past for cases like this:

All files owned by old user to new user: find / -uid 498 -exec chown gerrit {} \;

All files owned by old group to new group: find / -gid 1001 -exec chgrp gerrit {} \;
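A slightly more defensive variant of the two commands above, as a sketch only: it takes explicit start directories, stays on each one's filesystem (so /proc is never walked), does not follow symlinks, and batches paths into fewer chown/chgrp calls. The function name is hypothetical.

```shell
# Usage: reown_tree OLD_UID OLD_GID NEW_USER NEW_GROUP DIR...
reown_tree() {
    local old_uid="$1" old_gid="$2" new_user="$3" new_group="$4"
    shift 4
    # -xdev keeps find on the filesystem of each start directory, so every
    # filesystem to be fixed has to be listed explicitly.
    # -h changes a symlink itself instead of the file it points to.
    # "{} +" passes many paths to each chown/chgrp invocation.
    find "$@" -xdev -uid "$old_uid" -exec chown -h "$new_user" {} +
    find "$@" -xdev -gid "$old_gid" -exec chgrp -h "$new_group" {} +
}

# Example (as root):
# reown_tree 498 1001 gerrit gerrit / /srv
```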

Then I removed the gerrit2 user from /etc/passwd and the gerrit2 group from /etc/group, verified puppet doesn't add them back and rebooted the machine.

[gerrit2003:~] $ id gerrit2
id: ‘gerrit2’: no such user

[gerrit2003:~] $ id gerrit
uid=925(gerrit) gid=925(gerrit) groups=925(gerrit)

Some of the LFS data was being synced at just that time and was still owned by the old user, so I had to repeat the steps above.

Now looking good.
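Once gerrit2 is gone from /etc/passwd and /etc/group, anything still carrying the old numeric IDs no longer maps to a name, so stragglers like the late-synced LFS files can be found without remembering the numbers. A small sketch (function name is hypothetical):

```shell
# List files below the given directories whose UID or GID has no
# passwd/group entry any more.
orphaned_files() {
    find "$@" -xdev \( -nouser -o -nogroup \) -print 2>/dev/null
}

# Example (as root); empty output means nothing is left over:
# orphaned_files / /srv
```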

Finally I moved the contents of /var/lib/gerrit2 into /var/lib/gerrit.

mv /var/lib/gerrit2/review_site/ /var/lib/gerrit
/var/lib/gerrit2# mv .* /var/lib/gerrit/
diff /var/lib/gerrit/.ssh/ /var/lib/gerrit2/.ssh/
rm -rf /var/lib/gerrit2/.ssh/
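Depending on the shell and its settings, a bare .* glob can also match . and .., and mv complains about entries that already exist in the target (like .ssh here). A more careful way to move only the dotfiles could look like the sketch below. The function name is made up, and skipping existing entries is a design choice so they can be diffed by hand afterwards.

```shell
# Move every dot-entry from directory $1 into directory $2,
# leaving alone anything that already exists in the destination.
move_dotfiles() {
    local src="$1" dst="$2" entry name
    for entry in "$src"/.*; do
        name="${entry##*/}"
        case "$name" in
            .|..) continue ;;   # never touch the directory itself or its parent
        esac
        [ -e "$entry" ] || [ -L "$entry" ] || continue   # glob matched nothing
        if [ -e "$dst/$name" ]; then
            echo "skip (already exists): $dst/$name" >&2
        else
            mv -- "$entry" "$dst/"
        fi
    done
}

# Example:
# move_dotfiles /var/lib/gerrit2 /var/lib/gerrit
```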

This isn't all there is to it, though: puppet still creates plenty of things in the old path. WIP.

 du -hs /var/lib/gerrit2/review_site/*

4.0K	/var/lib/gerrit2/review_site/bin
100K	/var/lib/gerrit2/review_site/etc
4.0K	/var/lib/gerrit2/review_site/lib
0	/var/lib/gerrit2/review_site/logs
0	/var/lib/gerrit2/review_site/plugins
80K	/var/lib/gerrit2/review_site/static
4.0K	/var/lib/gerrit2/review_site/tmp

du -hs /var/lib/gerrit/review_site/*

32K	/var/lib/gerrit/review_site/bin
92K	/var/lib/gerrit/review_site/cache
4.0K	/var/lib/gerrit/review_site/data
276K	/var/lib/gerrit/review_site/etc
76K	/var/lib/gerrit/review_site/index
4.0K	/var/lib/gerrit/review_site/lib
0	/var/lib/gerrit/review_site/logs
0	/var/lib/gerrit/review_site/plugins
80K	/var/lib/gerrit/review_site/static
4.0K	/var/lib/gerrit/review_site/tmp
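The du output only shows sizes. To see which entries exist under the old site dir but not under the new one, the two listings can be compared directly. A sketch, assuming both trees are readable by the caller (the function name is hypothetical):

```shell
# Print paths (relative to the tree root) that exist under $1 but not under $2.
# The body runs in a subshell, so its variables do not leak into the caller.
missing_in_new() (
    old="$1"; new="$2"
    tmp="$(mktemp -d)" || exit 1
    (cd "$old" && find . | sort) > "$tmp/old"
    (cd "$new" && find . | sort) > "$tmp/new"
    # comm -23 keeps only the lines unique to the first sorted list.
    comm -23 "$tmp/old" "$tmp/new"
    rm -rf "$tmp"
)

# Example:
# missing_in_new /var/lib/gerrit2/review_site /var/lib/gerrit/review_site
```

Empty output means everything puppet put in the old path also exists in the new one by name. It says nothing about whether the contents match.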

Mentioned in SAL (#wikimedia-operations) [2024-10-24T00:44:28Z] <dzahn@cumin2002> START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on gerrit2003.wikimedia.org with reason: in setup and T338470

Mentioned in SAL (#wikimedia-operations) [2024-10-24T00:44:43Z] <dzahn@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on gerrit2003.wikimedia.org with reason: in setup and T338470

The parameters of the rsync command used for LFS syncing also need to be fixed:

Notice: /Stage[main]/Gerrit/File[/srv/gerrit/data/lfs]/owner: owner changed 498 to 'gerrit' (corrective)
Notice: /Stage[main]/Gerrit/File[/srv/gerrit/data/lfs]/group: group changed 1001 to 'gerrit' (corrective)
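The corrective changes above suggest the synced LFS data arrives with the old numeric owner and group. rsync can set ownership on the receiving side with --chown, which takes effect when owner/group preservation is on (implied by --archive) and the receiver runs with sufficient privileges. The sketch below only prints a command line; the source host and the exact option set are assumptions, not the parameters Puppet actually uses.

```shell
# Print (not run) an rsync command that forces owner/group on the receiver.
# Usage: lfs_rsync_cmd SRC DST USER GROUP
lfs_rsync_cmd() {
    local src="$1" dst="$2" user="$3" group="$4"
    # --archive implies -o and -g, which --chown needs in order to apply.
    printf 'rsync --archive --chown=%s:%s %s %s\n' "$user" "$group" "$src" "$dst"
}

# Example (hypothetical source host):
# lfs_rsync_cmd old-gerrit.example:/srv/gerrit/data/lfs/ /srv/gerrit/data/lfs/ gerrit gerrit
```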

Change #1087950 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] backup: add /var/lib/gerrit to gerrit repodata backup filesets

https://gerrit.wikimedia.org/r/1087950

Change #1087950 merged by Dzahn:

[operations/puppet@production] backup: add /var/lib/gerrit to gerrit repodata backup filesets

https://gerrit.wikimedia.org/r/1087950

Change #1087963 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] devtools: update gerrit user from gerrit2 to gerrit

https://gerrit.wikimedia.org/r/1087963

Change #1087967 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] gerrit: add chown parameter to lfs data rsync, ensure daemon_user is used

https://gerrit.wikimedia.org/r/1087967

Change #1087963 merged by Dzahn:

[operations/puppet@production] devtools: update gerrit user from gerrit2 to gerrit

https://gerrit.wikimedia.org/r/1087963

Change #1087967 merged by Dzahn:

[operations/puppet@production] gerrit: add chown parameter to lfs data rsync, ensure daemon_user is used

https://gerrit.wikimedia.org/r/1087967

Change #1088613 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] gerrit: set gerrit site dir Hiera value for new machine gerrit2003

https://gerrit.wikimedia.org/r/1088613

Change #1088613 merged by Dzahn:

[operations/puppet@production] gerrit: set gerrit site dir Hiera value for new machine gerrit2003

https://gerrit.wikimedia.org/r/1088613

Dzahn changed the task status from Open to Stalled. Wed, Jan 29, 8:47 PM

This is fixed for NEW gerrit machines but we are not changing it for the existing gerrit server. So now this is about getting gerrit2003 (T372804) into production and shutting down the old hardware.