I'll start with a bit of general administrivia. First, our migration of Wikimedia code review & CI to GitLab continues, and we're mindful that people could use regular updates on progress. Second, I need to think through some stuff about the project, and doing that in writing is helpful for all involved. I'm going to try writing occasional blog entries here for both purposes.
Now on to the main topic of this post: Access control for groups and projects on the Wikimedia GitLab instance.
The tl;dr: We've been modeling access to things on GitLab by using groups under /people to contain individual users and then granting those groups access to things under /repos. This has been tricky to explain and doesn't work as well at a technical level as we'd hoped, so we're mostly scrapping the distinction and moving control of project access to individual memberships in groups under /repos. This should be easier to think about and simpler to manage, and it seems likely to suit our needs better. Read on for the nitty-gritty detail.
(Thanks to @Dzahn, @Majavah, @bd808, @AntiCompositeNumber, and @thcipriani for helping me think through the issues underlying this post.)
During the GitLab consultation, when we were working on building up a model of how we'd use GitLab for Wikimedia projects, we wrote up a draft policy for managing users and their access to projects.
GitLab supports Groups. GitLab groups are similar to GitHub's concept of organizations, although the specifics differ. Groups can contain:
- Other, nested groups
- Individual projects (repositories & metadata)
- Users as members
  - Members of other groups can be invited to a group
  - A user who is a member of a top-level group is also a member of every group it contains
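The membership rule for nested groups can be pictured with a small model. This is only a sketch of the rule as described here, not of GitLab's internals: the `Group` class, its method names, and the usernames `alice` and `bob` are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Group:
    """Toy model of a group: a path, an optional parent, and direct members."""
    path: str
    parent: Optional["Group"] = None
    direct_members: set = field(default_factory=set)

    def subgroup(self, name: str) -> "Group":
        # A nested group lives under its parent's path.
        return Group(path=f"{self.path}/{name}", parent=self)

    def effective_members(self) -> set:
        # Direct members, plus everyone who is a member of any ancestor group.
        members = set(self.direct_members)
        if self.parent is not None:
            members |= self.parent.effective_members()
        return members


# Hypothetical layout and usernames.
repos = Group("repos")
cloud = repos.subgroup("cloud")
repos.direct_members.add("alice")  # member of the top-level group
cloud.direct_members.add("bob")    # member of the nested group only
```

In this model, `cloud.effective_members()` includes both users, while `repos.effective_members()` includes only `alice`: membership flows down into nested groups, never up.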
We've since changed the original draft policy in some small ways - in particular, we decided to move most projects into a top-level /repos group in order to offer shared CI runners (see T292094). You can read the policy we landed on at the latest revision of GitLab/Policy on mediawiki.org.
The basic idea was that we would separate groups out into:
- Sub-groups of /repos: Namespaces for projects, split up by functional area of code
- Sub-groups of /people: Namespaces for individual users, split up by organizational units like:
  - Volunteer group
  - Teams at organizations such as the WMF, WMDE, etc.
Groups in /people could then be given access to projects under /repos.
Our hope was that this would let us decouple the management of groups of humans from the individual projects they work on, and ease onboarding for new contributors. A new member of the WMF Release Engineering team, for example, could be added to a single group and then have access to all the things they need to do their job.
We intended for most /people groups to be owned by their members, who would in turn have ownership-level access to their projects under /repos, allowing contributors to a project to manage access and invite new contributors.
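To make the intent of that scheme concrete, here is a minimal sketch of it as plain data. It models what we hoped would happen, not how GitLab actually resolves permissions, and every group path and username in it is hypothetical.

```python
from collections import defaultdict

# /people group path -> usernames in that group
people = defaultdict(set)
# /repos group path -> /people groups that have been granted access to it
grants = defaultdict(set)


def users_with_access(repos_group: str) -> set:
    """Everyone who reaches a /repos group through some /people group."""
    users = set()
    for people_group in grants[repos_group]:
        users |= people[people_group]
    return users


# Hypothetical team with access to two functional areas.
people["people/example-team"].add("alice")
grants["repos/example-area"].add("people/example-team")
grants["repos/another-area"].add("people/example-team")

# Onboarding a new team member is a single change in one place...
people["people/example-team"].add("carol")
# ...and users_with_access() now includes "carol" for both areas.
```

The appeal is that humans and projects are managed separately: nobody has to touch `grants` when a team changes.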
As a concrete example:
- https://gitlab.wikimedia.org/repos/cloud contains various cloud services projects
- The Wikimedia Foundation Cloud Services team and volunteer cloud administrators are modeled by membership in:
Problems with this scheme
I've been proceeding under this plan as people request the creation of GitLab project groups, but there turn out to be some problems.
First, it doesn't seem like permission inheritance for nested groups with other groups as members works the way you'd expect & hope. See T300939 - "GitLab group permissions are not inherited by sub-groups for groups of users invited to the parent repo".
Second, users have concerns about equity of access, and about tightly coupling project access to things like employment with a specific organization. We didn't have any intention of modeling any group of users as second-class citizens within this scheme, but it seems to create the impression of one all the same. It's also striking that the set of projects people work on just isn't that cleanly mapped to any particular organizational structure. Once you've been a technical contributor for a while, you've almost certainly collected responsibilities that no org chart reflects accurately.
Finally, and maybe most importantly, this is a complicated way to do things. People have a hard time thinking about it, and it requires a lot of explanation. That seems bad for an abstraction that we'd like to be basically self-serve for most users.
Mostly, my plan is to use groups closer to how they seem to be designed:
- Sub-groups of /repos will contain both individual contributor memberships and projects
- Except in occasional one-off cases, access should be granted at the level of a containing group rather than at the level of individual projects, so as to avoid micromanaging access to many projects.
- We'll keep /people in mind as a potential solution for some problems (for example, it might be a good tool for synchronizing groups of users from LDAP and granting access to certain projects on that basis), but not rely on it for anything at the moment.
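In practice, granting access under this plan means adding an individual user to a group under /repos. That is normally done in the GitLab web interface, but a rough sketch against GitLab's REST API (the `POST /groups/:id/members` endpoint) shows what the operation amounts to. The function name, user ID, token, and choice of the Developer access level are all placeholders for illustration, and the request is only built here, never sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# GitLab's numeric access levels for group and project members.
ACCESS_LEVELS = {
    "guest": 10,
    "reporter": 20,
    "developer": 30,
    "maintainer": 40,
    "owner": 50,
}


def build_add_member_request(base_url, group_path, user_id, level, token):
    """Build (but do not send) a request adding a user to a group."""
    # The API accepts a URL-encoded group path in place of a numeric ID.
    group_id = quote(group_path, safe="")
    body = urlencode({"user_id": user_id, "access_level": ACCESS_LEVELS[level]})
    return Request(
        f"{base_url}/api/v4/groups/{group_id}/members",
        data=body.encode("ascii"),
        headers={"PRIVATE-TOKEN": token},
        method="POST",
    )


# Hypothetical user ID and token; repos/cloud is the example group from above.
req = build_add_member_request(
    "https://gitlab.wikimedia.org", "repos/cloud", 42, "developer", "<your-token>"
)
```

Because the membership is on the group rather than on individual projects, one call like this covers every project the group contains.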
There are some unanswered questions here, but I plan to redraft the policy doc, move existing project layouts to this scheme, and start creating new project groups on this basis in the coming week or so.
My main philosophical takeaway here is that I work with a bunch of anarchists, and it's always best to plan accordingly.
Originally, one of our goals for this migration was avoiding a repeat of the weird, nested morass that is our current set of Gerrit permissions. While it would be a good idea to keep the structure of things on GitLab flatter and easier to think about, I'm no longer that worried about it. Some of the complexity is inherent to any large set of projects and contributors; some of it just reflects a long-lived technical culture that's emergent and largely self-governing, tendencies that nearly always resist well-intentioned efforts to rationalize and map structure to things like official organizational layout.