Figure out submodule updating in GitLab
Open, Needs Triage, Public

Description

Gerrit can update submodules in a containing project when the submodule's project changes. This is used, for example, by our workflow for deploying backports to MediaWiki release branches. It's also been mentioned as potentially useful for frontend tasks.

Does GitLab support anything similar, or will it need to be implemented as a job?

It seems to be at least supported as an action in the API.
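As a rough sketch of what a job-based approach could look like, the snippet below builds the request for GitLab's "update existing submodule reference" REST endpoint (`PUT /projects/:id/repository/submodules/:submodule`). The function name, the host `gitlab.example.org`, the project and submodule paths, and the token handling are all assumptions for illustration; nothing here has been tested against our setup, and the request is only constructed, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_submodule_update_request(base_url, project, submodule_path,
                                   branch, commit_sha, token,
                                   commit_message=None):
    """Build (but do not send) a PUT request that points a submodule
    in `project` at `commit_sha` on `branch`.

    `project` may be a numeric ID or a namespaced path; both the project
    path and the submodule path are URL-encoded, as the API expects.
    """
    project_id = urllib.parse.quote(str(project), safe="")
    submodule = urllib.parse.quote(submodule_path, safe="")
    url = (f"{base_url.rstrip('/')}/api/v4/projects/{project_id}"
           f"/repository/submodules/{submodule}")

    payload = {"branch": branch, "commit_sha": commit_sha}
    if commit_message is not None:
        payload["commit_message"] = commit_message

    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={
            "PRIVATE-TOKEN": token,  # hypothetical token with API write access
            "Content-Type": "application/json",
        },
    )


# Hypothetical usage: bump the VisualEditor submodule in mediawiki/extensions.
req = build_submodule_update_request(
    "https://gitlab.example.org",
    "mediawiki/extensions",
    "VisualEditor",
    branch="master",
    commit_sha="0123456789abcdef0123456789abcdef01234567",
    token="REDACTED",
)
```

A post-merge job in the submodule's project could send this with `urllib.request.urlopen(req)`. Whether that ends up more predictable than Gerrit's built-in behaviour is exactly the open question here.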

Some relevant docs:

Event Timeline

The mediawiki/extensions and mediawiki/skins repositories with auto-updating submodules are great in the absence of a real monorepo. And submodules in mediawiki/core work great for release branches too.

But honestly, Gerrit's auto-updating has been magical and hard to debug or understand when it doesn't work (e.g. T259832), so I hope whatever replaces it is more predictable and easier to debug. I remember that back when Gerrit couldn't bump mediawiki/extensions/VisualEditor because a VisualEditor/VisualEditor repo also existed, we had Jenkins do it as a post-merge job, so there's some precedent for that.

A real monorepo would be nice for this use-case and would dramatically simplify many other uses: Cut a new train? Make a new branch, delete non-prod extensions. New tarball release? Delete non-bundled extensions, zip it up, and ship it. Code search? → git grep.

A monorepo makes the code review ACL harder but a great many things more manageable. I'm not sure if I have the appetite for this can of worms, but: why did we go with the current model vs. a monorepo?

We consciously chose against a mono-repo when moving into the world of git; the original code in CVS was a production mono-repo, and when we were on SVN we effectively had a mono-repo, though a lot of work was done to move code (e.g. the skins, or the Math or Cite extensions) into individual folders as logical (but not actual) repos. When 'we' (Chad/hashar/Roan) moved from SVN to git, we intentionally split out each repo individually for space/effectiveness reasons (devs shouldn't have to download multiple GiB of history to work on an extension, etc.).

We've since become a lot more strict about ACLs, not less, so any changes would have to be very carefully thought through, of course.

In addition to what James said, AIUI Git's support for very large monorepos was...not great back then; it's only recently that Microsoft/Facebook/Google have been pushing on it. But even then it still requires a significant amount of disk space, bandwidth, etc. if you just want to hack on one extension. Given that we have contributors who e.g. use Raspberry Pis, I think our split makes sense. We've optimized for new/casual contributors, but for "power" users who want the convenience of a monorepo, we have the giant submodule repos. As much as I'd personally love a real monorepo, I don't think it's in our project/community's long-term interests. The submodule repos aren't perfect, since you can't just make one big commit and fix the world, but we could probably develop tooling to bridge the gap (it's on my list of free time things so we stop abusing LibUp for this...).