
Create a global Maven package registry in Gitlab
Open, High, Public

Description

We want to migrate away from Archiva and replace it with Gitlab as a Maven package registry. See parent task (T367315) and doc for more details.

We expect to host ~50G of packages over the next 5 years. We will not migrate the historical releases from Archiva to Gitlab.

My understanding is that we can create a global package registry in Gitlab. If that's not the case, we could create one that is part of the ci-tools group, or maybe of a dedicated group. We do NOT want a registry per repo, and we would like to avoid having multiple registries linked to multiple groups. Having a single registry allows us to keep that configuration centralized and transparent to all projects that need to use those dependencies.
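To make the options concrete, here is a minimal sketch of the three Maven endpoint shapes described in the GitLab package registry documentation (project, group, instance). The host `gitlab.example.org` and the numeric IDs are placeholders, not our real values. As far as I understand the documentation, publishing goes through a project-level endpoint, while the group-level and instance-level endpoints are meant for resolving dependencies, so a "single registry" would probably mean one dedicated project that everything publishes to.

```python
from enum import Enum
from typing import Optional


class Level(Enum):
    PROJECT = "project"
    GROUP = "group"
    INSTANCE = "instance"


def maven_endpoint(host: str, level: Level, entity_id: Optional[int] = None) -> str:
    """Build the Maven registry URL for a given endpoint level.

    `host` and `entity_id` are placeholders supplied by the caller;
    nothing here is specific to any real GitLab instance.
    """
    base = f"https://{host}/api/v4"
    if level is Level.INSTANCE:
        # The instance-level endpoint takes no project or group ID.
        return f"{base}/packages/maven"
    if entity_id is None:
        raise ValueError(f"{level.value}-level endpoint needs an ID")
    if level is Level.PROJECT:
        return f"{base}/projects/{entity_id}/packages/maven"
    # Group-level endpoints carry an extra "-" path segment.
    return f"{base}/groups/{entity_id}/-/packages/maven"


if __name__ == "__main__":
    # Hypothetical host and IDs, for illustration only.
    for level, entity_id in [(Level.PROJECT, 42), (Level.GROUP, 7), (Level.INSTANCE, None)]:
        print(level.value, maven_endpoint("gitlab.example.org", level, entity_id))
```

Consumers would then point their `<repository>` URL at whichever level we settle on, and publishers at the dedicated project.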

Eventually, we want CI to upload artifacts to the registry, so most users should not need write access. During the implementation phase, to allow for experimentation, we want a few users to be able to upload packages manually for testing.
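A rough sketch of what the two upload paths could look like on the Maven side: CI authenticates with a job token, and the few manual testers with a personal access token. The script below generates a `settings.xml` with the HTTP header that the GitLab documentation describes for each token type. The server id `gitlab-maven` is a made-up name (it only has to match the `<id>` used in the POM), and the value expressions are assumptions about how the token would be passed in.

```python
import xml.etree.ElementTree as ET

# HTTP header names GitLab documents for Maven authentication, by token type.
TOKEN_HEADERS = {
    "job": "Job-Token",           # CI job token, available inside a pipeline
    "personal": "Private-Token",  # personal access token, for manual uploads
    "deploy": "Deploy-Token",     # deploy token
}


def build_settings_xml(server_id: str, token_kind: str, value_expr: str) -> str:
    """Return a minimal Maven settings.xml that sends the token as an HTTP header.

    `value_expr` is written verbatim into the file, so it can be a Maven
    property expression rather than the secret itself.
    """
    header = TOKEN_HEADERS[token_kind]
    settings = ET.Element("settings")
    servers = ET.SubElement(settings, "servers")
    server = ET.SubElement(servers, "server")
    ET.SubElement(server, "id").text = server_id
    config = ET.SubElement(server, "configuration")
    http_headers = ET.SubElement(config, "httpHeaders")
    prop = ET.SubElement(http_headers, "property")
    ET.SubElement(prop, "name").text = header
    ET.SubElement(prop, "value").text = value_expr
    return ET.tostring(settings, encoding="unicode")


if __name__ == "__main__":
    # CI variant: the token is resolved by Maven at build time, not stored here.
    print(build_settings_xml("gitlab-maven", "job", "${CI_JOB_TOKEN}"))
```

Keeping the token out of the generated file means the same `settings.xml` can be committed to a shared CI template, which fits the goal of most users never needing write access themselves.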

The registry should be readable anonymously by anyone.
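One way to verify anonymous readability once the registry exists is to fetch a known artifact with no credentials at all. The sketch below builds the artifact path from its coordinates using the standard Maven repository layout and issues an unauthenticated GET. The coordinates and the registry URL are hypothetical, and the check assumes the registry answers anonymous requests for non-public packages with an HTTP error status.

```python
import urllib.error
import urllib.request


def artifact_path(group_id: str, artifact_id: str, version: str, extension: str = "jar") -> str:
    """Map Maven coordinates to the standard repository layout path."""
    group_dir = group_id.replace(".", "/")
    return f"{group_dir}/{artifact_id}/{version}/{artifact_id}-{version}.{extension}"


def is_anonymously_readable(registry_url: str, path: str, timeout: float = 10.0) -> bool:
    """Return True if the artifact can be fetched without any credentials.

    HTTP errors (401, 403, 404, ...) count as "not readable"; network
    failures are deliberately left to propagate to the caller.
    """
    url = f"{registry_url.rstrip('/')}/{path}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        return False


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    print(artifact_path("org.example", "demo-lib", "1.0.0"))
```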

AC:

  • Gitlab package registry is created
  • Registry is writable by members of the DPE SRE team
  • Registry is publicly readable

Event Timeline

I like this idea.

Because a repo can host different kinds of packages (npm, pip, maven, etc.), perhaps this global registry repo could be used for more than just maven/java?

I don't think there would be anything different to do for the archiva replacement other than do a little name bikeshedding. Actually using the repo for other things would be in different tasks, e.g. T366612: Publish Data Engineering maintained NodeJS packages to GitLab and use them in depender code.

Because a repo can host different kinds of packages (npm, pip, maven, etc.), perhaps this global registry repo could be used for more than just maven/java?

We should probably try to isolate different types of packages. If I understand correctly, Gitlab has native support for a number of package repositories (npm, python, ...).

We should probably try to isolate different types of packages

Possibly! Why do you think so?

IIUC, this repo would mostly be just a place to publish packages. Archiva was also previously used to host python pip packages.

I think each repo can have multiple kinds of package registries. I think (could be wrong here) the registries themselves would still be 'separate', they'd just be contained in one gitlab repo/project.
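For what it's worth, the GitLab API documentation exposes each package format under its own path within a project, which is consistent with the registries being separate even when they live in one project. A small sketch, with a placeholder host and project ID:

```python
# Path suffix for each package format under a project, per the GitLab API docs.
FORMAT_SUFFIXES = {
    "maven": "packages/maven",
    "npm": "packages/npm",
    "pypi": "packages/pypi",
}


def project_registry_urls(host: str, project_id: int) -> dict:
    """Return one registry URL per package format for a single project."""
    base = f"https://{host}/api/v4/projects/{project_id}"
    return {fmt: f"{base}/{suffix}" for fmt, suffix in FORMAT_SUFFIXES.items()}


if __name__ == "__main__":
    # Hypothetical host and project ID, for illustration only.
    for fmt, url in project_registry_urls("gitlab.example.org", 42).items():
        print(f"{fmt}: {url}")
```

So a Maven client and a pip client would never talk to the same URL, even if both kinds of package are published from, and into, the same project.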

Yes, if we keep different kinds of artifacts separated in different package registries, I have no objection. At the moment, I'd like to focus on the Maven part and hope that we don't abuse it as much as we've abused Archiva!

A potential use case we have:

We'd like to host eventutilities-python and wikimedia-eventutilities (Java) in the same repo in gitlab, and publish both maven and pip packages from it. These could be configured to publish to different global GitLab repos, but centralizing the instructions for how to publish and depend on Wikimedia published packages sounds nice.

At the moment, I'd like to focus on the Maven part and hope that we don't abuse it as much as we've abused Archiva!

Agree. I think the only work-to-do implications of my suggestion are:

  • Investigate the feasibility/desirability of using one global gitlab repo for many types of packages
  • If we want it, then do a little repo name bikeshedding.
Gehel triaged this task as High priority. Thu, Jun 13, 1:16 PM