We run a custom git-fat package, which is part of the standard packages and deployed on every host. It's written in Python 2, which is EOL. Python 3 support isn't complete upstream (https://github.com/jedbrown/git-fat/issues/92), so we could also collaborate with upstream if we want to keep using git-fat.
- Mentioned In
  - T313250: Bring up gerrit2002
  - T243027: replacement for gerrit2001, decom gerrit2001
- Mentioned Here
  - T243027: replacement for gerrit2001, decom gerrit2001
  - T313250: Bring up gerrit2002
  - T235013: Use `git lfs` for large binary files of Design Style Guide
  - T147856: Scap deploy failed to sync git-fat artifacts
  - T155856: Package + deploy new version of git-fat
  - T202100: Intermittent git-fat failure during deploy
  - T214229: scap3 + git-fat results in git status with permissions errors
I'm not familiar in detail with the current use cases of git-fat, but moving to a different, supported tool is probably a better path forward than porting git-fat ourselves. Both git-lfs and git-annex seem like viable alternatives to explore (both are already packaged in Debian).
I can't say for sure, especially since it's part of the base packages and so could be used anywhere, but the only explicit usage is archiva, and I hope we can find a way to avoid using that. git-lfs seems to be the industry standard these days.
There is support in scap3 for git-lfs, but it's not used (as far as I'm aware) or well-tested. It *might* already work.
I honestly hadn't touched archiva either. There's a shell script (originally written by @Ottomata judging from git-blame) that moves java jars to the place git-fat expects to find them. Maybe we can just ditch that script and deploy directly from Gerrit (given we have the git-lfs extension for gerrit installed and gitlab has git-lfs support as well).
The last time we talked about git-lfs in detail, as far as I can recall, was in T235013: Use `git lfs` for large binary files of Design Style Guide
> deploy directly from Gerrit

...say more :)
The jar binaries are built by maven-release-plugin in a jenkins job and then uploaded to Archiva using the Archiva API. They are then synced into a git fat repo. Deploy repos then git fat add them, and scap can rsync them (via git fat) to their target hosts on deploy.
archiva-gitfat-link just scans the archiva repository directory for artifact files, and then makes symlinks to them in a git-fat folder named by their shasum, as git fat expects. I'm not familiar with how git-lfs works, but perhaps it can be made to work the same way? Is it an rsync remote?
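To make the symlink step concrete, here is a minimal Python sketch of what a script like archiva-gitfat-link does, as described above. It is not the actual script: the function names, the `.jar` filter, and the choice of SHA-1 as the "shasum" are assumptions for illustration.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def link_artifacts(archiva_dir, gitfat_dir, suffixes=(".jar",)):
    """Symlink every artifact under archiva_dir into gitfat_dir.

    Each link is named after the checksum of the artifact's content
    (assumed here to be SHA-1). Returns the list of links created.
    """
    os.makedirs(gitfat_dir, exist_ok=True)
    created = []
    for root, _dirs, files in os.walk(archiva_dir):
        for name in sorted(files):
            if not name.endswith(suffixes):
                continue
            source = os.path.abspath(os.path.join(root, name))
            link = os.path.join(gitfat_dir, sha1_of_file(source))
            # Skip artifacts that were already linked on a previous run.
            if not os.path.lexists(link):
                os.symlink(source, link)
                created.append(link)
    return created
```

Because the link name depends only on file content, re-running the scan is idempotent: already linked artifacts are skipped.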
So rather than build jar files and upload to archiva, we'd build jar files and add .jar to .gitattributes to be managed via git-lfs, then those jars would get stored on the gerrit host. On deployment (or fetch), each target would fetch the jar via a GET request to gerrit (is my rough mental model).
- Changes to the maven-release-plugin CI job: Maven supports uploading to archiva, but probably not to lfs (plus we probably don't want repo push creds in CI?)
- Gerrit has a lot of disk space, but how much disk space do we use in archiva?
- How many hosts are deployed in parallel and what kind of load will that put on gerrit?
- Are targets allowed to make outbound connections to gerrit?
- Unknowns around protocol/network traffic changes. Not expecting issues, really, but it's a change.
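For reference on the git-lfs model described above: with git-lfs, the repository stores a small text pointer in place of each tracked jar, and the binary itself lives on the LFS server. The sketch below builds such a pointer in Python. The helper name `lfs_pointer` is hypothetical; the `.gitattributes` line is what `git lfs track "*.jar"` writes, and the pointer layout follows the git-lfs v1 pointer format.

```python
import hashlib

# Line that `git lfs track "*.jar"` adds to .gitattributes.
GITATTRIBUTES_LINE = "*.jar filter=lfs diff=lfs merge=lfs -text"

LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"


def lfs_pointer(content: bytes) -> str:
    """Return the pointer text that git stores instead of the binary.

    The object id is the SHA-256 of the file content; size is in bytes.
    """
    oid = hashlib.sha256(content).hexdigest()
    return f"version {LFS_SPEC_URL}\noid sha256:{oid}\nsize {len(content)}\n"


if __name__ == "__main__":
    print(GITATTRIBUTES_LINE)
    print(lfs_pointer(b"example jar bytes"), end="")
```

On checkout, the git-lfs client reads the `oid` from the pointer and requests that object from the server over HTTP(S), which is where the questions above about load on gerrit and outbound connections from targets come in.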
Can git-lfs, via the .gitattributes entry for .jar files, be made to download from the Archiva API directly? E.g. this URL: http://archiva.wikimedia.org/repository/releases/org/wikimedia/analytics/refinery/job/refinery-job/0.1.26/refinery-job-0.1.26-shaded.jar (copied from https://archiva.wikimedia.org/#artifact-details-download-content/org.wikimedia.analytics.refinery.job/refinery-job/0.1.26)
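The example URL follows the standard Maven repository layout, so it can be derived from the artifact's coordinates. A minimal Python sketch (the function name `maven_artifact_url` is hypothetical, and this does not settle whether git-lfs can be pointed at such URLs):

```python
def maven_artifact_url(base_url, group_id, artifact_id, version,
                       classifier=None, extension="jar"):
    """Build a download URL using the standard Maven repository layout."""
    # Dots in the groupId become path separators.
    group_path = group_id.replace(".", "/")
    filename = f"{artifact_id}-{version}"
    if classifier:
        filename += f"-{classifier}"
    filename += f".{extension}"
    return "/".join(
        [base_url.rstrip("/"), group_path, artifact_id, version, filename]
    )


if __name__ == "__main__":
    print(maven_artifact_url(
        "http://archiva.wikimedia.org/repository/releases",
        "org.wikimedia.analytics.refinery.job",
        "refinery-job",
        "0.1.26",
        classifier="shaded",
    ))
```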