
Cache submodules and use --reference to save space
ClosedPublic

Authored by mmodell on Oct 18 2017, 10:33 PM.
Subscribers
None

Details

Maniphest Tasks
T137124: Scap3 submodule space issues
Reviewers
demon
thcipriani
hashar
dduvall
Group Reviewers
Release-Engineering-Team
Commits
rMSCAccea24641f77: Cache submodules and use --reference to save space
Patch without arc
git checkout -b D826 && curl -L https://phabricator.wikimedia.org/D826?download=true | git apply
Summary

Requires git 2.11, which we should have everywhere.

The new behavior is to cache the submodules in deploy-cache/cache/modules/. Then,
when cloning to revs/$rev/, we use --recurse-submodules and --reference ../cache/,
and git does the magic to make the clone's submodules reuse the cached objects.
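The clone step described above could be sketched roughly as follows. This is a minimal illustration, not the code in this diff: the helper name `build_clone_command` is hypothetical, and it assumes the rev directory is cloned from the cache directory itself. Only the `--recurse-submodules`, `--reference`, and `--jobs` options of `git clone` are relied on.

```python
def build_clone_command(cache_dir, rev_dir, jobs=None):
    """Build a `git clone` argv that borrows objects from the cache.

    Hypothetical helper for illustration only. Assumes `cache_dir` is both
    the clone source and the --reference repository.
    """
    cmd = [
        "git", "clone",
        "--recurse-submodules",     # also clone the submodules
        "--reference", cache_dir,   # reuse objects already stored in the cache
    ]
    if jobs is not None:
        # Fetch several submodules in parallel.
        cmd += ["--jobs", str(jobs)]
    cmd += [cache_dir, rev_dir]
    return cmd


# Example (not executed here):
#   subprocess.check_call(build_clone_command("../cache", "revs/abc123"))
```

Building the argument list separately from running it keeps the flag handling easy to unit test without touching a real repository.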

Disk usage, using rPHDEP as an example.

Cache modules

$ du -hs cache/.git/modules/
121M    cache/.git/modules/

Checkout in revs/

$ du -hs revs/test/.git/modules
2.6M    revs/test/.git/modules
Test Plan

Currently untested. I'd like to merge this and test in beta.

Diff Detail

Repository
rMSCA Scap
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Harbormaster completed remote builds in Restricted Buildable. Oct 18 2017, 10:35 PM
scap/deploy.py
276

Update submodules in the cache instead of the rev

304–306

This will take care of referencing the objects from cache and the rev/.git/modules will be tiny!
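For context, `--reference` works through git's alternates mechanism: the borrowing repository records the path of the other object store in `objects/info/alternates`. One way to check that a rev's submodule really points back at the cache is to read that file. The sketch below is only an illustration; the helper name `read_alternates` and the example path in the comment are hypothetical.

```python
import os


def read_alternates(git_dir):
    """Return the object-store paths listed in a git dir's alternates file.

    Hypothetical helper for illustration. Returns an empty list when the
    repository does not borrow objects from anywhere.
    """
    path = os.path.join(git_dir, "objects", "info", "alternates")
    if not os.path.exists(path):
        return []
    with open(path) as handle:
        lines = [line.strip() for line in handle]
    # Skip blank lines and comment lines.
    return [line for line in lines if line and not line.startswith("#")]


# Example (path is an assumption about the layout):
#   read_alternates("revs/test/.git/modules/some-submodule")
```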

mmodell retitled this revision from WIP: cache submodules and use --reference to save space to Cache submodules and use --reference to save space. Oct 19 2017, 6:10 PM
mmodell edited the test plan for this revision.

Probably fine, at least for testing in beta. Nitpick about performance inline.

scap/git.py
324

For repos with a sufficiently high number of submodules, we'd benefit from using --jobs.

We do a lot of this "find a sane number of processors to fork to" logic; we should probably have a function in utils for that.
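A shared helper along these lines might look like the sketch below. It is not the `cpus_for_jobs` function from D828, whose implementation is not shown on this page; the name `pick_job_count`, the `reserve` and `cap` parameters, and their defaults are all assumptions made for illustration.

```python
import multiprocessing


def pick_job_count(cpu_count=None, reserve=1, cap=8):
    """Pick a conservative number of parallel jobs.

    Hypothetical helper: leaves `reserve` CPUs free for other work, never
    returns less than 1, and never returns more than `cap`.
    """
    if cpu_count is None:
        try:
            cpu_count = multiprocessing.cpu_count()
        except NotImplementedError:
            # Some platforms cannot report a CPU count; fall back to serial.
            cpu_count = 1
    return max(1, min(cap, cpu_count - reserve))
```

The result could then be passed to `git clone --jobs` or `git submodule update --jobs`. The cap keeps a large host from opening an excessive number of simultaneous fetches.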

This revision is now accepted and ready to land. Oct 19 2017, 6:15 PM

Use the cpus_for_jobs function from D828

Harbormaster completed remote builds in Restricted Buildable. Oct 19 2017, 10:55 PM

One more call to cpus_for_jobs

Harbormaster completed remote builds in Restricted Buildable. Oct 19 2017, 10:57 PM
This revision was automatically updated to reflect the committed changes.