For CI instances managed by Nodepool, we would like to provide a mirror of the most popular Gerrit git repositories directly in the image.
This way the Jenkins jobs can do the initial git clone from the local filesystem, which uses hard links for the git objects instead of transferring them over the network. As an example, cloning mediawiki/core currently takes several minutes.
When creating the reference disk image used by Nodepool, we will clone the most popular / heaviest repositories as bare repositories (to avoid the extra disk usage of a checked-out working tree).
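A minimal sketch of the two steps described above, not the actual image build script. The MIRROR_DIR location (/srv/git) and the function names mirror_repo and clone_from_mirror are hypothetical, chosen only for illustration. It relies on the documented git behaviour that cloning from a local path hard links the files under .git/objects when source and destination are on the same filesystem.

```shell
#!/bin/sh
# Sketch only: MIRROR_DIR and both function names are assumptions,
# not part of any existing Nodepool or Jenkins configuration.
set -eu

MIRROR_DIR="${MIRROR_DIR:-/srv/git}"

# Image build time: create a bare mirror of one repository.
#   $1 = source URL or path
#   $2 = repository name, e.g. mediawiki/core
mirror_repo() {
    mkdir -p "$(dirname "$MIRROR_DIR/$2.git")"
    # --bare skips the working tree, so only the git objects use disk space.
    git clone --quiet --bare "$1" "$MIRROR_DIR/$2.git"
}

# Job run time: clone from the local mirror instead of the network.
#   $1 = repository name, $2 = destination directory
clone_from_mirror() {
    # A plain local path (not file://) lets git hard link the objects
    # when the mirror and the destination share a filesystem.
    git clone --quiet "$MIRROR_DIR/$1.git" "$2"
}
```

At image build time this would be called once per repository, for example mirror_repo with the Gerrit URL of mediawiki/core as the first argument and mediawiki/core as the second. A job would then call clone_from_mirror mediawiki/core followed by its workspace path, and fetch only the newest changes from Gerrit afterwards.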
Looking at the Gerrit server (ytterbium) under /var/lib/gerrit2/review_site:
$ du -d 1 -m --exclude=./subversion --exclude=./analytics/aggregator/data.git --exclude=./operations/debs
8214 MBytes
But mirroring all of that is overkill. I think it can be tuned down to roughly 4 GBytes, including the operating system. Is that manageable by our labs infra?