Evaluate impact of giant VM images on labs infrastructure
Closed, Resolved, Public

Description

For CI instances managed by Nodepool, we would like to provide a mirror of the most popular Gerrit git repositories directly in the image.

This way the Jenkins jobs can do the initial git clone from the local filesystem, which uses hard links for the git objects. As an example, cloning mediawiki/core currently takes several minutes.

When creating the reference disk image used by Nodepool, we will clone the most popular / heaviest repositories as bare repositories (to save the disk space a checkout would use).
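
As a rough sketch of the idea above (not the actual image build script): a bare clone is stored once, and a job then clones from that local path. All paths and the tiny stand-in upstream repository are hypothetical, so that the example runs without network access.

```shell
#!/bin/sh
set -eu

# Hypothetical layout: a scratch directory stands in for the image filesystem.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
upstream="$work/upstream"        # stand-in for a repository hosted on Gerrit
mirror="$work/mirror/core.git"   # bare copy that would be baked into the image
job="$work/job/core"             # workspace of a CI job
mkdir -p "$work/mirror" "$work/job"

# Create a tiny upstream repository so the sketch needs no network.
git init -q "$upstream"
echo "hello" > "$upstream/README"
git -C "$upstream" add README
git -C "$upstream" -c user.name=demo -c user.email=demo@example.org \
    commit -q -m "initial commit"

# Image build step: a bare clone has no working tree, only the git data.
git clone -q --bare "$upstream" "$mirror"

# Job step: when cloning from a local path, git hard links the object files
# instead of copying them, provided both sit on the same filesystem.
git clone -q "$mirror" "$job"

mirror_head=$(git -C "$mirror" rev-parse HEAD)
job_head=$(git -C "$job" rev-parse HEAD)
echo "mirror HEAD: $mirror_head"
echo "job HEAD:    $job_head"

# Count object files in the job clone that share storage with another copy.
linked=$(find "$job/.git/objects" -type f -links +1 | wc -l)
echo "hard-linked object files: $linked"
```

In a real job the clone would still need a fetch from Gerrit afterwards to pick up changes made since the image was built.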

Looking on the Gerrit server (ytterbium) under /var/lib/gerrit2/review_site:

$ du -d 1 -m --exclude=./subversion --exclude=./analytics/aggregator/data.git --exclude=./operations/debs
8214 MBytes

But mirroring everything is overkill. I think it can be tuned down to roughly 4 GBytes including the operating system. Is that manageable by our labs infra?
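
To check a candidate list of repositories against a size target like the one above, something along these lines could be used. This is only a sketch: the helper names `total_mb` and `fits_budget` and the demo directories are hypothetical, and the budget would be whatever remains once the operating system is accounted for.

```shell
#!/bin/sh
set -eu

# total_mb DIR...: print the combined disk usage of the given directories,
# in the megabyte units reported by `du -m`.
total_mb() {
    du -sm "$@" | awk '{ sum += $1 } END { print sum + 0 }'
}

# fits_budget BUDGET_MB DIR...: succeed when the directories fit in the budget.
fits_budget() {
    budget=$1
    shift
    [ "$(total_mb "$@")" -le "$budget" ]
}

# Demo on two scratch directories standing in for bare repositories.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/core.git" "$demo/extension.git"
echo "objects" > "$demo/core.git/pack"
echo "objects" > "$demo/extension.git/pack"

echo "total: $(total_mb "$demo/core.git" "$demo/extension.git") MB"
if fits_budget 4096 "$demo/core.git" "$demo/extension.git"; then
    echo "selection fits in 4096 MB"
fi
```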

Event Timeline

hashar assigned this task to Andrew.
hashar raised the priority of this task from to Medium.
hashar updated the task description.
hashar added subscribers: gerritbot, hashar, Krinkle, Aklapper.

Per discussion with @Andrew when I filed the task: let's assume good faith and try to keep the images small.

Caching all the repositories on Nodepool instances would take too much disk space, so we will only mirror the most used ones.