Zuul repositories have too many refs causing slow updates
Zuul merge operations are quite slow. The reason is that fetches from Gerrit are painfully slow for some repositories:

Under zuul@gallium:/srv/ssd/zuul/git/ :

mediawiki/core$ time git fetch --dry-run

real	0m18.353s
user	0m17.781s
sys	0m0.236s

The operation takes this long because git sends all references to the remote. And:

$ git show-ref|fgrep -c refs/zuul
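To see how much of a repository's ref list falls under refs/zuul compared to other namespaces, the references can be grouped by their first two path components. This is only a sketch: the function name refs_by_namespace is made up for illustration, and it simply reads one refname per line on stdin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read refnames on stdin (one per line) and print
# "<count> <namespace>", where the namespace is the first two path
# components (for example refs/zuul or refs/heads), largest count first.
refs_by_namespace() {
    awk -F/ '{ count[$1 "/" $2]++ }
             END { for (ns in count) print count[ns], ns }' |
    sort -k1,1nr -k2
}
```

It could be fed from a repository with: git for-each-ref --format='%(refname)' | refs_by_namespace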

We need a script that lists all references matching refs/zuul/*, inspects the commit date of each one and deletes the reference if it is older than X days (for example 30 days). That will help the git fetch operation and thus speed up Zuul merge operations.
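The idea above could be sketched roughly as follows. This is not the script referred to in this task, just an illustration under a few assumptions: the function names stale_refs and prune_zuul_refs are made up, the commit date is taken from git for-each-ref with the %(committerdate:raw) format (epoch seconds followed by a timezone), and nothing is deleted unless --delete is passed explicitly.

```shell
#!/usr/bin/env bash
# Sketch only: prune refs/zuul/* references whose commit is older than N days.

# Read "<epoch> <tz> <refname>" lines on stdin (the layout produced by
# git for-each-ref --format='%(committerdate:raw) %(refname)') and print
# the refnames whose commit date is strictly older than the cutoff epoch.
stale_refs() {
    local cutoff=$1
    awk -v cutoff="$cutoff" '$1 < cutoff { print $3 }'
}

# Hypothetical helper: list the stale refs/zuul/* references of a
# repository, and delete them only when the third argument is "--delete".
prune_zuul_refs() {
    local git_dir=$1
    local days=${2:-30}
    local mode=${3:-}
    local cutoff=$(( $(date +%s) - days * 86400 ))

    git --git-dir "$git_dir" for-each-ref \
        --format='%(committerdate:raw) %(refname)' refs/zuul/ |
    stale_refs "$cutoff" |
    while read -r ref; do
        if [ "$mode" = "--delete" ]; then
            git --git-dir "$git_dir" update-ref -d "$ref"
        else
            echo "would delete $ref"
        fi
    done
}
```

For example, prune_zuul_refs /srv/ssd/zuul/git/mediawiki/core/.git 30 would only print the candidates; appending --delete as a third argument would actually remove them.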

Command to run the job until it is puppetized/packaged:

find /srv/ssd/zuul/git/ -name .git -type d -print -exec /home/hashar/ --until 30 {} \;



Example of an operation that took 1m20s:

2014-07-23 21:59:22,755 DEBUG zuul.Repo: Resetting repository /srv/ssd/zuul/git/mediawiki/core
2014-07-23 21:59:22,755 DEBUG zuul.Repo: Updating repository /srv/ssd/zuul/git/mediawiki/core
2014-07-23 22:00:04,755 DEBUG zuul.Repo: Checking out 6466a598a9579db0789055b73001e39a6d7840a5
2014-07-23 22:00:45,412 DEBUG zuul.Repo: Merging refs/changes/46/148846/2 with args ['-s', 'resolve', 'FETCH_HEAD']


That causes a bunch of issues. I will write a script to clean up obsolete references.

I wrote a quick script which inspects the commit pointed to by each Zuul reference and deletes the reference whenever the commit is older than a given number of days (default 360).

Proposed upstream as

Will run it on gallium.

zuul@gallium:/srv/ssd/zuul/git/mediawiki/core$ git show-ref|fgrep -c refs/zuul/

Then ran /home/hashar/ --until 360 .

And that dropped roughly 21k references:

$ git show-ref |fgrep -c refs/zuul

Will process operations/puppet as well.

I have cleaned up a few more repositories.

For reference, one can find the top 10 offenders by running:

cd /srv/ssd/zuul/git
find . -type d -name .git -exec bash -c 'echo -n "{}:"; git --git-dir {} show-ref|fgrep -c refs/zuul' \; | sort -nr -k2 -t: | head -n10

Lowering priority since the refs have been dealt with. Zuul still has to be fixed to garbage collect old references automatically.

Someone has brought up the topic on the openstack-infra mailing list, so I followed up on the pending reviews and wrote some basic documentation. That should help get it merged, I guess :-]

The task should be kept open until zuul-merger learns to garbage collect old references automatically.

--verbose --dry-run --until 90 /srv/zuul/git/project

I am not actively working on this. See the list of blockers for making the cleanup automatic.

The task detail has the long command to run on gallium as the zuul user:

sudo -u zuul find /srv/ssd/zuul/git/ -name .git -type d -print -exec /home/hashar/ --until 30 {} \;

The patch I have proposed upstream has been approved :-}

We would want to include the utility in the Zuul Debian package, then add some puppet cruft to have it run from cron on a weekly(?) basis.

Our .deb package is up to date and includes the utility. It has a race condition though, which I have detailed in T103528.

Something got improved along the way and fetches are much faster nowadays. It could be the gallium disk that used to be slow, the network, a better Gerrit, optimizations in git, or something else.

$ git ls-remote .|grep -c refs/zuul

$ git remote -v
origin	ssh:// (fetch)
origin	ssh:// (push)

$ time git fetch --dry-run

real	0m2.808s
user	0m2.352s
sys	0m0.376s