Page MenuHomePhabricator

Gerrit upload-pack send ALL references causing massive network I/O on common operations
Closed, ResolvedPublic

Description


This issue has been fixed by Google which introduced a version 2 git protocol aiming at solving that exact same issue. See below T103990#6268632


I found out my git fetch on operations/puppet.git is slightly too slow:

Over ssh that is 40 seconds:

$ time git fetch ssh://hashar@gerrit.wikimedia.org:29418/operations/puppet.git
From ssh://gerrit.wikimedia.org:29418/operations/puppet
 * branch            HEAD       -> FETCH_HEAD

real	0m39.609s
user	0m0.176s
sys	0m0.219s

Over https 20 seconds:

$ time git fetch https://gerrit.wikimedia.org/r/p/operations/puppet.git
From https://gerrit.wikimedia.org/r/p/operations/puppet
 * branch            HEAD       -> FETCH_HEAD

real	0m19.165s
user	0m0.438s
sys	0m0.120s

It is network bound. Using git debug env variables GIT_TRACE and GIT_TRACE_PACKET, git-fetch executes git upload-pack on Gerrit server which yields ALL references including refs/changes/*. That looks like:

$ GIT_TRACE=1 GIT_TRACE_PACKET=1  git fetch 2>&1| head -n25
trace: built-in: git 'fetch'
trace: run_command: 'ssh' '-p' '29418' 'hashar@gerrit.wikimedia.org' 'git-upload-pack '\''/operations/puppet.git'\'''

packet:        fetch< 874c4155a4f831a2331b9f6cb8c2ac806c03f603 HEAD\0 include-tag multi_ack_detailed multi_ack ofs-delta side-band side-band-64k thin-pack no-progress shallow 
packet: fetch< 9126e0b refs/cache-automerge/01/7f9d81411d3a534d7868cd251bc5d04a00524a
packet: fetch< 48d1474 refs/cache-automerge/01/b66390dd5f6df1bc245971482f43dd59dfcd4f
packet: fetch< c8bbf14 refs/cache-automerge/02/84485a2b85cc49d46166bedea485e8716f4034
packet: fetch< 32066cb refs/cache-automerge/03/ad7228e9bd88c7526893791499ea068a7fcf32
packet: fetch< 6eb986b refs/cache-automerge/05/e646c89fdc620fb2757775c2a5e9a7b6031a8d
packet: fetch< 4b825dc refs/cache-automerge/08/17d532d8576761eb14087332a343c7e86ec937
packet: fetch< 4b825dc refs/cache-automerge/08/218590a6fe5590ee1e3a3189618b04daf40692
packet: fetch< 0a6d437 refs/cache-automerge/0a/7b3958b1f6aeab72b8fe766bcde6bc245b9d08
packet: fetch< 3cef6ee refs/cache-automerge/0a/faf2dcd52226091daa0ac503dab15186338cf1
packet: fetch< 127c9fa refs/cache-automerge/10/0e2dc523f8721d28f1b57de40c111dea442ccd
packet: fetch< e40a942 refs/cache-automerge/12/0ff0a8ec28981ea82fa824253f58cfd1a9f466
packet: fetch< c6dea31 refs/cache-automerge/12/c00202d713973ebf1a567f969b710b03d1711b
packet: fetch< 14c2007 refs/cache-automerge/17/2866932fda4cce417aca2ddf72b62386e095f5
packet: fetch< 9126e0b refs/cache-automerge/1e/ad36f47754cd0e2788a48573451f54553c7612
packet: fetch< c48b56a refs/cache-automerge/23/f2299e57e144409fa120d82c69b0c7040dbca7
packet: fetch< 9126e0b refs/cache-automerge/24/ba68d86933de429a330eff4eae018dc71140de
packet: fetch< 5790b5b refs/cache-automerge/26/ac09999377a12f0e4d69f09dfe37b4b3342db4
packet: fetch< 22e5a3a refs/cache-automerge/2a/387f4a1fff03e9500ec8186f269e083907e8d4
packet: fetch< 4596d1f refs/cache-automerge/2e/3ad288bd263792ab2980c60c68cb47fe01bfb1
packet: fetch< efc147b refs/cache-automerge/2e/f5d586e86a7dd4e06e7c54e77753dd101476db
packet: fetch< 69afa3b refs/cache-automerge/33/77e76f8918d76ba014811e941e20e048879c3d
packet: fetch< 2139785 refs/cache-automerge/36/d80bf5d1b574c01c54abe8fdf0d225ed5a8f71
....
// then the refs/changes/*

That is a major burden when on a slow network connection and probably add stress on our Gerrit server.


From the C git manual, we can hide references from the initial advertisement of references:

uploadpack.hideRefs
String(s) upload-pack uses to decide which refs to omit from its initial advertisement. Use more than one definitions to specify multiple prefix strings.
A ref that are under the hierarchies listed on the value of this variable is excluded, and is hidden from git ls-remote, git fetch, etc. An attempt to fetch a hidden ref by git fetch will fail. See also uploadpack.allowtipsha1inwant.

uploadpack.allowtipsha1inwant
When uploadpack.hideRefs is in effect, allow upload-pack to accept a fetch request that asks for an object at the tip of a hidden ref (by default, such a request is rejected). see also uploadpack.hideRefs.

In theory we might want to set on all repositories:

uploadpack.hiderefs=refs/changes
uploadpack.hiderefs=refs/cache-automerge
uploadpack.allowtipsha1inwant=true

That was apparently introduced in jgit with e747517 which is apparently included since jgit v3.1.0.201309270735-rc1.

So that should be included in the version we have bundled:

$ unzip -l /var/lib/gerrit2/review_site/bin/gerrit.war|grep -i jgit
     4540  2014-07-01 18:25   WEB-INF/lib/gerrit-patch-jgit-server.jar
    81372  2014-07-01 18:25   WEB-INF/lib/org.eclipse.jgit.http.server-3.1.0.201310021548-r.jar
  1771070  2014-07-01 18:25   WEB-INF/lib/org.eclipse.jgit-3.1.0.201310021548-r.jar

I found a bug filled for Gerrit at https://code.google.com/p/gerrit/issues/detail?id=175

Related Objects

StatusSubtypeAssignedTask
Resolvedhashar
ResolvedQChris
ResolvedQChris
ResolvedDzahn
Resolvedthcipriani
ResolvedAklapper
ResolvedAklapper
ResolvedQChris
ResolvedDzahn
ResolvedQChris
ResolvedQChris
Resolvedjcrespo
ResolvedDzahn
ResolvedPaladox
ResolvedQChris
ResolvedQChris
ResolvedQChris
ResolvedQChris
ResolvedQChris
ResolvedQChris
ResolvedQChris
ResolvedQChris
ResolvedQChris
ResolvedBUG REPORTQChris
InvalidBUG REPORTNone
ResolvedDzahn
ResolvedBUG REPORTkostajh
ResolvedBUG REPORTQChris
ResolvedBUG REPORTQChris
InvalidBUG REPORTNone
ResolvedBUG REPORTQChris
DeclinedNone
InvalidNone
OpenNone
DuplicateBUG REPORTNone
DeclinedNone
ResolvedNikerabbit
DeclinedBUG REPORTNone
Resolvedhashar
InvalidNone
ResolvedQChris
DeclinedNone
DeclinedNone
DuplicateNone
DuplicateNone
ResolvedBUG REPORTQChris
ResolvedQChris
ResolvedQChris
DeclinedNone
OpenNone
ResolvedDzahn
ResolvedQChris

Event Timeline

hashar raised the priority of this task from to Needs Triage.
hashar updated the task description. (Show Details)
hashar added a project: Gerrit.
hashar subscribed.

Found a reference in the Gerrit IRC channel from August 2013 at http://echelog.com/logs/browse/gerrit/1377036000

mordred is from OpenStack infrastructure, spearce is Gerrit original author:

[20:45:57] <mordred> spearce: ola! mildly wrong channel, but I have a question about a 3-year-old patch from you: http://thread.gmane.org/gmane.comp.version-control.git/126797/focus=127059
[20:46:51] <mordred> spearce: which does not appear to have ever landed, and if there is a way to avoid advertising refs/changes/* to clients in a non-gerrit replica of gerrit repos
[21:02:13] <dborowitz> mordred: see receive.hiderefs in recent versions of git
[21:07:44] <mordred> dborowitz: so, that seems to be to filter things from the push side, not filtering things from the initial server advertisment for a fetch
[21:08:02] <mordred> but ooooh! uploadpack.hiderefs
[21:08:03] <mordred> yay
[21:08:03] <dborowitz> sorry, transfer.hiderefs
[21:08:07] <mordred> yes
[21:08:09] <dborowitz> yep
[21:08:09] <mordred> perfect

Another one from opentack-infra, writing it down for later reference http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2013-08-21.log

hashar set Security to None.
hashar updated the task description. (Show Details)

While I see the issue that you're describing, hiderefs on its own will not be sufficient here. Not even with allowtipsha1inwant. I do not think there is an easy solution.

hiderefs handles the hiding of refs. True.
But how would you fetch a commit from gerrit then?

If for example the ref changes/75/221775/1 is hidden through hiderefs, it is no longer advertized. But on the flip side a plain fetch

git fetch ssh://qchris@gerrit.wikimedia.org:29418/operations/puppet refs/changes/75/221775/1

would fail. So git-review would fail. Most other tools would fail. And probably most people's workflows would break. :-(

allowtipsha1inwant does not render the ref fetchable for me (neither using git 1.8.1.5, nor using git 2.0.5). :-(((

The only way to fetch the commit is using a recent git, and directly fetching the commit's hash. Like

git fetch ssh://qchris@gerrit.wikimedia.org:29418/operations/puppet 316f1a3f569b02070087f35e8b9194c3856c4247

. But that on the one hand requires a recent git, and on the other requires to know the commit hash beforehand.

(If we end up using hideref nonetheless, we need to have the hiderefs values [[https://github.com/eclipse/jgit/blob/e74751769e7be0901d1190e789bf360512ff3213/org.eclipse.jgit/src/org/eclipse/jgit/transport/TransferConfig.java#L120 | end in /to trigger prefix matches]].)

Just to connect things ... the issue has been brought up on the gerrit mailing list today:
https://groups.google.com/forum/?hl=en#!topic/repo-discuss/JwDmV6qcvy0

Thanks @QChris . I have poked the OpenStack infrastructure team which talked that issue 2 years ago. That was just to let them know though :-]

git 2.5.0 has:

"git upload-pack" that serves "git fetch" can be told to serve commits that are not at the tip of any ref, as long as they are reachable from a ref, with uploadpack.allowReachableSHA1InWant configuration variable.

That might covers the hideRefs issue @QChris mentioned above. I posted some lame summary on Gerrit issue https://code.google.com/p/gerrit/issues/detail?id=175#c17

Btw. after this re-newed discussion on the gerrit lists, gerrit's download-commands plugin received a config setting to expose SHA1s instead of refs in fetch, cherry-pick,... commands shown from the change screen.

So with gerrit master, one could already achieve what this task is asking for.

hashar triaged this task as Medium priority.Sep 10 2015, 9:28 AM
greg lowered the priority of this task from Medium to Low.Sep 10 2015, 11:05 PM

TL;DR: use http for fetches

When doing a fetch, git-uploadpack is executed on the remote server (in our case Gerrit) and the first thing the command does is emitting the list of all of its references with their sha1. In the C source code that is known has advertise_refs and JGit (which Gerrit uses) does the same.

Easy way to time it:

time ssh -p 29418 gerrit.wikimedia.org git-upload-pack mediawiki/core < /dev/null > /dev/null

real	0m17.352s
user	0m0.085s
sys	0m0.115s

Interestingly, fetching from Gerrit over HTTP is way faster:

HTTP

$ time git fetch https://gerrit.wikimedia.org/r/p/operations/puppet.git
From https://gerrit.wikimedia.org/r/p/operations/puppet
 * branch            HEAD       -> FETCH_HEAD

real	0m2.412s
user	0m0.185s
sys	0m0.090s

SSH

$ time git fetch ssh://hashar@gerrit.wikimedia.org:29418/operations/puppet.git
From ssh://gerrit.wikimedia.org:29418/operations/puppet
 * branch            HEAD       -> FETCH_HEAD

real	0m20.769s
user	0m0.173s
sys	0m0.252s

I would blame Gerrit JGit over ssh for being slow at advertising references it knows. The workaround is thus to use http for fetches.

Can be tweaked with 'git remote set-url':

$ git remote set-url origin https://gerrit.wikimedia.org/r/p/operations/puppet.git
$ git remote set-url --push origin ssh://gerrit.wikimedia.org:29418/operations/puppet.git
$  git remote -v
origin	https://gerrit.wikimedia.org/r/p/operations/puppet.git (fetch)
origin	ssh://hashar@gerrit.wikimedia.org:29418/operations/puppet.git (push)

For people using git-review you will want to install it from their tip of their branch which will eventually become 1.26. That comes with https://review.openstack.org/#/c/109869/ which let the command to reuse the 'origin' remote instead of creating a second remote named gerrit. That set the push url to SSH.

The feature is opt-int, so one would need in git-config:

[gitreview]
	usepushurl = 1

Then:

  • set-url to point to http or clone over http
  • git-review -s ( sets the push URL)
  • git remote rm gerrit # clean up now unneeded remote

Interestingly, fetching from Gerrit over HTTP is way faster:

Not always: T58034#617927

From https://bugs.chromium.org/p/gerrit/issues/detail?id=175

With https://gerrit-review.googlesource.com/#/c/72258/, Gerrit 2.12 adopted JGit 4.1. This means that the allowReachableSHA1InWant configuration can be used (have not tested it though).

https://gerrit-review.googlesource.com/#/c/72259/ updates the download plugin to read that configuration as well.

Gotta investigate at packhiderefs/allowtipsha1inwant/allowreacheablesha1inwant settings and from the task description:

In theory we might want to set on all repositories:

uploadpack.hiderefs=refs/changes
uploadpack.hiderefs=refs/cache-automerge
uploadpack.allowtipsha1inwant=true

Hi,

Does this solution work for you? I tried settings these, but it didn't change anything. I'm using latest gerrit 2.13.

Ok, I found out that I need a trailing / on each of them.

By the way, http is faster probably because it uses gzip compression.

+ @demon

The summary is that GIT_TRACE_PACKET=1 git fetch shows an initial advertisement of all of refs/changes/* which are usually not needed when just fetching heads. On repos having a lot of change that severely delay the commands operations when on a slow network connection.

The trick is to configure the git repos to hide those refs entirely (uploadpack.HideRefs=refs/changes/, but still let them get fetched when explicitly asked (uploadpack.allowTipSha1InWant=1).

Has to be done on the service side. We probably don't need to add that on all repositories, but the biggest / heavy traffic one would definitely benefit from it.


From my comment on Gerrit issue #175:

Edwin Kempin has documented download.checkForHiddenChangeRefs and hints at uploadpack.hideRefs and uploadpack.allowTipSha1InWant https://gerrit-review.googlesource.com/#/c/69280/ . That is in Gerrit 2.12.

I haven't had the opportunity to test the git config yet but Comment 20 / 21 just above seems to indicate that it would work. The repositories have to be configured with:

[uploadpack]
  hideRefs = refs/changes/
  hideRefs = refs/cache-automerge/
  allowTipSha1InWant = true

Reference in git documentation https://git-scm.com/docs/git-config#git-config-uploadpackhideRefs

A config change need to occur in Gerrit https://gerrit.wikimedia.org/r/Documentation/config-gerrit.html#download.checkForHiddenChangeRefs

download.checkForHiddenChangeRefs

Whether the download commands should be adapted when the change refs are hidden.

Git has a configuration option to hide refs from the initial advertisement (uploadpack.hideRefs). This option can be used to hide the change refs from the client. As consequence fetching changes by change ref does not work anymore. However by setting uploadpack.allowTipSha1InWant to true fetching changes by commit ID is possible. If download.checkForHiddenChangeRefs is set to true the git download commands use the commit ID instead of the change ref when a project is configured like this.

By default False

Chad can we give it a try on one of our repos? Can do on integration/config which is fairly isolated.

I encountered a similiar issue today when trying to fetch submodules with a depth of 1. Git would fetch the submodule sha1 and it aborts:

$ git clone --quiet --depth=1 https://gerrit.wikimedia.org/r/p/integration/jenkins
$ cd jenkins
$ GIT_TRACE=1 git submodule update --init --depth=1 tools/mwcodeutils
...
trace: built-in: git 'fetch' 'origin' '--depth=1' '708a35d0121703257c5ab938d2bed92de1aef8b8'
...
error: no such remote ref 708a35d0121703257c5ab938d2bed92de1aef8b8
Fetched in submodule path 'tools/mwcodeutils', but it did not contain 708a35d0121703257c5ab938d2bed92de1aef8b8. Direct fetching of that commit failed.

The commit is https://gerrit.wikimedia.org/r/#/c/59088/ and thus as an advertised reference refs/changes/88/59088/1, but it is not the tip of the branch.

Seems that specific issue can be solved by setting in Gerrit:

uploadpack.allowReachableSHA1InWant=true

git client before 2.14.0 might also need uploadpack.allowTipSHA1InWant=true.

https://stackoverflow.com/questions/31278902/

google has made git protocol v2 and open sourced it and will be available in 2.18. gerrit jgit upstream is gaining support for this new protocol.

https://opensource.googleblog.com/2018/05/introducing-git-protocol-version-2.html

hashar changed the task status from Open to Stalled.Jan 1 2019, 3:16 PM

Git protocol 2 support has been added tentatively in Gerrit 2.16 but eventually got withdrawn with 2.16.2:

2.16.2

SECURITY Issue 10201: Remove support for Git protocol v2, because of a security vulnerability discovered

The JGit implementation of protocol V2 does not invoke the advertiseRefsHook on fetch and ls-refs, which results in all refs being sent, regardless of the configured ACLs.

:(

Git protocol 2 support has been added tentatively in Gerrit 2.16 but eventually got withdrawn with 2.16.2:

Is there any information on whether/when it’s coming back? The GerritForge blog post promises that it’s “coming back to Gerrit v2.16 very soon”, but I can’t find any trace of it in the release notes of the 2.16 line after 2.16.2, and it’s not in the 3.0.0 configuration documentation either.

Gerrit 2.16 still supports Git protocol 2 (they just disabled it by default rather then enabled it by default). So we can enable once we upgrade.

I’m wrong, it was removed from gerrit, but being re added in https://gerrit-review.googlesource.com/c/gerrit/+/226754

TL;DR: use http for fetches

BTW there's a much easier way to do this from your global ~/.gitconfig without a need to run git remote set-url commands in each repo.

Assuming your gerrit username is the same as your local shell username $USER:

cat >>.gitconfig <<EOF

# Always fetch over HTTPS and push over SSH.
[url "https://gerrit.wikimedia.org/r/"]
	insteadOf = ssh://${USER}@gerrit.wikimedia.org:29418/
[url "ssh://${USER}@gerrit.wikimedia.org:29418/"]
	pushInsteadOf = https://gerrit.wikimedia.org/r/
# This looks mysterious, but is what stops a cloned-via-SSH repo
# from being rewritten to HTTPS for push.
[url "ssh://${USER}@gerrit.wikimedia.org:29418/"]
	pushInsteadOf = ssh://${USER}@gerrit.wikimedia.org:29418/
EOF

On the Gerrit 3.2 server gerrit-test.wikimedia.org, protocol v2 is definitely faster:

$ git remote -v
origin	https://gerrit-test.wikimedia.org/r/operations/puppet.git (fetch)

$ time git -c protocol.version=0 fetch
real	0m15,768s

$ time git -c protocol.version=2 fetch
real	0m1,083s  # Hurrah

Though I fetch from a URL it is actually slower! With git 2.20.1 or 2.26.2:

$ time git -c protocol.version=0 fetch https://gerrit-test.wikimedia.org/r/operations/puppet.git
real	0m14,478s

$ time git -c protocol.version=2 fetch https://gerrit-test.wikimedia.org/r/operations/puppet.git
real	0m21,297s  # Oh no

Which sounds like a bug in git :-\ I have checked with git master branch and that is the same issue bah

I guess that is working as intended, it fetch every single objects when there is no refspec set. A remote does have a default set to +refs/heads/*:refs/remotes/origin/*. Not a big deal.

Change 608013 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] dockerfiles: abstract common git operations

https://gerrit.wikimedia.org/r/608013

Change 608013 merged by jenkins-bot:
[integration/config@master] dockerfiles: abstract common git operations

https://gerrit.wikimedia.org/r/608013

hashar claimed this task.
hashar edited projects, added git-protocol-v2; removed Patch-For-Review.

This has been solved by Google proposing a new version of the git protocol. https://opensource.googleblog.com/2018/05/introducing-git-protocol-version-2.html

With our upgrade of Gerrit to 3.2 on June 27th 2020, we now support git protocol version 2.

cgit client had it in 2.18, has then further been improved and 2.26 turns it on by default.

On pre 2.26 client, one can turn the protocol on with: git -c protocol.version=2 or add to their ~/.gitconfig:

[protocol]
	version = 2

Adoption of the protocol at Wikimedia will be further tracked by the project goal git-protocol-v2 (shorthand: #gitv2).