They are mostly reads, and there is no firm indication that the batch requests are overloading Gerrit. But given the load they impose, it might be wise to shift them to a mirror. Notably, they do not need to be 100% up to date with the master and could tolerate the slight delay incurred by replication.
Relevant extracts from T221026:
From looking at http requests per minute in javamelody, over 1 year, I see that traffic has increased a lot:
https://gerrit.wikimedia.org/r/monitoring?part=graph&graph=httpHitsRate (http hits per minute):
Updated yearly view on Sep 25th:
@thcipriani pointed out the mean stays identical, but the max grew in March 2019 from roughly 4k/minute to 6k/minute.
@hashar proposed: Would it make sense to set up a read-only replica such as git.wikimedia.org to offload Gerrit? The bots/scripts running on WMCS could easily be made to point to that mirror. And listed:
Out of 623k https requests in the April 17th access logs:
|9705||xxxxx||some public internet IP|
Probably codesearch ( https://codesearch.wmflabs.org/ ), Phabricator and extdist ( https://www.mediawiki.org/wiki/Extension:ExtensionDistributor ) could be moved to use a mirror.
The CI slaves do hammer Gerrit :-/
Note that this is for any HTTP request, not just git-upload-pack. But the result is similar when filtering for upload-pack.
Not taken into account: git fetches from the zuul-mergers, which are done over ssh with the jenkins-bot user.
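For anyone wanting to redo the per-client count above on another day's logs, here is a rough sketch of how it could be done. It is not the script used for the numbers in this task. It assumes an NCSA-style access log where the client address is the first field and the request line is quoted; the regular expression, the `top_clients` helper and the file name in the usage comment are all hypothetical.

```python
import re
from collections import Counter

# Assumption: NCSA-style log lines, e.g.
#   <client> - <user> [<date>] "<METHOD> <path> HTTP/1.1" <status> <bytes> ...
# Adjust the pattern if the real httpd log uses a different layout.
LINE_RE = re.compile(r'^(?P<client>\S+) .*?"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*"')


def top_clients(lines, only_upload_pack=False, limit=10):
    """Count requests per client address, most active clients first.

    With only_upload_pack=True, only requests whose path mentions
    git-upload-pack (clone/fetch traffic) are counted.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        if only_upload_pack and "git-upload-pack" not in match.group("path"):
            continue
        counts[match.group("client")] += 1
    return counts.most_common(limit)


# Usage (hypothetical file name):
#   with open("httpd_log.2019-04-17") as log:
#       for client, hits in top_clients(log, only_upload_pack=True):
#           print(hits, client)
```

Running it once with and once without `only_upload_pack=True` gives the two views compared above. Like the original numbers, it only sees HTTP traffic, so the ssh fetches from the zuul-mergers would still be missing.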