While browsing Gerrit logs on logstash, I found that two WMCS instances run git fetches against the primary Gerrit every 5 minutes to update some MediaWiki repositories.
For some reason, each operation causes an error on the server side:
message | Internal error during upload-pack from /srv/gerrit/git/mediawiki/core.git |
type | org.eclipse.jetty.io.EofException |
thread | HTTP POST /r/mediawiki/core/git-upload-pack |
The traffic comes from:
fedprops-euspecies.wikidata-dev.eqiad1.wikimedia.cloud. | 172.16.2.3 | Created by @Addshore |
wb-reconcile.wikidata-dev.eqiad1.wikimedia.cloud. | 172.16.6.4 | Created by @Lucas_Werkmeister_WMDE |
The Gerrit server-side trace indicates the socket was closed before the server had written all of the data:
org.eclipse.jetty.io.EofException
    at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)
    at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422)
    at org.eclipse.jetty.io.WriteFlusher.write(WriteFlusher.java:277)
    at org.eclipse.jetty.io.AbstractEndPoint.write(AbstractEndPoint.java:381)
    at org.eclipse.jetty.server.HttpConnection$SendCallback.process(HttpConnection.java:804)
    at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:241)
    at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:223)
    at org.eclipse.jetty.server.HttpConnection.send(HttpConnection.java:528)
    at org.eclipse.jetty.server.HttpChannel.sendResponse(HttpChannel.java:915)
    at org.eclipse.jetty.server.HttpChannel.write(HttpChannel.java:987)
    ...
Caused by: java.io.IOException: Connection reset by peer
    at java.base/sun.nio.ch.FileDispatcherImpl.writev0(Native Method)
    at java.base/sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
    ...
These may not be the only systems or users triggering the issue, but those two instances stand out since they update several repositories every five minutes.
My aim for this task is to get rid of the server-side error. How? I don't know yet what causes it.
Things that might help diagnose the issue:
- find out which commands are used to update the repository (most probably git)
- get the git version being used (`git --version`)
  - we might want to try a newer git version from -backports
- check whether git protocol v2 is turned on (`git config --get protocol.version`)
  - can be changed in /etc/gitconfig
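The version and protocol checks above could be collected on each instance with something like the following. This is only a sketch: `report_git_client` is a made-up helper name, and it assumes the script runs as the same user that performs the periodic updates, since the effective git configuration depends on that user's config files.

```shell
#!/bin/sh
# Hypothetical helper: report the git client version and protocol setting.
report_git_client() {
    # Client version, to compare against what -backports would offer.
    git --version

    # `git config --get` exits non-zero when the key is not set anywhere;
    # in that case the client uses its built-in default.
    if proto=$(git config --get protocol.version); then
        echo "protocol.version=$proto"
    else
        echo "protocol.version is not set (built-in default applies)"
    fi
}

report_git_client
```

Running one of the fetches by hand with `GIT_TRACE=1` and `GIT_CURL_VERBOSE=1` set in the environment should additionally show which git commands and HTTP requests are issued.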
We might consider fetching from gerrit-replica.wikimedia.org instead of the primary server, but that is not essential: the queries hit the in-memory cache and I don't think they cause any performance trouble on the server.
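If we did switch an instance to the replica, the change would probably amount to rewriting the remote URL of each clone. The sketch below assumes the replica serves projects under the same `/r/<project>` path as the primary and that the remote is named `origin`; neither has been verified on the instances, and `use_gerrit_replica` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: point a clone's "origin" remote at the Gerrit replica.
# Assumption: the replica uses the same /r/<project> layout as the primary.
use_gerrit_replica() {
    repo_dir=$1   # path to an existing clone
    project=$2    # Gerrit project name, e.g. mediawiki/core

    git -C "$repo_dir" remote set-url origin \
        "https://gerrit-replica.wikimedia.org/r/$project"

    # Print the resulting URL so the change can be checked.
    git -C "$repo_dir" remote get-url origin
}
```

It would be called with the path of a clone and the project name, for example `use_gerrit_replica <clone-dir> mediawiki/core`.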
Acceptance Criteria: 🏕️🌟(August 2021)
- The git-updater does not cause Gerrit server-side stack traces