Page MenuHomePhabricator

Gerrit timeout when cloning mediawiki/core
Open, Needs TriagePublic

Description

@zeljkofilipin reported a similar one using:

GIT_TRACE_CURL=1 GIT_TRACE_CURL_NO_DATA=1 git clone --depth=1 https://gerrit.wikimedia.org/r/mediawiki/core.git mediawiki

1~$ GIT_TRACE_CURL=1 GIT_TRACE_CURL_NO_DATA=1 git clone --depth=1 https://gerrit.wikimedia.org/r/mediawiki/core.git mediawiki
2Cloning into 'mediawiki'...
318:11:59.377481 http.c:731 == Info: Couldn't find host gerrit.wikimedia.org in the .netrc file; using defaults
418:11:59.535149 http.c:731 == Info: Trying 208.80.154.137...
518:11:59.535178 http.c:731 == Info: TCP_NODELAY set
618:11:59.679604 http.c:731 == Info: Connected to gerrit.wikimedia.org (208.80.154.137) port 443 (#0)
718:11:59.679739 http.c:731 == Info: ALPN, offering h2
818:11:59.679748 http.c:731 == Info: ALPN, offering http/1.1
918:11:59.687143 http.c:731 == Info: successfully set certificate verify locations:
1018:11:59.687156 http.c:731 == Info: CAfile: /etc/ssl/cert.pem
11 CApath: none
1218:11:59.687245 http.c:731 == Info: TLSv1.2 (OUT), TLS handshake, Client hello (1):
1318:11:59.828884 http.c:731 == Info: TLSv1.2 (IN), TLS handshake, Server hello (2):
1418:11:59.829086 http.c:731 == Info: TLSv1.2 (IN), TLS handshake, Certificate (11):
1518:11:59.849901 http.c:731 == Info: TLSv1.2 (IN), TLS handshake, Server key exchange (12):
1618:11:59.850401 http.c:731 == Info: TLSv1.2 (IN), TLS handshake, Server finished (14):
1718:11:59.850696 http.c:731 == Info: TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
1818:11:59.850705 http.c:731 == Info: TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
1918:11:59.850742 http.c:731 == Info: TLSv1.2 (OUT), TLS handshake, Finished (20):
2018:11:59.994151 http.c:731 == Info: TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
2118:11:59.994282 http.c:731 == Info: TLSv1.2 (IN), TLS handshake, Finished (20):
2218:11:59.994311 http.c:731 == Info: SSL connection using TLSv1.2 / ECDHE-ECDSA-CHACHA20-POLY1305
2318:11:59.994325 http.c:731 == Info: ALPN, server accepted to use http/1.1
2418:11:59.994331 http.c:731 == Info: Server certificate:
2518:11:59.994342 http.c:731 == Info: subject: CN=gerrit.wikimedia.org
2618:11:59.994348 http.c:731 == Info: start date: Aug 31 17:00:16 2021 GMT
2718:11:59.994353 http.c:731 == Info: expire date: Nov 29 17:00:15 2021 GMT
2818:11:59.994369 http.c:731 == Info: subjectAltName: host "gerrit.wikimedia.org" matched cert's "gerrit.wikimedia.org"
2918:11:59.994379 http.c:731 == Info: issuer: C=US; O=Let's Encrypt; CN=R3
3018:11:59.994383 http.c:731 == Info: SSL certificate verify ok.
3118:11:59.994480 http.c:678 => Send header, 0000000228 bytes (0x000000e4)
3218:11:59.994489 http.c:690 => Send header: GET /r/mediawiki/core.git/info/refs?service=git-upload-pack HTTP/1.1
3318:11:59.994493 http.c:690 => Send header: Host: gerrit.wikimedia.org
3418:11:59.994496 http.c:690 => Send header: User-Agent: git/2.24.3 (Apple Git-128)
3518:11:59.994500 http.c:690 => Send header: Accept: */*
3618:11:59.994503 http.c:690 => Send header: Accept-Encoding: deflate, gzip
3718:11:59.994506 http.c:690 => Send header: Pragma: no-cache
3818:11:59.994509 http.c:690 => Send header: Git-Protocol: version=2
3918:11:59.994512 http.c:690 => Send header:
4018:12:00.608040 http.c:678 <= Recv header, 0000000017 bytes (0x00000011)
4118:12:00.608081 http.c:690 <= Recv header: HTTP/1.1 200 OK
4218:12:00.608088 http.c:678 <= Recv header, 0000000037 bytes (0x00000025)
4318:12:00.608092 http.c:690 <= Recv header: Date: Fri, 08 Oct 2021 16:12:00 GMT
4418:12:00.608098 http.c:678 <= Recv header, 0000000016 bytes (0x00000010)
4518:12:00.608102 http.c:690 <= Recv header: Server: Apache
4618:12:00.608106 http.c:678 <= Recv header, 0000000074 bytes (0x0000004a)
4718:12:00.608118 http.c:690 <= Recv header: Strict-Transport-Security: max-age=106384710; includeSubDomains; preload
4818:12:00.608123 http.c:678 <= Recv header, 0000000023 bytes (0x00000017)
4918:12:00.608127 http.c:690 <= Recv header: X-Frame-Options: DENY
5018:12:00.608131 http.c:678 <= Recv header, 0000000040 bytes (0x00000028)
5118:12:00.608134 http.c:690 <= Recv header: Expires: Tue, 01 Jan 1980 00:00:00 GMT
5218:12:00.608138 http.c:678 <= Recv header, 0000000018 bytes (0x00000012)
5318:12:00.608142 http.c:690 <= Recv header: Pragma: no-cache
5418:12:00.608146 http.c:678 <= Recv header, 0000000053 bytes (0x00000035)
5518:12:00.608150 http.c:690 <= Recv header: Cache-Control: no-cache, max-age=0, must-revalidate
5618:12:00.608154 http.c:678 <= Recv header, 0000000059 bytes (0x0000003b)
5718:12:00.608158 http.c:690 <= Recv header: Content-Type: application/x-git-upload-pack-advertisement
5818:12:00.608163 http.c:678 <= Recv header, 0000000020 bytes (0x00000014)
5918:12:00.608166 http.c:690 <= Recv header: Content-Length: 62
6018:12:00.608170 http.c:678 <= Recv header, 0000000043 bytes (0x0000002b)
6118:12:00.608174 http.c:690 <= Recv header: Backend-Timing: D=1688 t=1633709520077989
6218:12:00.608178 http.c:678 <= Recv header, 0000000002 bytes (0x00000002)
6318:12:00.608182 http.c:690 <= Recv header:
6418:12:00.608197 http.c:731 == Info: Connection #0 to host gerrit.wikimedia.org left intact
6518:12:00.608388 http.c:731 == Info: Couldn't find host gerrit.wikimedia.org in the .netrc file; using defaults
6618:12:00.608424 http.c:731 == Info: Found bundle for host gerrit.wikimedia.org: 0x7ff7d6d04a80 [can pipeline]
6718:12:00.608429 http.c:731 == Info: Could pipeline, but not asked to!
6818:12:00.608444 http.c:731 == Info: Re-using existing connection! (#0) with host gerrit.wikimedia.org
6918:12:00.608451 http.c:731 == Info: Connected to gerrit.wikimedia.org (208.80.154.137) port 443 (#0)
7018:12:00.608516 http.c:678 => Send header, 0000000300 bytes (0x0000012c)
7118:12:00.608524 http.c:690 => Send header: POST /r/mediawiki/core.git/git-upload-pack HTTP/1.1
7218:12:00.608530 http.c:690 => Send header: Host: gerrit.wikimedia.org
7318:12:00.608534 http.c:690 => Send header: User-Agent: git/2.24.3 (Apple Git-128)
7418:12:00.608537 http.c:690 => Send header: Accept-Encoding: deflate, gzip
7518:12:00.608541 http.c:690 => Send header: Content-Type: application/x-git-upload-pack-request
7618:12:00.608545 http.c:690 => Send header: Accept: application/x-git-upload-pack-result
7718:12:00.608548 http.c:690 => Send header: Git-Protocol: version=2
7818:12:00.608551 http.c:690 => Send header: Content-Length: 122
7918:12:00.608554 http.c:690 => Send header:
8018:12:00.608560 http.c:731 == Info: upload completely sent off: 122 out of 122 bytes
8118:12:00.897137 http.c:678 <= Recv header, 0000000017 bytes (0x00000011)
8218:12:00.897157 http.c:690 <= Recv header: HTTP/1.1 200 OK
8318:12:00.897163 http.c:678 <= Recv header, 0000000037 bytes (0x00000025)
8418:12:00.897167 http.c:690 <= Recv header: Date: Fri, 08 Oct 2021 16:12:00 GMT
8518:12:00.897172 http.c:678 <= Recv header, 0000000016 bytes (0x00000010)
8618:12:00.897176 http.c:690 <= Recv header: Server: Apache
8718:12:00.897180 http.c:678 <= Recv header, 0000000074 bytes (0x0000004a)
8818:12:00.897184 http.c:690 <= Recv header: Strict-Transport-Security: max-age=106384710; includeSubDomains; preload
8918:12:00.897188 http.c:678 <= Recv header, 0000000023 bytes (0x00000017)
9018:12:00.897192 http.c:690 <= Recv header: X-Frame-Options: DENY
9118:12:00.897206 http.c:678 <= Recv header, 0000000040 bytes (0x00000028)
9218:12:00.897210 http.c:690 <= Recv header: Expires: Tue, 01 Jan 1980 00:00:00 GMT
9318:12:00.897224 http.c:678 <= Recv header, 0000000018 bytes (0x00000012)
9418:12:00.897237 http.c:690 <= Recv header: Pragma: no-cache
9518:12:00.897243 http.c:678 <= Recv header, 0000000053 bytes (0x00000035)
9618:12:00.897247 http.c:690 <= Recv header: Cache-Control: no-cache, max-age=0, must-revalidate
9718:12:00.897252 http.c:678 <= Recv header, 0000000052 bytes (0x00000034)
9818:12:00.897256 http.c:690 <= Recv header: Content-Type: application/x-git-upload-pack-result
9918:12:00.897260 http.c:678 <= Recv header, 0000000045 bytes (0x0000002d)
10018:12:00.897264 http.c:690 <= Recv header: Backend-Timing: D=143034 t=1633709520688117
10118:12:00.897269 http.c:678 <= Recv header, 0000000028 bytes (0x0000001c)
10218:12:00.897281 http.c:690 <= Recv header: Transfer-Encoding: chunked
10318:12:00.897285 http.c:678 <= Recv header, 0000000002 bytes (0x00000002)
10418:12:00.897288 http.c:690 <= Recv header:
10518:12:01.181998 http.c:731 == Info: Connection #0 to host gerrit.wikimedia.org left intact
10618:12:01.186693 http.c:731 == Info: Couldn't find host gerrit.wikimedia.org in the .netrc file; using defaults
10718:12:01.186712 http.c:731 == Info: Found bundle for host gerrit.wikimedia.org: 0x7ff7d6d04a80 [can pipeline]
10818:12:01.186717 http.c:731 == Info: Could pipeline, but not asked to!
10918:12:01.186735 http.c:731 == Info: Re-using existing connection! (#0) with host gerrit.wikimedia.org
11018:12:01.186743 http.c:731 == Info: Connected to gerrit.wikimedia.org (208.80.154.137) port 443 (#0)
11118:12:01.186796 http.c:678 => Send header, 0000000300 bytes (0x0000012c)
11218:12:01.186805 http.c:690 => Send header: POST /r/mediawiki/core.git/git-upload-pack HTTP/1.1
11318:12:01.186809 http.c:690 => Send header: Host: gerrit.wikimedia.org
11418:12:01.186812 http.c:690 => Send header: User-Agent: git/2.24.3 (Apple Git-128)
11518:12:01.186824 http.c:690 => Send header: Accept-Encoding: deflate, gzip
11618:12:01.186828 http.c:690 => Send header: Content-Type: application/x-git-upload-pack-request
11718:12:01.186831 http.c:690 => Send header: Accept: application/x-git-upload-pack-result
11818:12:01.186835 http.c:690 => Send header: Git-Protocol: version=2
11918:12:01.186838 http.c:690 => Send header: Content-Length: 187
12018:12:01.186841 http.c:690 => Send header:
12118:12:01.186848 http.c:731 == Info: upload completely sent off: 187 out of 187 bytes
12218:12:01.825005 http.c:678 <= Recv header, 0000000017 bytes (0x00000011)
12318:12:01.825022 http.c:690 <= Recv header: HTTP/1.1 200 OK
12418:12:01.825029 http.c:678 <= Recv header, 0000000037 bytes (0x00000025)
12518:12:01.825033 http.c:690 <= Recv header: Date: Fri, 08 Oct 2021 16:12:01 GMT
12618:12:01.825038 http.c:678 <= Recv header, 0000000016 bytes (0x00000010)
12718:12:01.825042 http.c:690 <= Recv header: Server: Apache
12818:12:01.825046 http.c:678 <= Recv header, 0000000074 bytes (0x0000004a)
12918:12:01.825056 http.c:690 <= Recv header: Strict-Transport-Security: max-age=106384710; includeSubDomains; preload
13018:12:01.825069 http.c:678 <= Recv header, 0000000023 bytes (0x00000017)
13118:12:01.825074 http.c:690 <= Recv header: X-Frame-Options: DENY
13218:12:01.825078 http.c:678 <= Recv header, 0000000040 bytes (0x00000028)
13318:12:01.825082 http.c:690 <= Recv header: Expires: Tue, 01 Jan 1980 00:00:00 GMT
13418:12:01.825087 http.c:678 <= Recv header, 0000000018 bytes (0x00000012)
13518:12:01.825090 http.c:690 <= Recv header: Pragma: no-cache
13618:12:01.825095 http.c:678 <= Recv header, 0000000053 bytes (0x00000035)
13718:12:01.825099 http.c:690 <= Recv header: Cache-Control: no-cache, max-age=0, must-revalidate
13818:12:01.825104 http.c:678 <= Recv header, 0000000052 bytes (0x00000034)
13918:12:01.825107 http.c:690 <= Recv header: Content-Type: application/x-git-upload-pack-result
14018:12:01.825112 http.c:678 <= Recv header, 0000000045 bytes (0x0000002d)
14118:12:01.825115 http.c:690 <= Recv header: Backend-Timing: D=495370 t=1633709521266171
14218:12:01.825120 http.c:678 <= Recv header, 0000000028 bytes (0x0000001c)
14318:12:01.825124 http.c:690 <= Recv header: Transfer-Encoding: chunked
14418:12:01.825128 http.c:678 <= Recv header, 0000000002 bytes (0x00000002)
14518:12:01.825140 http.c:690 <= Recv header:
146remote: Counting objects: 8957, done
147remote: Finding sources: 100% (8957/8957)
148remote: Getting sizes: 100% (7335/7335)
149remote: Compressing objects: 99% (93223/93224)
15018:17:13.944725 http.c:731 == Info: TLSv1.2 (IN), TLS alert, close notify (256):
15118:17:13.944745 http.c:731 == Info: transfer closed with outstanding read data remaining
15218:17:13.944760 http.c:731 == Info: Closing connection 0
15318:17:13.944787 http.c:731 == Info: TLSv1.2 (OUT), TLS alert, close notify (256):
154error: RPC failed; curl 18 transfer closed with outstanding read data remaining
155fatal: the remote end hung up unexpectedly
156fatal: early EOF
157fatal: index-pack failed

Which on the backend side is:

2021-10-08T15:40:09.554Z] [HTTP POST /r/mediawiki/core.git/git-upload-pack (N/A from x.x.x.x)] WARN  org.eclipse.jetty.server.handler.ContextHandler.r : Internal error during upload-pack from /srv/gerrit/git/mediawiki/core.git
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 30000/30000 ms
        at org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:234)
        at org.eclipse.jetty.server.HttpOutput.channelWrite(HttpOutput.java:269)
        at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:861)
        at net.bull.javamelody.internal.web.FilterServletOutputStream.write(FilterServletOutputStream.java:88)
        at net.bull.javamelody.internal.web.CounterResponseStream.write(CounterResponseStream.java:81)
        at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:123)
        at org.eclipse.jgit.util.TemporaryBuffer.write(TemporaryBuffer.java:128)
        at org.eclipse.jgit.transport.UploadPack$ResponseBufferedOutputStream.write(UploadPack.java:2371)
        at org.eclipse.jgit.transport.SideBandOutputStream.writeBuffer(SideBandOutputStream.java:141)
        at org.eclipse.jgit.transport.SideBandOutputStream.write(SideBandOutputStream.java:120)
        at org.eclipse.jgit.internal.storage.pack.PackOutputStream.write(PackOutputStream.java:99)
        at java.base/java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:253)
        at java.base/java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:211)
        at java.base/java.io.FilterOutputStream.write(FilterOutputStream.java:108)
        at org.eclipse.jgit.lib.ObjectLoader.copyTo(ObjectLoader.java:237)
...

The related Gerrit setting is https://gerrit.wikimedia.org/r/Documentation/config-gerrit.html#httpd.idleTimeout

httpd.idleTimeout

Maximum idle time for a connection, which roughly translates to the TCP socket SO_TIMEOUT.

This value is interpreted as the maximum time between some progress being made on the connection. So if a single byte is read or written, then the timeout is reset.

The max idle time is applied:

  • When waiting for a new message to be received on a connection
  • When waiting for a new message to be sent on a connection

By default, 30 seconds.

Event Timeline

I did a bit of grep in /var/log/gerrit/error_log and /var/log/apache2/gerrit.wikimedia.org.https.error.log. In chronological order:

gerrit [2021-10-08T04:05:33.873Z] [HTTP POST /r/p/mediawiki/services/parsoid/git-upload-pack (N/A from x.x.x.x)]
WARN org.eclipse.jetty.server.handler.ContextHandler.r : Internal error during upload-pack from /srv/gerrit/git/mediawiki/services/parsoid.git

apache [Fri Oct 08 04:11:55.808583 2021] [proxy_http:error] [pid 32509:tid 139644083631872]
(70008)Partial results are valid but processing is incomplete: [client x.x.x.x:42684] AH01110: error reading response

gerrit [2021-10-08T05:22:04.533Z] [HTTP POST /r/mediawiki/services/parsoid/deploy/git-upload-pack (N/A from x.x.x.x)]
WARN org.eclipse.jetty.server.handler.ContextHandler.r : Internal error during upload-pack from /srv/gerrit/git/mediawiki/services/parsoid/deploy.git

apache [Fri Oct 08 05:24:45.882041 2021] [proxy_http:error] [pid 32509:tid 139644058453760]
(70008)Partial results are valid but processing is incomplete: [client x.x.x.x:42974] AH01110: error reading response

Gerrit times out after 30 seconds and I am guessing Apache timesout later on. It somehow doesn't detect Gerrit terminated or Gerrit improperly keeps the channel open :-(

The corresponding errors:

16:12:01.825140 http.c:690              <= Recv header:
	remote: Counting objects: 8957, done
	remote: Finding sources: 100% (8957/8957)
	remote: Getting sizes: 100% (7335/7335)
	remote: Compressing objects:  99% (93223/93224)

On Gerrit server side we have 38 seconds later:

[2021-10-08T16:12:39.410Z] [HTTP POST /r/mediawiki/core.git/git-upload-pack (N/A from )]
WARN  org.eclipse.jetty.server.handler.ContextHandler.r : Internal error during upload-pack from /srv/gerrit/git/mediawiki/core.git

Which is Gerrit hitting httpd.idleTimeout = 30s.

In Apache 5 minutes later:

[Fri Oct 08 16:17:12.854086 2021] [proxy_http:error] [pid 28634:tid 139644410783488]
(70008)Partial results are valid but processing is incomplete: [client x.x.x.x:16503] AH01110: error reading response

And finally a second later on the client side we end up with:

16:17:13.944725 http.c:731              == Info: TLSv1.2 (IN), TLS alert, close notify (256):
16:17:13.944745 http.c:731              == Info: transfer closed with outstanding read data remaining
16:17:13.944760 http.c:731              == Info: Closing connection 0
16:17:13.944787 http.c:731              == Info: TLSv1.2 (OUT), TLS alert, close notify (256):
	error: RPC failed; curl 18 transfer closed with outstanding read data remaining
	fatal: the remote end hung up unexpectedly
	fatal: early EOF
	fatal: index-pack failed

So it is two bug:

  • the httpd idletimeout is reached which might be jgit taking too long to generate the pack file to be send?
  • Apache should abort the connection earlier?

It works on my machine.