The stalling issue mentioned in T131502 seems to involve two consecutive Range requests for a large object over the same persistent HTTP connection while the object is still being written to disk. The details are still a bit unclear to me, but I found a reliable way to reproduce the bug.
Start Varnish with stock VCL and a 10G file storage backend. Also make sure a web server is running on port 80 and serving a large file under /test.img. I've used a 5G file.
sudo varnishd -a :3128 -b localhost:80 -F -n backend -s main1=file,/var/tmp/varnish.main1,10G
Run the following script:
#!/usr/bin/env python
import socket

REPRODUCE = True

dest_name = "127.0.0.1"
port = 3128

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((dest_name, port))
s.send("""GET /test.img HTTP/1.1
Range: bytes=18-22

""")
data = (s.recv(1024))
print data

# With REPRODUCE = False, the second request goes out on a fresh
# connection instead of reusing the first one.
if not REPRODUCE:
    s.shutdown(1)
    s.close()
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((dest_name, port))

s.send("""GET /test.img HTTP/1.1
Range: bytes=50-60

""")
data = (s.recv(1024))
print data

s.shutdown(1)
s.close()
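The script above is written for Python 2. For anyone repeating the experiment on Python 3, here is a rough sketch of the same two-request sequence with timing added, so the stall shows up as a number rather than a pause. The function names (build_range_request, timed_range_requests) and the reuse_connection parameter are my own and purely illustrative; the request deliberately mirrors the original one (bare LF line endings, no Host header).

```python
import socket
import time


def build_range_request(path, first, last):
    """Build the same minimal request the original script sends."""
    text = "GET {} HTTP/1.1\nRange: bytes={}-{}\n\n".format(path, first, last)
    return text.encode("ascii")


def timed_range_requests(host, port, path, ranges, reuse_connection=True):
    """Send one Range request per (first, last) pair.

    Returns a list of (seconds_until_first_bytes, raw_response_bytes).
    reuse_connection=True corresponds to REPRODUCE = True in the
    original script; False opens a new connection before each
    request after the first.
    """
    results = []
    sock = socket.create_connection((host, port))
    try:
        for index, (first, last) in enumerate(ranges):
            if index > 0 and not reuse_connection:
                sock.close()
                sock = socket.create_connection((host, port))
            started = time.monotonic()
            sock.sendall(build_range_request(path, first, last))
            data = sock.recv(1024)  # first chunk only, like the original
            results.append((time.monotonic() - started, data))
    finally:
        sock.close()
    return results
```

Against the setup described here, it would be called as timed_range_requests("127.0.0.1", 3128, "/test.img", [(18, 22), (50, 60)]), once with reuse_connection=True and once with reuse_connection=False after restarting varnishd.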
The first response will come back quickly and then varnish will stall:
HTTP/1.1 206 Partial Content
Date: Fri, 05 Aug 2016 17:26:11 GMT
Server: Apache/2.4.23 (Debian)
Last-Modified: Thu, 04 Aug 2016 18:38:44 GMT
ETag: "138800000-5394342740f66"
X-Varnish: 2
Age: 0
Via: 1.1 varnish-v4
Accept-Ranges: bytes
Content-Range: bytes 18-22/5242880000
Content-Length: 5
Connection: keep-alive
[5-byte binary body]

After a while, the second response will arrive. Note the value of Age.
HTTP/1.1 206 Partial Content
Date: Fri, 05 Aug 2016 17:26:11 GMT
Server: Apache/2.4.23 (Debian)
Last-Modified: Thu, 04 Aug 2016 18:38:44 GMT
ETag: "138800000-5394342740f66"
X-Varnish: 4 3
Age: 18
Via: 1.1 varnish-v4
Accept-Ranges: bytes
Content-Range: bytes 50-60/5242880000
Content-Length: 11
Connection: keep-alive
[11-byte binary body]
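To compare Age and X-Varnish between runs without eyeballing the raw output, something like the following sketch could be used on the bytes returned by recv(). The helper names (parse_response_head, summarize) are hypothetical. It relies on the fact that Varnish puts two IDs in X-Varnish on a cache hit, the second being the ID of the request that originally fetched the object.

```python
def parse_response_head(raw):
    """Split raw response bytes into (status_line, headers).

    Header names are lower-cased; only the header section before the
    first blank line is examined.
    """
    head = raw.split(b"\r\n\r\n", 1)[0]
    lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return lines[0], headers


def summarize(raw):
    """Return the fields that matter for this bug as a small dict."""
    status_line, headers = parse_response_head(raw)
    vxids = [int(v) for v in headers.get("x-varnish", "").split()]
    age = headers.get("age")
    return {
        "status": status_line,
        "age": int(age) if age is not None else None,
        "vxids": vxids,
        # Two IDs in X-Varnish mean the response was served from cache.
        "cache_hit": len(vxids) == 2,
    }
```

For the stalled second response shown here, this would report an age of 18 and a cache hit, which suggests the response was only delivered about 18 seconds after the object was first fetched.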
Now stop varnishd, start it again, set REPRODUCE = False in the Python script so that the second request is sent on a new connection, and run the script again. Varnish should not stall, and both responses should come back immediately. Again, note the value of Age.
HTTP/1.1 206 Partial Content
Date: Fri, 05 Aug 2016 17:28:44 GMT
Server: Apache/2.4.23 (Debian)
Last-Modified: Thu, 04 Aug 2016 18:38:44 GMT
ETag: "138800000-5394342740f66"
X-Varnish: 2
Age: 0
Via: 1.1 varnish-v4
Accept-Ranges: bytes
Content-Range: bytes 18-22/5242880000
Content-Length: 5
Connection: keep-alive
[5-byte binary body]

HTTP/1.1 206 Partial Content
Date: Fri, 05 Aug 2016 17:28:44 GMT
Server: Apache/2.4.23 (Debian)
Last-Modified: Thu, 04 Aug 2016 18:38:44 GMT
ETag: "138800000-5394342740f66"
X-Varnish: 32770 3
Age: 0
Via: 1.1 varnish-v4
Accept-Ranges: bytes
Content-Range: bytes 50-60/5242880000
Content-Length: 11
Connection: keep-alive
[11-byte binary body]

In the two-Varnish scenario described in T131502, adding set bereq.http.Connection = "close"; to vcl_backend_fetch in the frontend VCL makes the bug unreproducible.