Since we're seeing very different results when comparing Varnish hits and misses for requests > 1s and for requests between 20ms and 1s, the 20ms limit used to filter out local cache hits may have been inadequate. There must be a better way to establish whether the current request hit the local cache.
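For reference, the duration-based filter discussed here can be sketched roughly as follows. This is only an illustration: `TimingSample`, `bucketByDuration` and the bucket names are made up for this sketch, and the 20ms and 1s cutoffs are simply the ones mentioned in this thread, not values from any spec.

```typescript
// Minimal shape of the fields used here. In a browser the samples would be
// PerformanceResourceTiming entries; this interface is a stand-in for the sketch.
interface TimingSample {
  name: string;
  duration: number; // milliseconds
}

// Assumption: 20 ms is the cutoff mentioned in this thread, not a spec value.
const LOCAL_CACHE_THRESHOLD_MS = 20;

type Bucket = "probable-local-cache" | "20ms-1s" | "over-1s";

// Classify a sample purely by duration. This is a heuristic: a fast response
// over a persistent connection can look like a cache hit and vice versa.
function bucketByDuration(
  sample: TimingSample,
  thresholdMs: number = LOCAL_CACHE_THRESHOLD_MS
): Bucket {
  if (sample.duration < thresholdMs) {
    return "probable-local-cache";
  }
  return sample.duration <= 1000 ? "20ms-1s" : "over-1s";
}
```

In a browser the samples would typically come from `performance.getEntriesByType("resource")`. Passing a different `thresholdMs` makes it easy to check how sensitive the hit/miss comparison is to the cutoff.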
Which of course relies on an accurate local clock.
On Chrome I reliably see duration: 0 for cached requests; maybe it's browser-dependent and UA filtering can be used to get more accurate data?
I remember seeing Firefox on my own machine report values > 0 sometimes, so I'm a little suspicious of assumptions based on 0 values returned by ResourceTiming. The goal here is to catch all cache hits, not just the obvious ones. Even if Chrome works that way, it could change with an update.
I went through the spec and unfortunately there doesn't seem to be a way to tell cache hits and persistent connection hits apart.
Persistent connections became the default in HTTP/1.1, so I wonder if we could make the measured XHR an HTTP/1.0 request on purpose. That way it would not piggyback on an existing connection (assuming the browser doesn't mix those requests into an existing connection anyway). As far as I can see there's no way to do that, though.
Another technique might be to send a bogus HTTP/1.1 request with a Connection: close header before the real request, but that slows down the measured image load by a round trip. And other requests happening in parallel would mess things up.
Seems like we're stuck with hacks all around for now, with no way to tell whether one hack is more accurate than a slightly different one.
Yes, the only sane way to deal with this would be to expand the spec with a new field telling whether the resource was served from cache. The spec is fairly active (they added request sizes just two months ago, which will be very useful for us once it hits the browsers). I thumbed through the W3C mail archives, the ResourceTiming issue tracker and the NavigationTiming issue tracker (since NavigationTiming suffers from the same problem), and there is not much attention around the issue. There is a lone thread about it from half a year ago which was supportive but seems to have been forgotten, so I think we should raise the issue there.
Btw, while searching I found this long thread which makes the point (amongst other things) that ResourceTiming.duration includes HTTP blocking time. I.e. if you fire off ten image requests and the browser has a limit of 8 simultaneous connections to the same domain, then the duration for the last two images will include the time they spend waiting for a connection to free up while earlier images are still downloading. This can be a very significant delay for MediaViewer due to T75951. Including this is good when we use the duration to measure MediaViewer speed but bad when we use it to measure image serving speed.
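To illustrate the difference, here is a rough sketch of splitting a ResourceTiming entry into the time before the request was sent and the time from request to end of response. `ResourceTimes`, `DurationSplit` and `splitDuration` are hypothetical names for this sketch; the attribute names are the standard ResourceTiming ones. Note that the "wait" part is not pure blocking time: it also covers redirects, DNS and connection setup. Also, `requestStart` is reported as 0 for cross-origin resources that don't send Timing-Allow-Origin, in which case no split is possible.

```typescript
// Subset of PerformanceResourceTiming attributes used in this sketch.
interface ResourceTimes {
  startTime: number;
  requestStart: number;
  responseEnd: number;
  duration: number; // equals responseEnd - startTime
}

interface DurationSplit {
  waitBeforeRequest: number;    // blocking, redirects, DNS, connection setup
  requestToResponseEnd: number; // closer to actual serving time
}

// Returns null when detailed timings are unavailable (requestStart === 0).
function splitDuration(entry: ResourceTimes): DurationSplit | null {
  if (entry.requestStart === 0) {
    return null;
  }
  return {
    waitBeforeRequest: entry.requestStart - entry.startTime,
    requestToResponseEnd: entry.responseEnd - entry.requestStart,
  };
}

// Invented numbers: an image queued at 100ms that only gets a connection
// at 400ms and finishes at 550ms.
const example = splitDuration({
  startTime: 100,
  requestStart: 400,
  responseEnd: 550,
  duration: 450,
});
```

With these made-up numbers the duration is 450ms, of which 300ms passed before the request was even sent. For measuring image serving speed something like `requestToResponseEnd` would be the more relevant number, while the full duration remains the right one for perceived MediaViewer speed.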