
Find a robust way of filtering local cache hits out of performance figures
Open, Low, Public

Description

We're seeing very different results when comparing varnish hits and misses for requests > 1s versus requests between 20ms and 1s, so the 20ms limit used to filter out local cache hits may be inadequate. There must be a better way to establish whether the current request hit the local cache.
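For illustration, the kind of fixed-threshold filter described above might look like the sketch below. The `Sample` shape, the function name and the exact comparison (`>=` rather than `>`) are assumptions made for the example, not the code actually used to produce the figures.

```typescript
// Hypothetical shape of one logged measurement; only the duration matters here.
interface Sample {
  url: string;
  durationMs: number;
}

// The 20ms limit from the task description. Anything faster is assumed
// to be a local cache hit and is excluded from the performance figures.
const LOCAL_CACHE_CUTOFF_MS = 20;

// Keep only the samples slow enough to be counted as real network fetches.
function dropSuspectedCacheHits(
  samples: Sample[],
  cutoffMs: number = LOCAL_CACHE_CUTOFF_MS
): Sample[] {
  return samples.filter((s) => s.durationMs >= cutoffMs);
}

const samples: Sample[] = [
  { url: "a.jpg", durationMs: 0 },    // almost certainly a local cache hit
  { url: "b.jpg", durationMs: 19 },   // fast fetch or slow cache read? The cutoff cannot tell.
  { url: "c.jpg", durationMs: 21 },   // same ambiguity on the other side of the limit
  { url: "d.jpg", durationMs: 1400 },
];

const kept = dropSuspectedCacheHits(samples);
console.log(kept.map((s) => s.url));
```

The two samples around the cutoff show the weakness: the filter classifies them differently although nothing in the data says which one actually came from the cache.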

Event Timeline

Gilles created this task. Jan 13 2015, 3:04 PM
Gilles raised the priority of this task from to Needs Triage.
Gilles updated the task description.
Gilles added a subscriber: Gilles.
Tgr added a subscriber: Tgr. Jan 13 2015, 11:36 PM

Small event_total (RT.duration) with old event_timestamp (HTTP.Date)?

This presentation claims to have a way of detecting cache hits (and makes many other interesting claims), but it is very dated.

Tgr added a comment. Edited Jan 13 2015, 11:55 PM
In T86672#974833, @Tgr wrote:

Small event_total (RT.duration) with old event_timestamp (HTTP.Date)?

Which of course relies on an accurate local clock.
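A minimal sketch of that heuristic, to make the clock dependency concrete. The field names `event_total` and `event_timestamp` come from the comment above; their units (milliseconds and Unix seconds), the function name and both thresholds are assumptions chosen for the example.

```typescript
// One logged event. Units are assumed: event_total in milliseconds
// (RT.duration), event_timestamp in Unix seconds (HTTP Date header).
interface TimingEvent {
  event_total: number;
  event_timestamp: number;
}

// Heuristic: a response that arrived very quickly AND carries an old Date
// header was probably served from the local cache. Both thresholds are
// arbitrary illustrative values.
function looksLikeLocalCacheHit(
  ev: TimingEvent,
  clientNowSec: number,
  maxDurationMs: number = 20,
  minAgeSec: number = 60
): boolean {
  // Age of the response as seen by the client; this is where clock skew leaks in.
  const ageSec = clientNowSec - ev.event_timestamp;
  return ev.event_total <= maxDurationMs && ageSec >= minAgeSec;
}

const nowSec = 1421193600; // arbitrary reference time for the example

const freshFast: TimingEvent = { event_total: 5, event_timestamp: nowSec - 1 };
const oldFast: TimingEvent = { event_total: 0, event_timestamp: nowSec - 3600 };
const oldSlow: TimingEvent = { event_total: 800, event_timestamp: nowSec - 3600 };

console.log(looksLikeLocalCacheHit(freshFast, nowSec)); // false: the Date header is recent
console.log(looksLikeLocalCacheHit(oldFast, nowSec));   // true: fast and old
console.log(looksLikeLocalCacheHit(oldSlow, nowSec));   // false: too slow for a cache hit

// A client clock running five minutes ahead makes the fresh response look old.
console.log(looksLikeLocalCacheHit(freshFast, nowSec + 300)); // true: a false positive
```

The last call shows the caveat: with a skewed local clock, a fresh and fast network response is misclassified as a cache hit.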

On Chrome I reliably see duration: 0 for cached requests; maybe it's browser-dependent and UA filtering can be used to get more accurate data?

I remember seeing Firefox on my own machine report values > 0 sometimes, so I'm a little suspicious of assumptions based on 0 values returned by ResourceTiming. The goal here is to catch all cache hits, not just the obvious ones. Even if Chrome works that way, it could change with an update.

I went through the spec and unfortunately there doesn't seem to be a way to tell cache hits and persistent connection hits apart.

Persistent connections became the default in HTTP/1.1, so I wonder if we could make the measured XHR an HTTP/1.0 request on purpose. That way we would avoid it piggybacking on an existing connection (provided that the browser doesn't mix those requests into an existing connection anyway). As far as I can see there's no way to do that, though.

Another technique might be to send a bogus HTTP/1.1 request with a Connection: close header before the real request, but that slows down the measured image load by a round trip. And other requests happening in parallel would mess things up.

Seems like we're stuck with hacks all around for now, with no way to tell whether one hack is more accurate than another.

Tgr added a comment. Jan 14 2015, 9:39 PM

I went through the spec and unfortunately there doesn't seem to be a way to tell cache hits and persistent connection hits apart.

Yes, the only sane way to deal with this would be to extend the spec with a new field telling whether the response was cached or not. The spec is fairly active (they added request sizes just two months ago, which will be very useful for us once it hits the browsers). I thumbed through the W3C mail archives, the ResourceTiming issue tracker and the NavigationTiming issue tracker (since NavigationTiming suffers from the same problem), and there is not much attention on the issue: there is a lone thread about it from half a year ago which was supportive but seems to have been forgotten. So I think we should raise the issue there.

Tgr added a comment. Jan 14 2015, 9:50 PM

Btw while searching I found this long thread which makes the point (amongst other things) that ResourceTiming.duration includes HTTP blocking time. I.e. you fire off ten image requests, the browser has a limit of 8 simultaneous connections to the same domain, so the duration for the last two images will include the time during which they wait for a connection to free up while earlier images are still downloading. This can be a very significant delay for MediaViewer due to T75951. Including this is good when we use the duration to measure MediaViewer speed but bad when we use it to measure image serving speed.
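The effect can be reproduced with a toy model. The sketch below is a simplified simulation, not browser code: it assumes all requests are issued at the same instant, that each download takes a fixed time, and that a waiting request grabs whichever connection frees up first. The function name and the 100 ms download time are made up for the example.

```typescript
// For each request, returns the duration a client-side timer would observe:
// time spent waiting for a free connection plus the download itself.
// Assumes maxConnections >= 1 and that every request is issued at t = 0.
function observedDurations(downloadMs: number[], maxConnections: number): number[] {
  // Time at which each connection slot next becomes free.
  const freeAt: number[] = new Array<number>(maxConnections).fill(0);
  return downloadMs.map((ms) => {
    // Take the slot that frees up first; the request waits until then.
    const start = Math.min(...freeAt);
    const slot = freeAt.indexOf(start);
    freeAt[slot] = start + ms;
    // Measured from t = 0, so the queueing time is included.
    return start + ms;
  });
}

// Ten images, 100 ms of actual download each, at most 8 parallel connections.
const durations = observedDurations(new Array<number>(10).fill(100), 8);

// The first eight finish after 100 ms; the last two report 200 ms
// even though the server was no slower for them.
console.log(durations);
```

In this model the last two images appear twice as slow purely because of queueing, which is the distortion that matters when the duration is used as a proxy for image serving speed.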

Jdforrester-WMF triaged this task as Low priority. Sep 4 2015, 6:56 PM
Restricted Application added a subscriber: Matanya. Sep 4 2015, 6:56 PM
Jdforrester-WMF moved this task from Untriaged to Backlog on the Multimedia board. Sep 4 2015, 7:03 PM

Mass-removing the Multimedia tag from MediaViewer tasks, as this is now being worked on by the Reading department, not Editing's Multimedia team.

Milimetric set Security to None.
simon04 moved this task from Backlog to Tracking/metrics on the MediaViewer board. Jun 10 2019, 7:09 AM
Tgr removed a subscriber: Tgr. Jul 9 2019, 6:03 PM