Collect extra data and oversample views with higher page load times
Open, Low, Public


It would be really sweet to be able to oversample page loads that took longer than some threshold (e.g., more than 10 seconds, 30 seconds, whatever).

We could do some post-processing on these to understand whether they were a result of client side network issues or something more fundamental on our end, and potentially ignore them in the former case.
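As a rough illustration of the oversampling idea, here is a minimal sketch of a sampling decision that uses a higher rate for slow page loads. Everything in it is an assumption made for the example: the names `SamplingConfig`, `sampleRateFor`, `shouldSample` and `sampleWeight`, the rates, and the 10-second threshold. None of it describes existing instrumentation.

```typescript
// Illustrative sketch only: all names, rates, and the threshold are assumptions.
interface SamplingConfig {
  baseRate: number;        // probability of sampling an ordinary page view
  slowRate: number;        // probability of sampling a slow page view
  slowThresholdMs: number; // load time above which a view counts as slow
}

const exampleConfig: SamplingConfig = {
  baseRate: 0.001,
  slowRate: 0.1,
  slowThresholdMs: 10000,
};

// Pick the sampling rate that applies to a page view with this load time.
function sampleRateFor(loadTimeMs: number, config: SamplingConfig): number {
  return loadTimeMs > config.slowThresholdMs ? config.slowRate : config.baseRate;
}

// Decide whether to send a beacon. `rand` is injectable so the decision can be tested.
function shouldSample(
  loadTimeMs: number,
  config: SamplingConfig,
  rand: number = Math.random()
): boolean {
  return rand < sampleRateFor(loadTimeMs, config);
}

// Weight to attach to a sampled view so that aggregates can be corrected
// for the fact that slow views are over-represented.
function sampleWeight(loadTimeMs: number, config: SamplingConfig): number {
  return 1 / sampleRateFor(loadTimeMs, config);
}
```

Recording the weight next to each sample matters here: without it, any percentile computed over the combined data would be skewed toward the slow views that were deliberately oversampled.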

Event Timeline

Imarlier created this task. Dec 4 2017, 2:36 AM
Restricted Application removed a project: Patch-For-Review. Dec 4 2017, 2:36 AM
Peter added a subscriber: Peter. Dec 4 2017, 9:02 AM

I wonder if we could do something smart here with the Resource Timing API? Or what kind of things do we want to catch? Like if the latency looks the same for all resources there's no need to collect the data?
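One possible reading of the "latency looks the same for all resources" check, sketched under assumptions: take the request-to-first-byte time of each resource and see whether the values sit close together. The field names `requestStart` and `responseStart` match `PerformanceResourceTiming`, but the `TimingSample` type, the function names, and the spread ratio of 2 are hypothetical choices for this example, not an agreed heuristic.

```typescript
// Subset of PerformanceResourceTiming fields used here, declared locally so the
// sketch also runs outside a browser.
interface TimingSample {
  name: string;
  requestStart: number;
  responseStart: number;
}

// Request-to-first-byte latency per resource, in milliseconds.
// Entries with requestStart of 0 are skipped (for example cross-origin
// resources served without Timing-Allow-Origin).
function requestLatencies(entries: TimingSample[]): number[] {
  return entries
    .filter((e) => e.requestStart > 0 && e.responseStart >= e.requestStart)
    .map((e) => e.responseStart - e.requestStart);
}

// Hypothetical heuristic: latency is "uniform" when the slowest resource is at
// most maxSpreadRatio times slower than the fastest one.
function latencyLooksUniform(entries: TimingSample[], maxSpreadRatio: number = 2): boolean {
  const values = requestLatencies(entries);
  if (values.length < 2) {
    return false; // not enough data to compare
  }
  const fastest = Math.min.apply(null, values);
  const slowest = Math.max.apply(null, values);
  return fastest > 0 && slowest / fastest <= maxSpreadRatio;
}
```

In a browser the input could come from `performance.getEntriesByType("resource")`. A uniform result on a slow page view would hint at a slow connection rather than at one slow resource, which is the case where collecting the detailed data adds little.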

Nuria added a comment. Dec 4 2017, 4:48 PM

My 2 cents here: I think it would be worth checking where your 90th and 99th percentiles are now (probably per country). If the data jumps around a lot there, that tells us something is going on in the long tail of the time series: either there is not enough data, or there are hidden effects we are not seeing. If those numbers are very stable, it is really not worth spending time troubleshooting what might be complete outliers in the series.
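A minimal sketch of that check, assuming load times are available as plain arrays of milliseconds per country and per day. It uses the nearest-rank definition of a percentile, and the "jumpiness" measure (the range of the daily values relative to their minimum) is just one simple choice. The names `percentile`, `relativeSpread` and `tailPercentilesByCountry` are made up for the example.

```typescript
// Nearest-rank percentile: the smallest value such that at least p percent
// of the samples are less than or equal to it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) {
    throw new Error("percentile of an empty sample is undefined");
  }
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// How much a daily percentile series moves: (max - min) / min.
// A value near 0 means the series is stable.
function relativeSpread(dailyValues: number[]): number {
  const lowest = Math.min.apply(null, dailyValues);
  const highest = Math.max.apply(null, dailyValues);
  return (highest - lowest) / lowest;
}

// p90 and p99 per country, from a country -> load times (ms) lookup.
function tailPercentilesByCountry(
  loadTimesByCountry: Record<string, number[]>
): Record<string, { p90: number; p99: number }> {
  const result: Record<string, { p90: number; p99: number }> = {};
  for (const country of Object.keys(loadTimesByCountry)) {
    const times = loadTimesByCountry[country];
    result[country] = { p90: percentile(times, 90), p99: percentile(times, 99) };
  }
  return result;
}
```

Running `tailPercentilesByCountry` once per day and feeding each country's daily p99 values to `relativeSpread` gives a single number per country to compare; what counts as "jumping a lot" is left open here.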

Imarlier triaged this task as Medium priority. Jan 18 2018, 2:37 PM
Krinkle renamed this task from Oversample based on total load time to Collect extra data and oversample views with higher page load times. May 29 2018, 9:01 PM
Krinkle updated the task description.
Krinkle lowered the priority of this task from Medium to Low. Jun 21 2018, 10:45 AM
kchapman removed Imarlier as the assignee of this task. Jan 25 2019, 2:55 AM
Gilles claimed this task. Apr 12 2019, 11:59 AM
Gilles added subscribers: Gilles, Krinkle.

We plan on ramping up the RUM sampling factor by a lot, which reduces the need for this somewhat. However, we'd still like to send a non-default beacon with extra information that might help us learn why a page view is slow, for example, a compressed breakdown of Resource Timing information.

The ResourceTiming beacon could be something we then only send from a fraction of users and slow page views.
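To make "compressed breakdown" concrete, here is one hypothetical shape it could take: instead of sending every Resource Timing entry, aggregate them per `initiatorType` into a handful of numbers. The fields `initiatorType`, `duration` and `transferSize` exist on `PerformanceResourceTiming`; the `ResourceSample` and `GroupSummary` types, the function name, and the choice of aggregates are assumptions for this sketch, not a proposed schema.

```typescript
// Subset of PerformanceResourceTiming fields used here, declared locally so the
// sketch also runs outside a browser.
interface ResourceSample {
  initiatorType: string;
  duration: number;     // milliseconds
  transferSize: number; // bytes
}

// Hypothetical compact per-type summary for the beacon payload.
interface GroupSummary {
  count: number;
  totalDurationMs: number;
  maxDurationMs: number;
  totalBytes: number;
}

// Collapse a list of resource entries into one small summary per initiator type.
// Durations are rounded to whole milliseconds before summing to keep the payload short.
function summarizeResources(entries: ResourceSample[]): Record<string, GroupSummary> {
  const summary: Record<string, GroupSummary> = {};
  for (const entry of entries) {
    const key = entry.initiatorType || "other";
    if (!summary[key]) {
      summary[key] = { count: 0, totalDurationMs: 0, maxDurationMs: 0, totalBytes: 0 };
    }
    const group = summary[key];
    const durationMs = Math.round(entry.duration);
    group.count += 1;
    group.totalDurationMs += durationMs;
    group.maxDurationMs = Math.max(group.maxDurationMs, durationMs);
    group.totalBytes += entry.transferSize;
  }
  return summary;
}
```

In a browser, the result of `summarizeResources(performance.getEntriesByType("resource"))` could be serialized with `JSON.stringify` and attached to the extra beacon. The payload then grows with the number of resource types rather than the number of resources, at the cost of losing per-URL detail.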