
Collect RUMSpeedIndex from users
Closed, Resolved, Public

Description

Since RUMSpeedIndex can load and run after the critical path, we should treat everything about it as low priority in terms of loading and execution. That is:

  • check if the pageload is in the sample (could be done in the navtiming JS for example)
  • if so, load the speedindex module with low priority
  • attempt to calculate rumspeedindex and report it

If the user navigates away at any point, it's fine. No need to make this resilient.
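The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the NavigationTiming patch: `SchedulerHost`, `isInSample`, `scheduleLowPriority` and `collectRumSpeedIndex` are hypothetical names, and the `compute` and `report` callbacks stand in for the speedindex module and whatever reporting channel ends up being used. It prefers `requestIdleCallback` where the browser offers it and falls back to a zero-delay timeout otherwise. There is deliberately no retry or unload handling, since losing a measurement is acceptable.

```typescript
// Hypothetical sketch; none of these names come from the NavigationTiming extension.

/** Minimal view of the host environment; a browser `window` should satisfy it. */
interface SchedulerHost {
  requestIdleCallback?: (cb: () => void) => unknown;
  setTimeout: (cb: () => void, ms: number) => unknown;
}

/** Returns true for roughly 1 in `samplingFactor` page loads. */
function isInSample(samplingFactor: number, random: () => number = Math.random): boolean {
  if (!isFinite(samplingFactor) || samplingFactor < 1) {
    return false;
  }
  return Math.floor(random() * samplingFactor) === 0;
}

/** Runs `task` when the host is idle, or on a later tick if idle callbacks are unavailable. */
function scheduleLowPriority(host: SchedulerHost, task: () => void): void {
  if (typeof host.requestIdleCallback === 'function') {
    host.requestIdleCallback(task);
  } else {
    host.setTimeout(task, 0);
  }
}

/**
 * Sample check, then low-priority compute and report.
 * `compute` is assumed to return undefined when no value could be calculated.
 */
function collectRumSpeedIndex(
  host: SchedulerHost,
  samplingFactor: number,
  compute: () => number | undefined,
  report: (value: number) => void,
  random: () => number = Math.random
): void {
  if (!isInSample(samplingFactor, random)) {
    return;
  }
  scheduleLowPriority(host, () => {
    const value = compute();
    // Only report plausible values; anything else is silently dropped.
    if (typeof value === 'number' && isFinite(value) && value > 0) {
      report(value);
    }
  });
}
```

In a browser this would presumably be called once per page load with `window` as the host, after the regular navtiming work has been queued.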

Event Timeline

Gilles triaged this task as Medium priority.

Change 392381 had a related patch set uploaded (by Gilles; owner: Gilles):
[mediawiki/extensions/NavigationTiming@master] Collect RUMSpeedIndex with NavigationTiming

https://gerrit.wikimedia.org/r/392381

I've been running both for a while; this is what it looks like:

Barack Obama Firefox

Screen Shot 2017-11-20 at 9.56.36 PM.png (1×2 px, 290 KB)

Barack Obama Chrome

Screen Shot 2017-11-20 at 9.58.08 PM.png (1×2 px, 379 KB)

Sweden Firefox

Screen Shot 2017-11-20 at 10.00.14 PM.png (1×2 px, 294 KB)

Sweden Chrome

Screen Shot 2017-11-20 at 9.59.26 PM.png (1×2 px, 453 KB)

I think we should ask Pat if he thinks it's worth using. I remember he did some major testing when he released it.

One thing I noticed is that the highs we see in SpeedIndex when we run the banner aren't reflected in the RUM SpeedIndex.

Is it possible that in your tests, the rumSpeedIndex is computed before the banner has appeared?

I don't have any argument against this metric, but in light of our quarter goal to focus on a set of key metrics that are well-defined, I think we should better define our intention with this metric before we start collecting it. Including under what circumstances we would remove it (in the future).

Actual SpeedIndex, while somewhat arbitrary, is very well defined. It is based on a single source (video capture) and a single metric (visual completeness). It works by penalising the passing of time as the picture completes.
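To make "penalising the passing of time" concrete: as I understand the WebPageTest definition, SpeedIndex is the area above the visual-completeness curve, i.e. the sum over time of (1 - completeness). A minimal sketch, assuming the video capture has already been reduced to a time-sorted list of frames starting at navigation start (`Frame` and `speedIndex` are illustrative names, not from any library):

```typescript
// Illustrative only: assumes frames are sorted by time and the first frame is at t = 0.
interface Frame {
  timeMs: number;        // time since navigation start, in milliseconds
  completeness: number;  // visual completeness, from 0 (blank) to 1 (final picture)
}

/** Sums (1 - completeness) over time, holding each frame until the next one appears. */
function speedIndex(frames: Frame[]): number {
  let total = 0;
  let prev: Frame | undefined;
  for (const frame of frames) {
    if (prev !== undefined) {
      total += (1 - prev.completeness) * (frame.timeMs - prev.timeMs);
    }
    prev = frame;
  }
  return total;
}

// Two pages that both finish at 2000 ms:
const progressive: Frame[] = [
  { timeMs: 0, completeness: 0 },
  { timeMs: 1000, completeness: 0.5 },
  { timeMs: 2000, completeness: 1 }
];
const allAtOnce: Frame[] = [
  { timeMs: 0, completeness: 0 },
  { timeMs: 2000, completeness: 1 }
];
const progressiveScore = speedIndex(progressive); // 1000 * 1 + 1000 * 0.5 = 1500
const allAtOnceScore = speedIndex(allAtOnce);     // 2000 * 1 = 2000
```

The page that shows half of its content early scores better even though both finish at the same time, which is the property the RUM variant tries to approximate without access to video.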

There is no single data source available on the web to compute something similar. RUM-SpeedIndex instead approximates it based on FirstPaint or (if unsupported) based on download times of CSS, images and other subresources. This is totally understandable because that's the closest information we have on the web related to when the browser should be able to render something (based on the standardised semantics of HTML rendering). The resulting value is, however, much lower than actual first paint or SpeedIndex, as it can't measure parsing, processing and rendering time after the network downloads complete. The library is currently quite experimental, with no large-scale deployment (as far as I know).

If we want to measure the time to complete the download of render-blocking CSS, that could be its own well-defined metric I think. But in its current form, I'm unsure how we'd use this metric. Especially because it doesn't always compute things the same way (various fallbacks and differences between what browsers offer it can use). Open to ideas :)

I think it could be useful as-is, with the understanding that it only works well on a specific set of browsers, because it extracts a potentially useful metric from a large set of very fine-grained metrics we don't collect currently and probably never will (precisely because it's too fine-grained).

And we can't really know how it behaves in the real world until we try it.

I think it would also be fine to not collect it on browsers that use too many fallbacks, to the point that the metric is meaningless on them.

The Resource Timing fallback is useless to us; we should skip collecting this metric if that's all we have access to. We should check for firstPaint, since browsers that don't have it will be less accurate.
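A possible shape for that check, as a hedged sketch: it looks for the three native first-paint sources I'm aware of (the Paint Timing API's `first-paint` entry, `performance.timing.msFirstPaint`, and the deprecated `chrome.loadTimes().firstPaintTime`) and reports `'none'` when only Resource Timing would be left. The `WindowLike` types and `detectFirstPaintSource` are hypothetical names, and the order of checks is an assumption, not necessarily what the RUM-SpeedIndex library or the merged patch does.

```typescript
// Hypothetical sketch; structural types so it can be exercised outside a browser.
interface PaintEntry {
  name: string;
  startTime: number;
}
interface PerformanceLike {
  getEntriesByType?: (type: string) => PaintEntry[];
  timing?: { msFirstPaint?: number };
}
interface WindowLike {
  performance?: PerformanceLike;
  chrome?: { loadTimes?: () => { firstPaintTime?: number } };
}
type FirstPaintSource = 'paint-timing' | 'msFirstPaint' | 'chrome.loadTimes' | 'none';

/** Reports which native first-paint source is available, or 'none'. */
function detectFirstPaintSource(win: WindowLike): FirstPaintSource {
  const perf = win.performance;
  // Paint Timing API: a 'paint' entry named 'first-paint'.
  if (perf && typeof perf.getEntriesByType === 'function') {
    const entries = perf.getEntriesByType('paint');
    if (entries.some((entry) => entry.name === 'first-paint')) {
      return 'paint-timing';
    }
  }
  // Legacy IE/Edge: non-standard msFirstPaint on performance.timing.
  const ms = perf && perf.timing ? perf.timing.msFirstPaint : undefined;
  if (typeof ms === 'number' && ms > 0) {
    return 'msFirstPaint';
  }
  // Older Chrome: deprecated chrome.loadTimes().
  const chrome = win.chrome;
  if (chrome && typeof chrome.loadTimes === 'function') {
    const firstPaintTime = chrome.loadTimes().firstPaintTime;
    if (typeof firstPaintTime === 'number' && firstPaintTime > 0) {
      return 'chrome.loadTimes';
    }
  }
  return 'none';
}

/** Collect only when a native first-paint value exists, never on the Resource Timing fallback. */
function shouldCollectRumSpeedIndex(win: WindowLike): boolean {
  return detectFirstPaintSource(win) !== 'none';
}
```

Recording the detected source alongside the metric would also make it possible to split the data by how the value was derived, if that turns out to matter.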

Change 392381 merged by jenkins-bot:
[mediawiki/extensions/NavigationTiming@master] Collect RUMSpeedIndex with NavigationTiming

https://gerrit.wikimedia.org/r/392381