Page MenuHomePhabricator

Reduce overhead of navtiming beacon
Closed, ResolvedPublic

Description

Timeline analysis:

Screen_Shot1.png (804×2 px, 250 KB)

  • There's a setInterval loop every 10ms throughout the loading process. This happens on all page views because even if not in-sample, there is oversampling and savetiming that run from onLoadCallback.
  • Ironically, during an in-sample page view of a simple page with nothing else on it, the only thing that polling loop is waiting for is our own lazy-load for ext.eventLogging and schema.NavigationTiming (the load event waits for all subresources added before the load event, including async scripts).

I propose two changes:

  • Get rid of the setInterval loop. This was added because in 2015 we didn't consider a simple check based on timing.loadEventStart, or document.readyState.
  • Get rid of the extra EL requests once T187207 settles.

Event Timeline

Change 428551 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/extensions/NavigationTiming@master] ext.navigationTiming: Remove setInterval poll for loadEventEnd

https://gerrit.wikimedia.org/r/428551

Change 428551 merged by jenkins-bot:
[mediawiki/extensions/NavigationTiming@master] ext.navigationTiming: Remove setInterval poll for loadEventEnd

https://gerrit.wikimedia.org/r/428551

Krinkle changed the task status from Open to Stalled.May 4 2018, 3:30 AM
Krinkle removed Krinkle as the assignee of this task.
Krinkle moved this task from Doing (old) to Blocked (old) on the Performance-Team board.

Change 452585 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/extensions/NavigationTiming@master] Remove 'visibilitychange' handler when it is no longer needed

https://gerrit.wikimedia.org/r/452585

Change 452585 merged by jenkins-bot:
[mediawiki/extensions/NavigationTiming@master] Remove 'visibilitychange' handler when it is no longer needed

https://gerrit.wikimedia.org/r/452585

Change 458081 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/extensions/NavigationTiming@master] Use mw.eventLog.inSample() instead of copying it

https://gerrit.wikimedia.org/r/458081

Change 458081 merged by jenkins-bot:
[mediawiki/extensions/NavigationTiming@master] Use mw.eventLog.inSample() instead of copying it

https://gerrit.wikimedia.org/r/458081

Change 475570 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/extensions/NavigationTiming@master] Remove use of deprecated 'schema.*' modules

https://gerrit.wikimedia.org/r/475570

Change 475570 merged by jenkins-bot:
[mediawiki/extensions/NavigationTiming@master] Remove use of deprecated 'schema.*' modules

https://gerrit.wikimedia.org/r/475570

Change 479987 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/extensions/NavigationTiming@master] Use single/lightweight EventLogging module

https://gerrit.wikimedia.org/r/479987

There is still one network fetch remaining that the Navigation Timing client is starting before the load event (and thus delays the load event itself by one roundtrip). That is the lazy-loading of RSI.

There's many ways ways we could solve that. Here's a few ideas I could think of:

  • Separate RSI to a different schema and switch from preloading RSI to post-loading RSI (e.g. wait until the load event, and then lazy-load the module).
  • Move all of navigation timing to after the load event (might lead to bias and loss of data due to only capturing page loads where users stay around long enough to download and render all below-the-flow HTML and have a working network throughout that and beyond for an additional 2-3 roundtrips).
  • Fork rum-speedindex.js, trim it down to a bare minimum for the subset of features and browsers we use it in, and bundle it as secondary file in the main module (not lazy-loaded).
  • Re-evaluate whether we still need it.

Thoughts?

Krinkle changed the task status from Stalled to Open.Dec 16 2018, 7:28 PM

Change 479987 merged by jenkins-bot:
[mediawiki/extensions/NavigationTiming@master] Use single/lightweight EventLogging module

https://gerrit.wikimedia.org/r/479987

RSI has fared ok compared to other metrics when it comes to direct correlation to survey responses: https://phabricator.wikimedia.org/T187299#4797170

With that in mind, considering that we intend to run some deep learning in the future as a follow-up to the machine learning attempt, we need as many features as possible, and RSI compiles a bunch of ResourceTiming metrics that we don't normally capture.

I think it's worth keeping, I would be of the opinion that it should be decoupled and moved to its own schema, just like we've done for the CPU benchmark, for example.

Change 486505 had a related patch set uploaded (by Gilles; owner: Gilles):
[mediawiki/extensions/NavigationTiming@master] Decouple RSI and measure it after onload

https://gerrit.wikimedia.org/r/486505

Change 486505 merged by jenkins-bot:
[mediawiki/extensions/NavigationTiming@master] Decouple RSI and measure it after onload

https://gerrit.wikimedia.org/r/486505

@Krinkle Did you have anything else in mind to do for this task, or is this it?

That's it, I think. Thanks!

Change 556735 had a related patch set uploaded (by Jforrester; owner: Jforrester):
[mediawiki/extensions/ImageMetrics@master] Remove use of deprecated 'schema.*' module

https://gerrit.wikimedia.org/r/556735

Change 556735 abandoned by Jforrester:
Remove use of deprecated 'schema.*' module

https://gerrit.wikimedia.org/r/556735