Navigation Timing cleanup
Closed, Resolved (Public)

Description

We have an upcoming task to evaluate the Navigation Timing extension: either make it more generic and open-source it so that others can use it, or start using Boomerang or something similar.

I think as a first step we can clean up the extension. Some important metrics are missing and should be included, and there is code we can remove:

  • Remove RUM Speed Index T286700
  • Implement Largest Contentful Paint T281022
  • Collect Cumulative Layout Shift following Google's change T281103
  • What should we do with the PerformanceSurvey? Who owns it, who takes care of it, who wants to drive it?

Event Timeline

Change 719110 had a related patch set uploaded (by Phedenskog; author: Phedenskog):

[mediawiki/extensions/NavigationTiming@master] Remove measure top image using the resource timing API.

https://gerrit.wikimedia.org/r/719110

Change 719110 merged by jenkins-bot:

[mediawiki/extensions/NavigationTiming@master] Remove measure top image using the resource timing API.

https://gerrit.wikimedia.org/r/719110

Change 859638 had a related patch set uploaded (by Krinkle; author: Krinkle):

[performance/navtiming@master] Remove transfer_size from navtiming_responsestart_by_cache_host

https://gerrit.wikimedia.org/r/859638

Change 859639 had a related patch set uploaded (by Krinkle; author: Krinkle):

[performance/navtiming@master] Remove ua_version from Prometheus painttiming_seconds

https://gerrit.wikimedia.org/r/859639

Change 859640 had a related patch set uploaded (by Krinkle; author: Krinkle):

[performance/navtiming@master] Remove unused 'group' from navtiming_invalid_events

https://gerrit.wikimedia.org/r/859640

Change 859638 merged by jenkins-bot:

[performance/navtiming@master] Remove transfer_size from navtiming_responsestart_by_cache_host

https://gerrit.wikimedia.org/r/859638

Change 859639 merged by jenkins-bot:

[performance/navtiming@master] Remove ua_version from Prometheus painttiming_seconds

https://gerrit.wikimedia.org/r/859639

Change 859640 merged by jenkins-bot:

[performance/navtiming@master] Remove unused 'group' from navtiming_invalid_events

https://gerrit.wikimedia.org/r/859640

Change 861346 had a related patch set uploaded (by Krinkle; author: Phedenskog):

[mediawiki/extensions/NavigationTiming@master] Remove ElementTiming measurements

https://gerrit.wikimedia.org/r/861346

Change 861346 merged by jenkins-bot:

[mediawiki/extensions/NavigationTiming@master] Remove ElementTiming measurements

https://gerrit.wikimedia.org/r/861346

Change 865097 had a related patch set uploaded (by Krinkle; author: Krinkle):

[operations/mediawiki-config@master] Turn off wgNavigationTimingOversampleFactor campaigns

https://gerrit.wikimedia.org/r/865097

Change 865109 had a related patch set uploaded (by Krinkle; author: Krinkle):

[performance/navtiming@master] Switch remainig use of `frontend.*_discard` to `eventlogging.client_errors`

https://gerrit.wikimedia.org/r/865109

Change 865109 merged by jenkins-bot:

[performance/navtiming@master] Switch remainig use of `frontend.*_discard` to `eventlogging.client_errors`

https://gerrit.wikimedia.org/r/865109

Change 865097 merged by jenkins-bot:

[operations/mediawiki-config@master] Turn off wgNavigationTimingOversampleFactor campaigns

https://gerrit.wikimedia.org/r/865097

Mentioned in SAL (#wikimedia-operations) [2023-01-05T21:33:56Z] <samtar@deploy1002> Started scap: Backport for [[gerrit:865097|Turn off wgNavigationTimingOversampleFactor campaigns (T286703)]]

Mentioned in SAL (#wikimedia-operations) [2023-01-05T21:35:35Z] <samtar@deploy1002> samtar and krinkle: Backport for [[gerrit:865097|Turn off wgNavigationTimingOversampleFactor campaigns (T286703)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-01-05T21:42:42Z] <samtar@deploy1002> Finished scap: Backport for [[gerrit:865097|Turn off wgNavigationTimingOversampleFactor campaigns (T286703)]] (duration: 08m 45s)

Change 876034 had a related patch set uploaded (by Krinkle; author: Krinkle):

[mediawiki/extensions/NavigationTiming@master] Stop collecting transferSize

https://gerrit.wikimedia.org/r/876034

[operations/mediawiki-config@master] Turn off some wgNavigationTimingOversampleFactor campaigns

https://gerrit.wikimedia.org/r/865097

Drop on https://grafana.wikimedia.org/d/000000494/eventlogging-schema-jumbo:

Screenshot 2023-01-06 at 00.38.44.png (840×1 px, 126 KB)

And in context of T326118, I suppose we can look at CPU/Network as well:

https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=webperf1003

Screenshot 2023-01-06 at 00.27.30.png (962×1 px, 87 KB)

And to confirm the regular reporting isn't affected, the navtiming dashboard's report rate is still the same at https://grafana.wikimedia.org/d/000000143/navigation-timing?viewPanel=12&from=now-24h&to=now:

navtiming2: Screenshot 2023-01-06 at 00.45.30.png (780×1 px, 178 KB)
navtiming2_oversample: Screenshot 2023-01-06.png (812×1 px, 159 KB)

Change 876034 merged by jenkins-bot:

[mediawiki/extensions/NavigationTiming@master] Stop collecting Navigation Timing transferSize

https://gerrit.wikimedia.org/r/876034

Change 726852 had a related patch set uploaded (by Krinkle; author: Peter Hedenskog):

[operations/puppet@production] eventlogging: Remove obsoleted navtiming schemas

https://gerrit.wikimedia.org/r/726852

Change 879926 had a related patch set uploaded (by Krinkle; author: Krinkle):

[operations/mediawiki-config@master] Remove former EventLogging streams for navtiming

https://gerrit.wikimedia.org/r/879926

Change 879926 merged by jenkins-bot:

[operations/mediawiki-config@master] Remove former EventLogging streams for navtiming

https://gerrit.wikimedia.org/r/879926

Mentioned in SAL (#wikimedia-operations) [2023-02-01T08:17:44Z] <ladsgroup@deploy1002> Started scap: Backport for [[gerrit:879926|Remove former EventLogging streams for navtiming (T281103 T286703 T308621 T323623)]]

Mentioned in SAL (#wikimedia-operations) [2023-02-01T08:19:32Z] <ladsgroup@deploy1002> ladsgroup and krinkle: Backport for [[gerrit:879926|Remove former EventLogging streams for navtiming (T281103 T286703 T308621 T323623)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-02-01T08:27:27Z] <ladsgroup@deploy1002> Finished scap: Backport for [[gerrit:879926|Remove former EventLogging streams for navtiming (T281103 T286703 T308621 T323623)]] (duration: 09m 42s)

Change 726852 merged by Ottomata:

[operations/puppet@production] eventlogging: Remove obsoleted navtiming schemas

https://gerrit.wikimedia.org/r/726852

Change 887425 had a related patch set uploaded (by Krinkle; author: Krinkle):

[schemas/event/secondary@master] Remove elementtiming,firstinputtiming,layoutshift,resourcetiming,rumspeedindex

https://gerrit.wikimedia.org/r/887425

Change 887425 merged by jenkins-bot:

[schemas/event/secondary@master] Remove elementtiming,firstinputtiming,layoutshift,resourcetiming,rumspeedindex

https://gerrit.wikimedia.org/r/887425

Krinkle claimed this task.

"What should we do with the PerformanceSurvey? Who owns it, who takes care of it, who wants to drive it?"

I forgot to write it down on-task, but sometime in 2022, Peter and I decided to keep the survey running for now and to keep monitoring perceived performance long-term. This lets us learn how perceived experience evolves over time and whether it remains correlated, and keeps the infra ready to go if we want to evaluate new metrics in the future.

I found myself summarizing our learnings again recently, so I wrote them down at
https://wikitech.wikimedia.org/wiki/Performance/Essay/Performance_survey_(2019).

In writing this up, I also re-read Gilles' blog post on the topic, which contains a similar explanation for why we kept it at the time:

"The performance perception micro survey will keep running and will allow us to benchmark future APIs."