Maniphest T205580

Microbenchmark device power and record results in NavigationTiming
Closed, ResolvedPublic

Assigned To

Authored By

	• Gilles
	Sep 26 2018, 8:44 PM

Description

Given that for iOS we can't tell iphone generations apart, it would be nice to have an overall device power/performance score recorded in NavigationTiming, to figure out in the performance survey if there is a correlation between satisfaction and device generation.

Details

Subject	Repo	Branch	Lines +/-
Terminate worker once CPU benchmark is done	mediawiki/extensions/NavigationTiming	master	+1 -0
Add CPU benchmark	mediawiki/extensions/NavigationTiming	wmf/1.32.0-wmf.24	+176 -50
Add CPU benchmark	mediawiki/extensions/NavigationTiming	master	+176 -50

Related Objects

Status	Assigned	Task
Resolved	• Gilles	T165272 Review research on performance perception
Declined	• Gilles	T184510 Ideas for performance perception studies
Resolved	• Gilles	T187299 User-perceived page load performance study
Resolved	• Gilles	T205580 Microbenchmark device power and record results in NavigationTiming

Event Timeline

• Gilles created this task. Sep 26 2018, 8:44 PM

Restricted Application added a subscriber: Aklapper. Sep 26 2018, 8:44 PM

• Gilles triaged this task as Medium priority. Sep 26 2018, 8:44 PM

• Gilles added a parent task: T187299: User-perceived page load performance study.

Doing something very stupid (https://jsfiddle.net/fnr430cw/):

// Time an empty countdown loop using the User Timing API.
performance.mark('foo-start');

var amount = 150000000;
for (var i = amount; i > 0; i--) {}

performance.mark('foo-end');
performance.measure('foo', 'foo-start', 'foo-end');

// The score is the loop duration in milliseconds, rounded.
var measure = performance.getEntriesByName('foo')[0];
document.write(Math.round(measure.duration));

Yields the following scores on real devices on Saucelabs:

iphone 6: 1286
moto e2: 817
moto e: 815
nexus 5x: 351
galaxy s6: 330
ipad air 2: 131
macbook pro (my own machine): 108

Now let's check if this score correlates to the CPU power/generation of these devices.

iphone 6 (2014): 1.4 GHz Apple A8, 2 cores, 20nm
moto e2 (2015): 1.2 GHz, 2 cores, 28nm
moto e (2015): 1.2 GHz, 4 cores, 28nm
nexus 5x (2015): 1.44 - 2.82 GHz, 2 + 4 cores, 20nm
galaxy s6 (2015): 1.5 - 2.1 GHz, 4 + 4 cores, 14nm
ipad air 2 (2014): 1.5 GHz, 3 cores, 20nm
macbook pro (2016): 2.9 GHz Intel Core i7, quad-core, 14nm

Seems fairly logical, with the main surprise being how well the ipad performs.

There's pretty decent support for WebWorkers (looks like the same or a superset of browsers that support NavTiming and Perf.now API). https://caniuse.com/#search=worker

Would be better to run there rather than have uninterruptible main thread tasks (even if under 50ms). And I imagine that running it in a way that is consistently less than e.g. 10ms on slow devices, would make it not as representative? Worth checking though.

Good idea, I'll try that

Web worker version: http://jsfiddle.net/qnepd1r6/
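The fiddle's code isn't reproduced in this thread. As a rough sketch of the idea only, and not the fiddle or the patch that was later merged, the same countdown loop could be moved into a Worker along these lines. The names cpuBenchmark and runBenchmarkInWorker, the inline Blob-based worker, and the use of performance.now() for timing are all assumptions made for illustration.

```javascript
// Hypothetical sketch: not the code from the fiddle or the merged patch.

// The busy loop being timed. In the browser this runs inside the worker.
function cpuBenchmark(iterations) {
  var start = performance.now();
  for (var i = iterations; i > 0; i--) {}
  return Math.round(performance.now() - start);
}

// Build the worker from an inline script so no separate file is needed
// (an assumption; a real deployment might serve a dedicated worker file).
function runBenchmarkInWorker(iterations, onScore) {
  var source = cpuBenchmark.toString() +
    '\nonmessage = function (e) { postMessage(cpuBenchmark(e.data)); };';
  var blob = new Blob([source], { type: 'application/javascript' });
  var worker = new Worker(URL.createObjectURL(blob));
  worker.onmessage = function (e) {
    onScore(e.data);
  };
  worker.postMessage(iterations);
  return worker;
}

// Only start the benchmark in a browser that supports Workers.
if (typeof window !== 'undefined' && typeof Worker !== 'undefined') {
  runBenchmarkInWorker(150000000, function (score) {
    console.log('CPU benchmark score (ms):', score);
  });
}
```

Because the loop runs off the main thread, the page stays responsive while the score is computed, which is the point of the suggestion above.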

Scores:

iphone 6: 5487
ipad air 2: 1118
moto e2: 883
moto e: 869
nexus 5x: 495
galaxy s6: 466
macbook pro (my own machine): 251

With that version, the ipad regains a more logical spot. The iphone 6, on the other hand, becomes absolutely terrible for some reason. Given the huge gap (generation difference notwithstanding) between a real iOS device in person and a real iOS device on Saucelabs, the slow performance could be related to the testbed. To verify this, I asked folks to open the fiddle on their real iphones, with the following results:

iphone 6s: 3769
iphone 5s: 2785
iphone 5s: 1883
iphone SE: 1149 (maybe that person confused it with a 5s? they look the same)
iphone 7: 859
iphone 5s: 648 (same here, possible confusion with an SE?)
iphone SE: 433
iphone 7+ (my own): 432
iphone SE: 399
iphone 8: 365
iphone X: 362
ipad 6: 341
iphone Xs max: 316

Seeing the whole range makes the ranking logical, with some exceptions that I'm guessing could be due to different iOS versions. The iphone 6 still sticks out like a sore thumb. It could indeed be a real thing, with its effective processing power below previous generations.

So far this technique looks like a viable way to infer device generation/power.

Change 463440 had a related patch set uploaded (by Gilles; owner: Gilles):
[mediawiki/extensions/NavigationTiming@master] Add CPU benchmark

https://gerrit.wikimedia.org/r/463440

gerritbot added a project: Patch-For-Review. Sep 28 2018, 9:28 AM

• Gilles moved this task from Inbox, needs triage to Doing (old) on the Performance-Team board. Oct 1 2018, 8:14 PM

Change 464092 had a related patch set uploaded (by Krinkle; owner: Gilles):
[mediawiki/extensions/NavigationTiming@wmf/1.32.0-wmf.24] Add CPU benchmark

https://gerrit.wikimedia.org/r/464092

Change 463440 merged by jenkins-bot:
[mediawiki/extensions/NavigationTiming@master] Add CPU benchmark

https://gerrit.wikimedia.org/r/463440

In T205580#4624575, @gerritbot wrote:

[mediawiki/extensions/NavigationTiming@master] Add CPU benchmark
https://gerrit.wikimedia.org/r/463440

In retrospect, I think the Worker became more than an optimisation. It's avoiding notable freezing of the browser given the bench can take 3-5 seconds. We can now afford this easily and get more representative data, compared to having a much shorter bench on the main thread.

On the other hand, I do want to recognise that it's taken considerable effort to get right. I did recommend it, but I've also had little to no experience with workers in production. It's been interesting to test and debug it in different browsers to see how this works.

There was one thing I left out in CR that I think we should address "soon" if we're going to run it for more than a few days – which is, to make sure the Worker is terminated after the benchmark. I noticed in the past day during unrelated development locally (with this patch still applied) that memory and perf profiles kept showing this extra thread persisting, with a startup thread memory footprint of 7 MB, which isn't huge on its own, but considering that a normal browsing context in Chrome (main thread, en.wikipedia.org large article) is ~ 21 MB, that's a significant addition to the RAM we occupy on sampled page views.

Should be relatively straightforward to do, but I didn't want to block it for this week. Maybe for next week?
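For illustration, terminating the worker as soon as it reports back could look roughly like the sketch below. This is not the follow-up patch itself; runOnceAndTerminate is a hypothetical helper name, and it assumes the worker posts exactly one message (the score) per run.

```javascript
// Hypothetical helper: not taken from the NavigationTiming patch.
// Sends one job to a worker and terminates the worker when it answers,
// so the extra thread and its memory do not linger after the benchmark.
function runOnceAndTerminate(worker, payload, onResult) {
  worker.onmessage = function (e) {
    worker.terminate();
    onResult(e.data);
  };
  // Also clean up if the worker script throws.
  worker.onerror = function () {
    worker.terminate();
  };
  worker.postMessage(payload);
}
```

Worker.terminate() stops the worker immediately, so it is called only after the result message has been received.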

Change 464092 merged by jenkins-bot:
[mediawiki/extensions/NavigationTiming@wmf/1.32.0-wmf.24] Add CPU benchmark

https://gerrit.wikimedia.org/r/464092

Mentioned in SAL (#wikimedia-operations) [2018-10-03T04:45:09Z] <krinkle@deploy1001> Synchronized php-1.32.0-wmf.24/extensions/NavigationTiming: T205580 - I04c52658fbf6d (duration: 01m 03s)

ReleaseTaggerBot added a project: MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)). Oct 3 2018, 5:00 AM

Change 464133 had a related patch set uploaded (by Gilles; owner: Gilles):
[mediawiki/extensions/NavigationTiming@master] Terminate worker once CPU benchmark is done

https://gerrit.wikimedia.org/r/464133

Change 464133 merged by jenkins-bot:
[mediawiki/extensions/NavigationTiming@master] Terminate worker once CPU benchmark is done

https://gerrit.wikimedia.org/r/464133

I've verified that the scores are being collected correctly and the values make sense when compared to device type on Android.
