Create a static version of [[en:Barack Obama]] in mw-config/speedtests with OOjs UI core loaded
Closed, ResolvedPublic

Description

This will allow us to understand the current performance implications of switching it on for all read pages.

Event Timeline

Jdforrester-WMF raised the priority of this task from to High.
Jdforrester-WMF updated the task description.

Change 271144 had a related patch set uploaded (by Jforrester):
Speed trials: Add mobile and desktop versions with OOjs UI core loaded

https://gerrit.wikimedia.org/r/271144

Change 271144 merged by jenkins-bot:
Speed trials: Add mobile and desktop versions with OOjs UI core loaded

https://gerrit.wikimedia.org/r/271144

Hey Peter,

We now have four pages to compare:

Could you give us a steer on the performance implications, using Web Page Speed Testing? Not sure exactly what I'm asking for (cc @ori). :-)

Peter added a subscriber: Peter.Hedenskog.
Peter subscribed.

OK, I did a first check. I tested with 9 runs each, looking only at the first view, and took both the fastest SpeedIndex run and the fastest start render. (Remember that SpeedIndex is a way of calculating when the content within the viewport is ready, where a lower number is better, and start render is the first time something appears on the screen.)

| Page | How | Start render (fastest) | Start render (median) | SpeedIndex (fastest) | SpeedIndex (median) |
| --- | --- | --- | --- | --- | --- |
| now desktop | Chrome Cable | 1.175 s | 2.685 s | 2230 | 2752 |
| OOUI loaded desktop | Chrome Cable | 1.772 s | 2.986 s | 2091 | 3200 |
| now desktop | Firefox Cable | 1.729 s | 4.281 s | 1700 | 4490 |
| OOUI loaded desktop | Firefox Cable | 1.397 s | 2.281 s | 1409 | 2309 |
| now mobile | Chrome (emulated iPhone 6) 2G | 13.985 s | 14.275 s | 14818 | 15157 |
| OOUI loaded mobile | Chrome (emulated iPhone 6) 2G | 12.986 s | 15.474 s | 14338 | 16478 |
| now mobile | Moto G 3G | 4.887 s | 5.172 s | 5562 | 5780 |
| OOUI loaded mobile | Moto G 3G | 4.900 s | 5.176 s | 5595 | 5717 |

All the tests are linked in the page if you want to look at the results yourself. To change which run WebPageTest picks as the median, and which metric it bases that on, add one of these query strings:
?medianRun=fastest&medianMetric=render
?medianRun=fastest&medianMetric=SpeedIndex
?medianRun=median&medianMetric=render
?medianRun=median&medianMetric=SpeedIndex
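The four query strings above can also be generated instead of typed by hand. This is only a minimal sketch: the helper name `with_median_selection` and the placeholder result URL are made up for illustration, and only the `medianRun` and `medianMetric` parameters come from the list above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_median_selection(result_url, run="median", metric="SpeedIndex"):
    """Return result_url with medianRun/medianMetric set (hypothetical helper)."""
    parts = urlsplit(result_url)
    # Keep any existing query parameters, overriding only the two we care about.
    query = dict(parse_qsl(parts.query))
    query["medianRun"] = run
    query["medianMetric"] = metric
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL, not a real test result.
base = "https://wpt.example.org/result/EXAMPLE_ID/"
for run in ("fastest", "median"):
    for metric in ("render", "SpeedIndex"):
        print(with_median_selection(base, run, metric))
```

Each printed URL ends in one of the four query strings listed above.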

Let me do a summary tomorrow.

Could you add an 'X-Wikimedia-Debug: 1' header to the requests? That way, we'll know that any difference is not due to something silly like the OOjs UI variant not being in Varnish. Alternatively, a cache-busting query param should do the trick too.
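For reference, the two options could be prepared roughly like this. It is a sketch under assumptions: the parameter name `cachebust`, the helper `cache_busted`, and the example URL are invented for illustration; only the header name and value come from the request above. Nothing is sent over the network here.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Option 1: the header named in the comment above, to be sent with each request.
DEBUG_HEADERS = {"X-Wikimedia-Debug": "1"}


def cache_busted(url, token=None):
    """Option 2: append a throwaway query parameter so each URL is unique.

    The parameter name 'cachebust' is an arbitrary choice for illustration.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query)
    if token is None:
        token = str(int(time.time()))  # any value that changes between tests works
    query.append(("cachebust", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL, not one of the real speed-test pages.
print(cache_busted("https://example.org/speedtests/page.html", "42"))
print(DEBUG_HEADERS)
```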

Yep, let me do that. I'll run on a private instance today so we can make more than 9 runs and see what happens with the median numbers.

Status update: I've been testing the WPT Bulk Tester all day, but when I check the metrics for mobile they don't seem realistic for 2G connections (too fast). I'm now testing on a European instance to see what happens. I'll share the script and results in a couple of hours.

Finally I got it running the way I want. I'll write some docs on Wikitech later today about what I needed to do to get it working.

This is how I ran the test: private instance (Ireland), 31 runs for each test, taking the median run by SpeedIndex. I tested desktop original and OOUI in Chrome and Firefox throttled as cable, and mobile original and OOUI using Chrome emulating mobile on 3G and 2G. All requests have the 'X-Wikimedia-Debug: 1' header.

| Test | Raw data | Start Render (ms) | Visually complete (ms) | Load Time (ms) | SpeedIndex |
| --- | --- | --- | --- | --- | --- |
| Original desktop Chrome | http://wpt.wmftest.org/result/160218_PY_9A/ | 5474 | 9100 | 6877 | 5543 |
| OOUI desktop Chrome | http://wpt.wmftest.org/result/160218_NS_9B/ | 5182 | 9000 | 6611 | 5243 |
| Original desktop Firefox | http://wpt.wmftest.org/result/160218_PH_9C/ | 6436 | 9900 | 8022 | 6435 |
| OOUI desktop Firefox | http://wpt.wmftest.org/result/160218_PH_9D/ | 6448 | 9800 | 9775 | 6434 |
| Original mobile 3G | http://wpt.wmftest.org/result/160218_HW_9E/ | 4987 | 9500 | 9049 | 5215 |
| OOUI mobile 3G | http://wpt.wmftest.org/result/160218_KT_9F/ | 4978 | 10400 | 9556 | 5195 |
| Original mobile 2G | http://wpt.wmftest.org/result/160218_RS_9G/ | 12681 | 18300 | 22914 | 12928 |
| OOUI mobile 2G | http://wpt.wmftest.org/result/160218_6G_9H/ | 12780 | 19100 | 21859 | 13034 |

@ori & @Krinkle have I missed anything?

OK, I did a re-run without the debug header, again taking the median of 31 runs by SpeedIndex.

| Test | Raw data | Start Render (ms) | Visually complete (ms) | Load Time (ms) | SpeedIndex |
| --- | --- | --- | --- | --- | --- |
| Original desktop Chrome | http://wpt.wmftest.org/result/160218_16_BR/ | 3597 | 7900 | 7306 | 3768 |
| OOUI desktop Chrome | http://wpt.wmftest.org/result/160218_RX_BS/ | 3079 | 7600 | 6101 | 3292 |
| Original desktop Firefox | http://wpt.wmftest.org/result/160218_HY_BT/ | 6420 | 9500 | 9321 | 6431 |
| OOUI desktop Firefox | http://wpt.wmftest.org/result/160218_DZ_BV/ | 6485 | 9100 | 6991 | 6528 |
| Original mobile 3G | http://wpt.wmftest.org/result/160218_M4_BW/ | 4291 | 8700 | 7857 | 4552 |
| OOUI mobile 3G | http://wpt.wmftest.org/result/160218_R3_BX/ | 4290 | 8500 | 9818 | 4506 |
| Original mobile 2G | http://wpt.wmftest.org/result/160218_Y5_BY/ | 11997 | 24300 | 22186 | 12536 |
| OOUI mobile 2G | http://wpt.wmftest.org/result/160218_91_BZ/ | 12193 | 21200 | 26860 | 12552 |

So…

| Test | Raw data | Start Render (ms) | Visually complete (ms) | Load Time (ms) | SpeedIndex |
| --- | --- | --- | --- | --- | --- |
| Original desktop Chrome | http://wpt.wmftest.org/result/160218_16_BR/ | 3597 | 7900 | 7306 | 3768 |
| OOUI desktop Chrome | http://wpt.wmftest.org/result/160218_RX_BS/ | 3079 (-14.4%) | 7600 (-3.8%) | 6101 (-16.49%) | 3292 (-12.63%) |
| Original desktop Firefox | http://wpt.wmftest.org/result/160218_HY_BT/ | 6420 | 9500 | 9321 | 6431 |
| OOUI desktop Firefox | http://wpt.wmftest.org/result/160218_DZ_BV/ | 6485 (+1.01%) | 9100 (-4.21%) | 6991 (-25.00%) | 6528 (-6.62%) |
| Original mobile 3G | http://wpt.wmftest.org/result/160218_M4_BW/ | 4291 | 8700 | 7857 | 4552 |
| OOUI mobile 3G | http://wpt.wmftest.org/result/160218_R3_BX/ | 4290 (-0.02%) | 8500 (-2.3%) | 9818 (+24.96%) | 4506 (-1.01%) |
| Original mobile 2G | http://wpt.wmftest.org/result/160218_Y5_BY/ | 11997 | 24300 | 22186 | 12536 |
| OOUI mobile 2G | http://wpt.wmftest.org/result/160218_91_BZ/ | 12193 (+1.63%) | 21200 (-12.76%) | 26860 (+21.07%) | 12552 (+0.12%) |
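The percentages are the relative change of the OOUI value against the original one. As a quick check, here is a small sketch that recomputes the desktop Chrome row from the values in the table above; the function and dictionary key names are my own.

```python
def percent_change(original, variant):
    """Relative change of variant versus original, in percent."""
    return (variant - original) / original * 100.0


# Desktop Chrome values from the table above (ms, except SpeedIndex).
original = {"start_render": 3597, "visually_complete": 7900,
            "load_time": 7306, "speed_index": 3768}
ooui = {"start_render": 3079, "visually_complete": 7600,
        "load_time": 6101, "speed_index": 3292}

for metric in original:
    print(f"{metric}: {percent_change(original[metric], ooui[metric]):+.2f}%")
# start_render: -14.40%
# visually_complete: -3.80%
# load_time: -16.49%
# speed_index: -12.63%
```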

These numbers are very surprising to me.

On desktop, Load time is significantly down, despite the number of bytes shipped being higher; Start render is improved but only on Chrome, and Visually complete and SpeedIndex are both improved too.

On mobile, Load time is significantly up (far more than I expected), but Start render and SpeedIndex are essentially flat, and Visually complete is much better(?) on 2G.

I'm lost. :-)

Taking the best run rather than the median run would be better, because the best run is the one that is least influenced by environmental noise.

You can take the fastest by adding ?medianRun=fastest&medianMetric=SpeedIndex to the run. We picked the median run by SpeedIndex, so the other metrics from that run are not medians, and computing the difference in % for them isn't right. When we take the median for SpeedIndex, that's the only metric we should compare. Sorry for including the rest.
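To illustrate why the other columns can mislead, here is a sketch with invented numbers (not taken from any of the tests in this task). It picks the median run by SpeedIndex and compares that run's load time with the actual median load time across all runs.

```python
from statistics import median

# Hypothetical runs (ms, except SpeedIndex); invented purely for illustration.
runs = [
    {"run": 1, "speed_index": 3300, "load_time": 7400},
    {"run": 2, "speed_index": 3100, "load_time": 6900},
    {"run": 3, "speed_index": 3500, "load_time": 6000},
    {"run": 4, "speed_index": 3200, "load_time": 7100},
    {"run": 5, "speed_index": 3400, "load_time": 6500},
]


def median_run(runs, metric):
    """Return the run whose value for `metric` is the median (odd run counts)."""
    ordered = sorted(runs, key=lambda r: r[metric])
    return ordered[len(ordered) // 2]


chosen = median_run(runs, "speed_index")
true_median_load = median(r["load_time"] for r in runs)

print("median run by SpeedIndex:", chosen["run"])
print("load time in that run:", chosen["load_time"])
print("median load time over all runs:", true_median_load)
```

In this made-up data the run with the median SpeedIndex happens to have the slowest load time of all five runs, so comparing load times taken from such runs says little about the typical load time.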

Let's make a list of the fastest start render and SpeedIndex. However, I don't think using the fastest is right, because some metrics are reported wrong. Check out:
http://wpt.wmftest.org/result/160218_RX_BS/15/details/

Start render is reported at 0.125 s, but checking the screenshots for the run, it really happens at 3.7 s.

Did a new run, since it's so easy using the WPT Bulk Tester: 51 runs on each URL, taking the median start render time. Making so many runs _should_ filter out the noise, right?
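A small sketch of the reasoning, using invented start render times: only the 0.125 s entry mirrors the misreported value mentioned above, the rest are made up. The fastest run picks up the bogus reading directly, while the median ignores it.

```python
from statistics import median

# Hypothetical start render times in seconds. The 0.125 entry mimics the
# misreported run mentioned above; the other values are invented.
start_render = [3.7, 3.9, 3.6, 0.125, 3.8, 4.1, 3.7]

fastest = min(start_render)
typical = median(start_render)

print(f"fastest: {fastest} s")  # fastest: 0.125 s
print(f"median: {typical} s")   # median: 3.7 s
```

The median only helps while the bogus readings are a minority of the runs; if they happen often enough, it gets pulled down too.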

One other thing that looks wrong in some tests is that the first byte is extremely fast, hitting 0.051s sometimes.

51 runs focusing on median SpeedIndex looks better

Let's test it even more.

I've created an issue for WebPageTest https://github.com/WPO-Foundation/webpagetest/issues/566

It seems to happen only for Chrome & cable, but it happens so often that our values get screwed up :(

Ok, got it confirmed that it's not working correctly in Chrome/WPT:

"The issue is coming from Chrome's dev tools reporting of the timings and it looks like it is missing the start of the request (including socket and DNS) for some reason. The real fix is for me to move away from using dev tools for the timings for HTTPS requests and I'm really close to having that working (though then it will be like Firefox and SPDY will be an issue)."

https://github.com/WPO-Foundation/webpagetest/issues/566

Nothing more to do here, right?