Page MenuHomePhabricator

Investigate Performance wins for Firefox Facebook page
Closed, ResolvedPublic

Description

We got a nice change the 24th for the Facebook page we test with Firefox:

All metrics are better for that page:

But nothing for the other two pages we test. We should investigate to see what made the decrease in metrics.

Event Timeline

Peter created this task.Mar 27 2017, 9:41 AM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptMar 27 2017, 9:41 AM
Krinkle claimed this task.Apr 5 2017, 7:29 PM
Krinkle triaged this task as Medium priority.
Krinkle moved this task from Inbox to Next In This Quarter on the Performance-Team board.
Krinkle added a subscriber: Krinkle.
Krinkle added a comment.EditedApr 7 2017, 3:14 AM

https://grafana.wikimedia.org/dashboard/db/webpagetest?panelId=20&fullscreen&orgId=1&from=1489536000000&to=1491163200000


https://grafana.wikimedia.org/dashboard/db/webpagetest?panelId=20&fullscreen&orgId=1&from=1490180400000&to=1490486400000

Seems the change happened between 2017-03-23 20:45 and 2017-03-24 03:00. Here a some of the deployments logged between Thursday 2017-03-23 12:00 and Friday 2017-03-24 03:00:

  • 2017-03-23 19:11 <thcipriani@tin> rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.17
  • 2017-03-23 20:16 <thcipriani@tin> rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.17
  • 2017-03-23 23:12 <thcipriani@tin> Synchronized php-1.29.0-wmf.17/extensions/ORES: SWAT: [[gerrit:344555|Stats: Invert "false" thresholds so they are correct]]
  • 01:15 <catrope@tin> Started scap: Wikidata cherry-picks
  • 01:40 <catrope@tin> Finished scap: Wikidata cherry-picks (with i18n)
  • 02:15 <MaxSem> < Krinkle > (fixing outage)
  • 02:25 <Krinkle> All apaches are back up

Seems like the obvious trigger is the roll out of this weeks' MediaWiki changes (1.29.0-wmf.17). Looking through the commits in MediaWiki core (not looking at the extensions yet), the following stand out in terms of frontend behaviour that may have changed for plain page views.

  • ac817da014 - mediawiki.widgets: Remove autoEllipsis() from TitleOptionWidget (2 weeks ago) <Timo Tijhof>
  • f4329ddbc8 - resourceloader: Use SVG url when ResourceLoaderImageModule can't embed (3 weeks ago) <WMDE-Fisch>
  • 0638e8568f - Make infused and non-infused DateInputWidget consistent (4 weeks ago) <Geoffrey Mon>
  • 692bf795a9 - Skin: Only load jquery.makeCollapsible if needed (4 weeks ago) <Timo Tijhof>
  • 2fccd229d2 - Skin: Preload jquery.tablesorter based on rough heuristics (4 weeks ago) <Timo Tijhof>

Next step: Compare waterfall collected by WebPageTest from runs before/after the change.

PS: Navigation Timing from around the same time does not seem to have significantly changed so it might be specific to Firefox or certain aspects of this run (Facebook page, English Wikipedia, cable speed, geolocation). (See https://grafana.wikimedia.org/dashboard/db/navigation-timing?from=1489795200000&to=1490572800000, switch between the various metrics).

Krinkle added a comment.EditedApr 7 2017, 9:41 PM

Runs before the change alternate between SpeedIndex 2500 and 2800 on the graph. A few samples:

(ff-2017-03-23 17.48) 170323_X4_KB/4(ff-2017-03-23 18.48) 170323_RG_MJ/3(ff-2017-03-23 19.48) 170323_GR_NV/2
First Byte: 0.144sFirst Byte: 0.148sFirst Byte: 0.146s
Start Render: 2.595sStart Render: 2.599sStart Render: 2.400s
SpeedIndex: 2717SpeedIndex: 2721SpeedIndex: 2530
Load Time: 3.205sLoad Time: 3.194sLoad Time: 3.042s
Visually Complete: 4.858sVisually Complete: 4.837sVisually Complete: 4.731s
DOM Nodes: 11406DOM Nodes: 11406DOM Nodes: 11406
Requests (domComplete): 64Requests (domComplete): 64Requests (domComplete): 64
Bytes In (domComplete): 913 KBBytes In (domComplete): 914 KBBytes In (domComplete): 913 KB
Requests (fullyLoaded): 72Requests (fullyLoaded): 72Requests (fullyLoaded): 72
Bytes In (fullyLoaded): 1,030 KBBytes In (fullyLoaded): 1,031 KBBytes In (fullyLoaded): 1,030 KB

Runs after the change alternate between SpeedIndex 2200 and 2500 on the graph. A few samples:

(ff-2017-03-24 00.48) 170324_TV_F/2(ff-2017-03-23 23.48) 170323_3G_TD/5(ff-2017-03-23 22.48) 170323_C6_S8/2Difference
First Byte: 0.146sFirst Byte: 0.146sFirst Byte: 0.146s± 0.002
Start Render: 2.335sStart Render: 2.197sStart Render: 2.195s -0.2s
SpeedIndex: 2326SpeedIndex: 2328SpeedIndex: 2454 -300
Load Time: 2.996sLoad Time: 2.946sLoad Time: 3.006s± 0.1
Visually Complete: 4.381sVisually Complete: 4.341sVisually Complete: 4.795s -0.4s
DOM Nodes: 11509DOM Nodes: 11509DOM Nodes: 11509+103
Requests (domComplete): 64Requests (domComplete): 64Requests (domComplete): 64± 0 req
Bytes In (domComplete): 918 KBBytes In (domComplete): 918 KBBytes In (domComplete): 918 KB+5KB
Requests (fullyLoaded): 71Requests (fullyLoaded): 71Requests (fullyLoaded): 71 -1 req
Bytes In (fullyLoaded): 1,028 KBBytes In (fullyLoaded): 1,028 KBBytes In (fullyLoaded): 1,028 KB-2KB
Requests beforeRequests afterDifference
/wiki (1)/wiki (1)± 0
/w/load.php (9)/w/load.php (8) -1 request
/static/ (4)/static (4)± 0
/w/extensions/...png (1)/w/extensions/...png (1)± 0
upload.wikimedia.org (55)upload.wikimedia.org (55)± 0
meta.wikimedia.org (1)meta.wikimedia.org (1)± 0
typeload.php urlDifference
<link rel=stylesheetload.php?modules=[page styles: cite,tmh,uls,visualeditor,..]
<script asyncload.php?modules=startup
<link rel=stylesheetload.php?modules=site.styles
script fetchload.php?modules=jquery+mediawiki
script fetchload.php?modules=[lazy load: legacy.wikibits]
script fetchload.php?modules=[page scripts: tmh,uls,centralnotice,..]+tablesorter (now loads here)
script fetchload.php?modules=[page scripts (cache group): jquery.ui]
script fetchload.php?modules=[lazy load: TimedMediaHandler dependencies]
script fetchload.php?modules=[lazy load: tablesorter + mw.language] removed

The removed request was done in rMW2fccd229d2e8: Skin: Preload jquery.tablesorter based on rough heuristics (part of T159911: Audit modules (2017)). I tried hard to find some other difference but this seems to be the only thing that stands out.

The earlier loading of tablesorter explains:

  • Visually Complete being reached earlier since the loading of tablesorter adds little arrow icons inside some of the tables on the page. This process now starts earlier instead being blocked on a lazily discovered http request.
  • "Load Time" (domComplete/window.load) happening earlier because the lazy load for "tablesorter" used to happen inside a document ready handler (domInteractive). And any requests that start before domContentLoadedEventEnd will count as subresources and thus hold back the domComplete metric. For the same reason, the background request for login.wikimedia.org (as example) is intentionally delayed beyond document ready so that it doesn't cause the page load indicator in native browsers to be prolonged.
  • "Bytes In (domComplete)" being slightly larger (+4KB) as it now includes tablesorter.
  • "Bytes In (fullyLoaded)" being slightly smaller (-2KB, which includes the same 4KB still). Presumably because we no longer have the added overhead of another module request. It might also be that the code compresses slightly better now.

However none of this explains why it affected earlier metrics like "Start Render". Browsers don't have to wait for document ready (domInteractive) or window load (domComplete) to start painting something.

On the other side, the request that was removed used to start around 1.9s, and start render at 2.3s. So it certainly is "in time" to have been used as a heuristic of some sort in Firefox internally. While Firefox definitely doesn't wait for domComplete to render, it might be that it more-or-less arbitrarily decides to wait a small amount of time for certain requests to complete in case they significantly change the page (to avoid FOUC). It looks like it was doing that for our tablesorter request which Firefox may've seen as "important" for some reason.

Krinkle moved this task from Not yet to Write about next on the Performance-Team-publish board.
Krinkle closed this task as Resolved.Apr 12 2017, 8:50 PM