Page MenuHomePhabricator

Fivefold increase in render-blocking CSS size for logged-in users due to Echo loading OOUI on all page views
Closed, ResolvedPublic

Description

I noticed page loads felt a bit sluggish after the 1.26wmf22 deploy so I poked around. The CSS in <head> seemed a lot larger than I remembered it being. Unfortunately, the two sources we have for tracking the size of static asset payload size over time only measure anonymous page-views. So I rolled back Echo to the 1.26wmf21 branch on testwiki and compared the size of CSS in <head> to its size when Echo is on 1.26wmf22. I found a fivefold increase in the size of uncompressed CSS.

1.26wmf211.26wmf22Percent increase
Uncompressed size73 kB378 kB418%
Compressed size14 kB43 kB231%

(Both measurements are relevant, because the performance impact of CSS is not just a function of the time it takes to retrieve it, but also of the time it takes to parse it and apply it to the DOM.)

This has had a substantial impact on page load times for logged-in users:

download.png (308×586 px, 26 KB)

I think the right thing to do is to revert and spend a bit of extra time working out the performance footprint before rolling this out again.

Echo is a wonderful extension which editors use and love, and it is very good to see it get a second-wind of development activity. Please do not feel discouraged or blamed.

The performance team is going to do a postmortem of this deployment to figure out how to ensure our monitoring and testing infrastructure flags such regressions in the future. Let's work together to make this a learning experience for us as an engineering community.

Related Objects

Mentioned In
T309793: Unexpected OOUI payload on page views (+70KB JS transfer size since 2022-04-14)
T118089: Test if there's a better way to catch size changes for logged in users in WPT
T118088: Half a megabyte of JavaScript / CSS loading on view pages
T109666: Set up WebPageTest for synthetic testing
T113677: Make OOjs UI in MediaWiki more lightweight, so that we can load it by default on every page without issue
rECHOc5905962abb2: Only load ext.echo.ui if the user clicks the echo badge
rMEXT0ffc7f33c71a: Updated mediawiki/extensions Project: mediawiki/extensions/Echo…
T112680: UrlShortener should not load all of oojs-ui on all pages
rMWb45a8922c87c: Revert Echo to 1.26wmf21 version
rMW3981c0a9a17f: Updated mediawiki/core Project: mediawiki/extensions/Echo…
rECHOf05e020c4838: Hack around OOUI's icon pack being too large by creating our own
rECHO5b1fc2b818be: Hack around OOUI's icon pack being too large by creating our own
rMEXTc4e73c74c768: Updated mediawiki/extensions Project: mediawiki/extensions/Echo…
T112557: Separate NavigationTiming data by deployment group
rMWfc81c2d42a5e: Updated mediawiki/core Project: mediawiki/extensions/Echo…
rECHO19d8c4adca02: Don't load oojs-ui.styles on every page
rECHOb046a1062d02: Don't load oojs-ui.styles on every page
rMEXTe441db5c7514: Updated mediawiki/extensions Project: mediawiki/extensions/Echo…
T112407: nojs/mw.echo.special.less styles should only be loaded on Special:Notifications
T112404: asset-check should measure asset size of logged-in page views
Mentioned Here
T113681: Split oojs-ui module into smaller ones
T125292: Optimize how we load OOUI in MediaWiki
T118088: Half a megabyte of JavaScript / CSS loading on view pages
rEVEDb550323b534b: Use mw.loader.using instead of weird hacks to load OOjs UI on action=edit
rECHO3940f523baf7: Only load nojs Special:Notifications styles on the special page
rECHOb046a1062d02: Don't load oojs-ui.styles on every page
rECHO5b1fc2b818be: Hack around OOUI's icon pack being too large by creating our own
rECHOf2ed7a6a5353: Don't load unused Echo JS/CSS on mobile
rECHO9e75474b5ff3: Split out .mw-echo-alert styles back into a separate module
rECHOc5905962abb2: Only load ext.echo.ui if the user clicks the echo badge
T109666: Set up WebPageTest for synthetic testing
T112404: asset-check should measure asset size of logged-in page views
T112347: Split OOjs UI's distribution .css files into code required by PHP version, and required by JS only

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

I don't think this has much to do with Echo, except that we're now loading OOjs UI on every page view, which was a hit we were going to have to take at some point now that we're migrating more and more things over to it. T112347: Split OOjs UI's distribution .css files into code required by PHP version, and required by JS only is related as well.

Also, where is the graphite graph from? What are the relevant dashboards we're supposed to be looking at?

I don't think this has much to do with Echo, except that we're now loading OOjs UI on every page view, which was a hit we were going to have to take at some point now that we're migrating more and more things over to it. T112347: Split OOjs UI's distribution .css files into code required by PHP version, and required by JS only is related as well.

I understand that OOjs UI will be a dependency for other components that load by default, and not just Echo, but for the moment it is Echo that is causing this code to load, and it is Echo that would need to be rolled back to undo the performance regression. At any rate, I don't see the point in quibbling, except inasmuch as it helps pinpoint the problem.

(Also, while the bulk of the increase in CSS size is due to OOjs UI, there are fixes to be made in Echo proper too -- for example, I don't think it makes sense to /* embed */ Preferences.png and Help.png since neither are visible on page load.)

Also, where is the graphite graph from?

It's frontend.navtiming.totalPageLoadTime.desktop.authenticated.median.

A few of the relevant metrics are plotted in https://grafana.wikimedia.org/#/dashboard/db/navigation-timing (set '$Users' to 'authenticated'), but none of the views indicate the issue clearly, which is a problem.

What are the relevant dashboards we're supposed to be looking at?

https://grafana.wikimedia.org/#/dashboard/db/asset-check ought to have indicated this problem, but it didn't, because it doesn't track the static asset size of logged-in page views. I filed T112404 for that just now.

(T109666: Set up WebPageTest for synthetic testing will also help with this. @Krinkle and @Peter are working on having Jenkins trigger WebPageTest runs which will collect and report metrics to Graphite.)

download.png (308×586 px, 26 KB)

Also, where is the graphite graph from?

It's frontend.navtiming.totalPageLoadTime.desktop.authenticated.median.

A few of the relevant metrics are plotted in https://grafana.wikimedia.org/#/dashboard/db/navigation-timing (set '$Users' to 'authenticated'), but none of the views indicate the issue clearly, which is a problem.

The graph @ori posted here seems to use movingMedian (the raw graph is much more fluctuant). I've added a moving median to the Grafana dashboard (using 250 data points for now), and I've disabled the sticky "0" bottom value (so that the full height is used to amplify the difference over time).

It's pretty similar now:
https://grafana.wikimedia.org/#/dashboard/db/navigation-timing?from=now-14d&var-Users=authenticated

Screen Shot 2015-09-13 at 00.16.58.png (830×2 px, 168 KB)

Previously:

Screen Shot 2015-09-13 at 00.20.49.png (826×2 px, 167 KB)

Are there any objections to a temporary revert?

ori added a subscriber: Catrope.

@Catrope, could you please discuss this during your daily stand-up and work out a plan? My hunch is that this is too big to fix in-flight and that it would be better to pull this back and iterate on it before shooting for redeployment. What do you guys think?

@Catrope, could you please discuss this during your daily stand-up and work out a plan? My hunch is that this is too big to fix in-flight and that it would be better to pull this back and iterate on it before shooting for redeployment. What do you guys think?

I will look into this immediately.

If this really does come from OOUI CSS, that should be easy to fix, because I'm pretty sure we don't need that in a render-blocking place at all.

Change 238115 had a related patch set uploaded (by Catrope):
Don't load oojs-ui.styles on every page

https://gerrit.wikimedia.org/r/238115

Change 238115 merged by jenkins-bot:
Don't load oojs-ui.styles on every page

https://gerrit.wikimedia.org/r/238115

Change 238160 had a related patch set uploaded (by Catrope):
Don't load oojs-ui.styles on every page

https://gerrit.wikimedia.org/r/238160

Change 238160 merged by jenkins-bot:
Don't load oojs-ui.styles on every page

https://gerrit.wikimedia.org/r/238160

Patch was deployed at 16:12 UTC

Patch was deployed at 16:12 UTC

Much better, but there is still a significant increase in asset size. Compare the following two WebPageTest runs for https://test.wikipedia.org/wiki/Main_Page:

Test IDEcho branchBytes in (first view)Bytes in (repeat view)
150914_3A_10KD1.26wmf21540 KB291 KB
150914_X2_107V1.26wmf22645 KB343 KB

Change 238239 had a related patch set uploaded (by Catrope):
Hack around OOUI's icon pack being too large by creating our own

https://gerrit.wikimedia.org/r/238239

Change 238308 had a related patch set uploaded (by Catrope):
Don't make oojs-ui depend on oojs-ui.styles.icons

https://gerrit.wikimedia.org/r/238308

Change 238239 merged by jenkins-bot:
Hack around OOUI's icon pack being too large by creating our own

https://gerrit.wikimedia.org/r/238239

Change 238310 had a related patch set uploaded (by Legoktm):
Hack around OOUI's icon pack being too large by creating our own

https://gerrit.wikimedia.org/r/238310

Change 238310 merged by jenkins-bot:
Hack around OOUI's icon pack being too large by creating our own

https://gerrit.wikimedia.org/r/238310

There is still a 140 kB (compressed) difference in the size of static assets served to logged-in and non-logged-in users, which is not cool. mw.loader.inspect() on enwiki shows oojs-ui.* as being one and a half times larger than all other JavaScript modules combined.

Please take a step back, take a deep breath, and revert this until these issues can be fixed comprehensively and thoughtfully.

Change 238371 had a related patch set uploaded (by Legoktm):
Revert Echo to 1.26wmf21 version

https://gerrit.wikimedia.org/r/238371

Change 238371 merged by jenkins-bot:
Revert Echo to 1.26wmf21 version

https://gerrit.wikimedia.org/r/238371

Legoktm renamed this task from Fivefold increase in render-blocking CSS size for logged-in users to Fivefold increase in render-blocking CSS size for logged-in users due to Echo loading OOUI on all page views.Sep 15 2015, 5:43 PM
Legoktm set Security to None.

Change 238389 had a related patch set uploaded (by Legoktm):
Only load ext.echo.ui if the user clicks the echo badge

https://gerrit.wikimedia.org/r/238389

Change 238389 had a related patch set uploaded (by Mooeypoo):
Only load ext.echo.ui if the user clicks the echo badge

https://gerrit.wikimedia.org/r/238389

Change 238389 merged by jenkins-bot:
Only load ext.echo.ui if the user clicks the echo badge

https://gerrit.wikimedia.org/r/238389

Recap of fixes:

On every page we now load (→ indicates dependencies):

  • ext.echo.initext.echo.loggeroojsschema.EchoInteraction - sets up click handler to load the rest of the flyout
    • And now that I reviewed this @Mooeypoo and I figured out how to remove the logger dependency.
  • ext.echo.nojs - styles for badge icon and the flyout.
    • The flyout styles could be further split into a separate module that ext.echo.ui and Special:Notifications depend on/use.
  • ext.echo.badgeicons - Contains 3 embedded images, 2 of which are used on every page view, bellOn is only used if you have unseen alerts. (I think at least)

If you have a new talk page message, we will additionally load the ext.echo.nojs.alert styles that make your user talk link orange.

Legoktm added a subscriber: Mooeypoo.
  • ext.echo.initext.echo.loggeroojsschema.EchoInteraction - sets up click handler to load the rest of the flyout
    • And now that I reviewed this @Mooeypoo and I figured out how to remove the logger dependency.

https://gerrit.wikimedia.org/r/#/c/238835/

  • ext.echo.nojs - styles for badge icon and the flyout.
    • The flyout styles could be further split into a separate module that ext.echo.ui and Special:Notifications depend on/use.

https://gerrit.wikimedia.org/r/#/c/238846/

Re-assigning to Ori for further review.

Nice work. Thanks for tackling this so quickly.

Checked/compared(and general overview via Timeline and Network) on betalabs and production.

Below are typical numbers for the longest page in betalabs and Flow page.

**Beta**

Not logged in -- Long article- Israel
DomContentLoaded 6.77
110.076 ms Loading
1.51 s Scripting
1.03 s Rendering
191.713 ms Painting
1.09 s Other

Logged in -- Long article- Israel
DomContentLoaded 7.89

117.090 ms Loading
1.29 s Scripting
859.276 ms Rendering
153.422 ms Painting
1.34 s Other

Not logged in -- Talk:Flow QA
DomContentLoaded 2.08

49.501 ms Loading
1.04 s Scripting
136.368 ms Rendering
69.727 ms Painting
127.228 ms Other

Logged in -- Talk:Flow QA
DomContentLoaded 2.03

54.477 ms Loading
1.14 s Scripting
149.304 ms Rendering
112.380 ms Painting
170.735 ms Other

Change 238308 abandoned by Catrope:
Don't make oojs-ui depend on oojs-ui.styles.icons

https://gerrit.wikimedia.org/r/238308

And we're back!

On the main page of enwiki while logged in, I get:

oojs-ui227.5 kB
oojs-ui.styles.icons128.6 kB
oojs-ui.styles95.5 kB
oojs-ui.styles.icons-editing-core27.9 kB
oojs-ui.styles.indicators24.8 kB

@matmarex was able to pinpoint the cause -- VE and its 'ext.visualEditor.switching' module. Regression introduced in rEVEDb550323b534b: Use mw.loader.using instead of weird hacks to load OOjs UI on action=edit.

This has nothing to do with Echo, as the current subject of the task implies, but I think it makes sense to re-use this task because the issue is fundamentally the same: massive library, commonly used, easy to load without realizing it.

ori renamed this task from Fivefold increase in render-blocking CSS size for logged-in users due to Echo loading OOUI on all page views to Massive increase in render-blocking CSS size for logged-in users due OOUI.Nov 7 2015, 6:22 PM
Krenair renamed this task from Massive increase in render-blocking CSS size for logged-in users due OOUI to Fivefold increase in render-blocking CSS size for logged-in users due to Echo loading OOUI on all page views.Nov 7 2015, 7:11 PM

(For anybody browsing historical bugs, we're now working to slim down OOjs UI so that loading it on all pages is not a catastrophic performance regression. See T113681 and T125292 for details and progress.)