Page MenuHomePhabricator

Positive reinforcement: Investigate client-side and server-side performance characteristics of new impact module
Closed, ResolvedPublic

Description

Motivated by T334411: Positive Reinforcement: investigate difference in mobile activation, let's have a closer look at the client-side and server-side performance characteristics of the new impact module, so that we can understand if that has something to do with differences in activation.

Client-side (from initial analysis in T336198#8834652):

  • Document what these additional requests are about
  • it is surprising that the Vue implementation is more than double the load time for the non-Vue implementation. Figure out where/what/why
  • Document why the Vue implementation results in less data transfer and significantly faster load time than non-Vue implementation

Server-side:

  • Run Excimer for Special:Homepage as a user opted-in to control and oldimpact groups, and compare

Related Objects

StatusSubtypeAssignedTask
OpenKStoller-WMF
ResolvedKStoller-WMF
Declined Cntlsn
Resolved mewoph
Resolved mewoph
Invalid Cntlsn
DeclinedNone
DeclinedBUG REPORTNone
ResolvedSgs
ResolvedTgr
ResolvedSgs
DeclinedNone
DeclinedNone
DeclinedNone
ResolvedSgs
Resolved kostajh
Resolved kostajh
ResolvedSgs
ResolvedSgs
ResolvedSgs
ResolvedSgs
ResolvedSgs
ResolvedSgs
Resolved kostajh
Resolved kostajh
DeclinedSgs
Resolved kostajh
Resolved KieranMcCann
ResolvedBUG REPORT kostajh
ResolvedSgs
ResolvedSgs
ResolvedSgs
DeclinedNone
Resolved kostajh
ResolvedTgr
ResolvedTgr
ResolvedTgr
ResolvedTgr
ResolvedTgr
Resolved kostajh
ResolvedSgs
Resolved kostajh
ResolvedSgs
Resolved kostajh
ResolvedSgs
Resolved kostajh
ResolvedBUG REPORTTgr
ResolvedSgs
DeclinedRHo
ResolvedSgs
ResolvedSgs
ResolvedSgs
Resolved kostajh
Resolved kostajh
ResolvedTrizek-WMF
ResolvedSgs
ResolvedTgr
Resolved kostajh
Resolved kostajh
ResolvedTgr
ResolvedBUG REPORTSgs
Resolved kostajh
Resolved kostajh
ResolvedBUG REPORTEtonkovidova
Resolved kostajh
DeclinedSgs
Resolved kostajh
ResolvedSgs
ResolvedBUG REPORT kostajh
Resolvednettrom_WMF
ResolvedBUG REPORTTgr
ResolvedPRODUCTION ERROR kostajh
ResolvedBUG REPORTTgr
ResolvedBUG REPORTTgr
ResolvedPRODUCTION ERRORTgr
DeclinedNone
ResolvedBUG REPORTSgs
Resolved kostajh
Resolved kostajh
DeclinedNone
Resolved kostajh
ResolvedSgs
ResolvedSgs
ResolvedBUG REPORTSgs
ResolvedBUG REPORTSgs
Resolvednettrom_WMF
InvalidNone
ResolvedBUG REPORTSgs
Resolved kostajh
ResolvedNov 1 2023Urbanecm_WMF
ResolvedUrbanecm_WMF
ResolvedUrbanecm_WMF
ResolvedPRODUCTION ERRORUrbanecm_WMF
ResolvedPRODUCTION ERRORUrbanecm_WMF
ResolvedPRODUCTION ERRORUrbanecm_WMF
DuplicateBUG REPORTNone
ResolvedUrbanecm_WMF
ResolvedKStoller-WMF
ResolvedSgs
Resolvednettrom_WMF
ResolvedBUG REPORT kostajh
DeclinedNone
OpenNone

Event Timeline

kostajh renamed this task from Positive reinforcment: Investigate client-side and server-side performance characteristics of new impact module to Positive reinforcement: Investigate client-side and server-side performance characteristics of new impact module.May 8 2023, 7:00 PM

I created a user account "KHarlan Test New Impact‬", with no contributions, and used ge.utils.setUserVariant('oldimpact') to switch to the non-Vue interface, and ge.utils.setUserVariant('control') to switch to the Vue interface. I used Firefox 114, and used the Network inspector to disable HTTP caching. Then I reloaded the page and recorded results.

Note, in the results, the terms load and DOMContentLoaded have the following meanings:

DOMContentLoadedThe DOMContentLoaded event fires when the initial HTML document has been completely loaded and parsed, without waiting for stylesheets, images, and subframes to finish loading.docs
loadThe load event is fired when the whole page has loaded, including all dependent resources such as stylesheets, scripts, iframes, and images.docs

load is the more important event from the end-user perspective.

ScenarioHTTP request countData transfer (kB) DOMContentLoadedload
Desktop, Vue435032.21s3.53s
Desktop, old impact56398966ms1.31s
Mobile, Vue27415496ms628ms
Mobile, old impact276621.97s2.05s

Some observations

  • Old impact module on desktop makes more HTTP requests
    • Follow-up: identify what these additional requests are about
  • Desktop, old impact: Loads faster and transfers less data than Vue, as expected.
    • Follow-up: It is surprising that the Vue implementation is more than double the load time for the non-Vue implementation. Figure out where/what/why
  • Mobile: Vue implementation transfers less data and has significantly faster load time than non-Vue implementation
    • Follow-up: Understand why this is the case
kostajh updated the task description. (Show Details)

Not sure if I picked the right epic for this; please move around as needed.

The numbers collected in T336198#8834652 do not seem to match the ones collected by @egardner in T337320#8943518 and the ones from my own tests, which resemble to Eric's results. Below are some figures resulting of an average of 4-5 runs with Chrome's network tab and lighthouse report.

DesktopMobile
OldNewOldNew
First Contentful Paint (s)0.40.61.31.1
Largest Contentful Paint (s)1.11.22.73.0
Number of requests66623837
Onload (ms)9391060723779
Size of GrowthExperiments module & dependencies (kB)369532310446

The only consistent indicators across platforms and runs are the data transfer size and LCP score. We can clearly state that these two metrics have worsen from the old version to the new one.

Document what these additional requests are about

My guess is that the disparity in number of requests between our tests is due to different environment and run conditions. The randomness of the suggested task can cause discrepancy in the images requested. Another thing that was causing a 6-off difference between my tests runs are the POST /events requests that are not relevant to this issue.

It is surprising that the Vue implementation is more than double the load time for the non-Vue implementation. Figure out where/what/why

In my tests the figures were much closer. I see some disparity in cached resources despite of having HTTP caching disabled. Is this difference consistent accross your tests @kostajh?

Document why the Vue implementation results in less data transfer and significantly faster load time than non-Vue implementation

In my tests the data transfer from the Vue version was always bigger in both desktop and mobile. It's true that surprisingly some of the runs of the Vue version were faster at loading but in average the Vue one is still slower.

Run Excimer for Special:Homepage as a user opted-in to control and oldimpact groups, and compare

DesktopOld impact profileNew impact prfileNo relevant differences observed
MobileOld impact profileNew impact prfileNo relevant differences observed

I also run Chrome's experimental tool performance insights and it is marking the NewImpact createApp function as a "Long task run", which means it's taking longer than 50ms to execute in the main thread. Investigating further this does not seem worth since it would ideally result on a micro-optimization.

I think it could be

  1. something that causes a JS error in such a way that it breaks normal error logging (so we don't see it in logstash)
  2. more browsers being served the inferior no-JS version since Vue requires the es6 ResourceLoader flag which sets higher compatibility requirements (although this is infeasible because the effect size is too large)
  3. performance issue caused by larger payloads
  4. performance issue caused by Vue requiring more CPU etc
  5. some mistake in the GrowthExperiments application logic
  6. design differences between the Vue version and the old version (although they are tiny and the Vue version has the more call-to-action-y look)

I think we can try to optimize the current code to have an equal or similar payload to discard item (3) in the list; one thing that comes to my mind is avoid using moment library and use Intl instead. Not sure though if other components (eg: skins) are also loading moment; do you think this is worth a try @Tgr?

Aside from that I don't have clear action paths for the rest of items. I would try to fix any bug or minor improvement we have already identified for this module and keep working to understand the source of the issue and revert the activation decrease situation.

Sgs changed the task status from Open to In Progress.Jun 22 2023, 8:59 AM
Sgs updated the task description. (Show Details)
Sgs moved this task from In Progress to Code Review on the Growth-Team (Sprint 0 (Growth Team)) board.

(Sorry for missing the ping!)

I think we can try to optimize the current code to have an equal or similar payload to discard item (3) in the list; one thing that comes to my mind is avoid using moment library and use Intl instead. Not sure though if other components (eg: skins) are also loading moment; do you think this is worth a try @Tgr?

I think that's a good idea as a tech debt improvement (see also T146798: Replace Moment.js in core with something smaller (what?)). Specifically for the purposes of this investigation, though, I think making the two versions more dissimilar might cause more problems than it would solve.

Aside from that I don't have clear action paths for the rest of items. I would try to fix any bug or minor improvement we have already identified for this module and keep working to understand the source of the issue and revert the activation decrease situation.

Yeah, not sure. Maybe we could add a new experiment version which is "old impact page but looks more like the new one", make the same style adjustments like button color, set the es6 flag. That would eliminate two hypotheses, although the two least likely ones so not sure it's worth the effort.

I think that's a good idea as a tech debt improvement (see also T146798: Replace Moment.js in core with something smaller (what?)). Specifically for the purposes of this investigation, though, I think making the two versions more dissimilar might cause more problems than it would solve.

Filed T342237: Use Intl instead of moment, I agree introducing discrepancy between old/new is undesired but I don't understand why replacing moment with Intl in NewImpact would make it more dissimilar from the original Impact, given the original module didn't use moment or display any date data. Or by two versions you mean the existing NewImpact and an eventual moment-free version?

Yeah, not sure. Maybe we could add a new experiment version which is "old impact page but looks more like the new one", make the same style adjustments like button color, set the es6 flag. That would eliminate two hypotheses, although the two least likely ones so not sure it's worth the effort.

@KStoller-WMF @nettrom_WMF do you want to pursue any of the ideas mentioned in the conversation?

I don't understand why replacing moment with Intl in NewImpact would make it more dissimilar from the original Impact, given the original module didn't use moment or display any date data.

Oh, I thought we were using moment.js before (but of course the old version has server-side i18n). Yeah, removing it is a good next step then.

I don't understand why replacing moment with Intl in NewImpact would make it more dissimilar from the original Impact, given the original module didn't use moment or display any date data.

Oh, I thought we were using moment.js before (but of course the old version has server-side i18n). Yeah, removing it is a good next step then.

Sounds like we should move forward with this idea. Is there already a task for this?

If there are any other technical changes that you think are worth pursuing, please document them. But it sounds like there aren't any other easy wins to help with performance.

Next steps from my perspective:
@nettrom_WMF and I will take a secondary look at the Impact module experiment data after recent changes (https://phabricator.wikimedia.org/T342150#9028665). From that point we should make the call about performing an additional experiment or decide we are ready to release the Impact module to more wikis.

I think we can consider this investigative task resolved, correct?

I think we can consider this investigative task resolved, correct?

Right, I'll chime in T342150 or some Growth channel once T342237 is merged so it can be taken in account for any further analysis. cc @nettrom_WMF