
Positive Reinforcement: investigate difference in mobile activation
Closed, Resolved (Public)

Description

The leading indicators for the Levelling Up experiment show a similar trend in constructive activation for mobile registrations as we found in the New Impact module experiment: newcomers who get the New Impact module see a lower rate of constructive activation. We dug into this a bit in T330614 (New Impact module's empty state on mobile: research spike) and found that the effect wasn't consistent and was not reproducible in a second dataset.

We decided at that point to add Activation as a leading indicator for Levelling Up and follow up on it if the issue resurfaced. Since it has, we're filing this task to investigate.

In this case we want to see if there are patterns in activation for the treatment and control groups when it comes to:

  • Geography
  • Device type
  • User-Agent

Event Timeline

nettrom_WMF changed the task status from Open to In Progress. Apr 10 2023, 9:42 PM
nettrom_WMF moved this task from Next 2 weeks to Doing on the Product-Analytics (Kanban) board.

@nettrom_WMF - as discussed in our team meeting this week, can you look a little further into whether we’re losing suggested activations or non-suggested activations? In other words, we want to see if the percentage of users who click through to suggested edits from the empty state is different between the two groups.

@Tgr - does that cover your question?

Yes, thanks.

If we don't trust click logging (activation stats are robust against ad blockers etc, click stats aren't), we could also just split users by whether their first edit is a suggested one, I think.
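The split described here could be sketched roughly as follows. This is a minimal illustration, not the analysis that was actually run: the `Edit` record, the function name, and the sample rows are hypothetical, and it assumes suggested edits can be recognised by a change tag (the tag name `newcomer task` is an assumption here).

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set

# Assumption: suggested edits carry this change tag; adjust to the real tag name.
SUGGESTED_EDIT_TAG = "newcomer task"


@dataclass
class Edit:
    user_id: int
    timestamp: str  # ISO 8601 in UTC, so string comparison orders chronologically
    tags: Set[str] = field(default_factory=set)


def first_edit_is_suggested(edits: Iterable[Edit]) -> Dict[int, bool]:
    """Map each user to whether their earliest edit carries the suggested-edit tag."""
    first: Dict[int, Edit] = {}
    for edit in edits:
        current = first.get(edit.user_id)
        if current is None or edit.timestamp < current.timestamp:
            first[edit.user_id] = edit
    return {uid: SUGGESTED_EDIT_TAG in e.tags for uid, e in first.items()}


# Toy records for illustration only; not real data.
sample = [
    Edit(1, "2023-04-01T10:00:00Z", {"newcomer task"}),
    Edit(1, "2023-04-01T11:00:00Z"),
    Edit(2, "2023-04-02T09:30:00Z", {"mobile edit"}),
    Edit(2, "2023-04-02T09:45:00Z", {"newcomer task"}),
]
result = first_edit_is_suggested(sample)
print(result)
```

Because this relies only on stored edits and their tags, it does not depend on client-side click logging.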

We've examined this phenomenon in several different ways looking for patterns, and not found any specific pattern that provides clues as to why this is happening. This analysis started out by looking at the three breakdowns suggested in the task, as well as by wiki since it was trivial to split the data that way too. Here's what we found:

  • Wiki: Bangla Wikipedia is different from the other three in that users who get the new impact module are slightly more likely to activate.
  • Geolocation: no particular pattern by country within each wiki.
  • Android & iOS: users with either OS show the same drop in activation, and there are no differences by OS version.
  • Browsers: no difference between the major browsers, and various versions also show the same pattern.
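A breakdown of this kind amounts to computing the activation rate per experiment group within each value of a dimension (wiki, country, OS, browser). The sketch below shows the idea under stated assumptions: the row layout, the field names (`group`, `activated`, `os`), and the sample rows are hypothetical and are not the real dataset or query.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Tuple


def activation_rates(
    rows: Iterable[Mapping[str, object]], dimension: str
) -> Dict[Tuple[str, str], float]:
    """Return the share of activated users per (dimension value, experiment group)."""
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    activated: Dict[Tuple[str, str], int] = defaultdict(int)
    for row in rows:
        key = (str(row[dimension]), str(row["group"]))
        totals[key] += 1
        if row["activated"]:
            activated[key] += 1
    return {key: activated[key] / totals[key] for key in totals}


# Toy rows for illustration only; not real data.
rows = [
    {"os": "Android", "group": "new_impact", "activated": True},
    {"os": "Android", "group": "new_impact", "activated": False},
    {"os": "Android", "group": "old_impact", "activated": True},
    {"os": "iOS", "group": "old_impact", "activated": False},
]
rates = activation_rates(rows, "os")
for (value, group), rate in sorted(rates.items()):
    print(f"{value:8s} {group:11s} {rate:.1%}")
```

Comparing the two groups' rates within each dimension value is what would reveal a pattern such as "only one OS shows the drop", which is what was not found here.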

We've also investigated whether newcomers activate through suggested edits or not, and did not find a significant difference between experiment groups. One thing we did learn was that newcomers who activate by making a constructive article edit tend to do so on their first edit!

Lastly, we investigated whether there was a temporal pattern associated with the data center switchover in late February/early March. We did not find any suggestion that activation was affected by that either.

As far as this task is concerned, we've exhausted the obvious places to look for patterns and will shift our attention to other tasks. Moving this to "Needs Sign-off" on the Product Analytics board for @KStoller-WMF to review and comment in case I've missed something, otherwise this task can be resolved.

KStoller-WMF added a subscriber: kostajh.

Thanks, @nettrom_WMF, I agree that we can consider this resolved as I'm not sure what else we can investigate in the data we have currently.
@kostajh plans to perform some client-side and server-side profiling. Hopefully he'll find something or have suggestions for what we can test next.

I created a user account "KHarlan Test New Impact", with no contributions, and used ge.utils.setUserVariant('oldimpact') to switch to the non-Vue interface, and ge.utils.setUserVariant('control') to switch to the Vue interface. I used Firefox 114, and used the Network inspector to disable HTTP caching. Then I reloaded the page and recorded results.

Note, in the results, the terms load and DOMContentLoaded have the following meanings:

  • DOMContentLoaded: The DOMContentLoaded event fires when the initial HTML document has been completely loaded and parsed, without waiting for stylesheets, images, and subframes to finish loading.
  • load: The load event is fired when the whole page has loaded, including all dependent resources such as stylesheets, scripts, iframes, and images.

load is the more important event from the end-user perspective.

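| Scenario | HTTP request count | Data transfer (kB) | DOMContentLoaded | load |
| --- | --- | --- | --- | --- |
| Desktop, Vue | 43 | 503 | 2.21s | 3.53s |
| Desktop, old impact | 56 | 398 | 966ms | 1.31s |
| Mobile, Vue | 27 | 415 | 496ms | 628ms |
| Mobile, old impact | 27 | 662 | 1.97s | 2.05s |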

Some observations

  • Old impact module on desktop makes more HTTP requests
    • Follow-up: identify what these additional requests are about
  • Desktop, old impact: Loads faster and transfers less data than Vue, as expected.
    • Follow-up: It is surprising that the Vue implementation is more than double the load time for the non-Vue implementation. Figure out where/what/why
  • Mobile: Vue implementation transfers less data and has significantly faster load time than non-Vue implementation
    • Follow-up: Understand why this is the case
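To put numbers on these observations, the ratios can be worked out directly. The sketch below reads the results as 503 kB / 3.53s (desktop, Vue), 398 kB / 1.31s (desktop, old impact), 415 kB / 628ms (mobile, Vue) and 662 kB / 2.05s (mobile, old impact); the dictionary layout and function name are illustrative only.

```python
# Values as read from the results above: (data transfer in kB, load time in seconds).
measurements = {
    ("desktop", "vue"): (503, 3.53),
    ("desktop", "old"): (398, 1.31),
    ("mobile", "vue"): (415, 0.628),
    ("mobile", "old"): (662, 2.05),
}


def vue_to_old_ratios(platform):
    """Return (data transfer ratio, load time ratio) of Vue relative to old impact."""
    vue_kb, vue_load = measurements[(platform, "vue")]
    old_kb, old_load = measurements[(platform, "old")]
    return vue_kb / old_kb, vue_load / old_load


for platform in ("desktop", "mobile"):
    data_ratio, load_ratio = vue_to_old_ratios(platform)
    print(f"{platform}: data x{data_ratio:.2f}, load x{load_ratio:.2f}")
```

Under this reading, the Vue implementation takes roughly 2.7 times as long to reach load on desktop and roughly 0.3 times as long on mobile. These are single measurements from one browser session, so the ratios are indicative at best.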

I'm reopening this after we discussed this in light of the recent deployment of the New Impact Module to additional wikis in T336203. There's a difference in how the Suggested Edits module behaves on desktop and mobile that we're interested in understanding. On desktop, users have to initialize the module before getting any suggestions, whereas on mobile the module is already initialized. Timeboxed to one hour to grab fresh data and investigate this for the four pilot wikis, with an additional hour if we want to expand it to the nine other Wikipedia wikis we're currently deployed to.

I've taken a first look at this by investigating two proportions. First, the proportion of newcomers with constructive article activation. Second, the proportion of those activating through Suggested Edits. While that doesn't specifically answer if users are able to find Suggested Edits and/or activate it, it should give us an indication of whether we are running into issues.

I grabbed data from our four pilot wikis to get an initial sense, and then we can follow up further as needed. The table below shows the numbers and proportions. Generally, the trend is: users with the old impact module are more likely to activate, and more likely to do so through Suggested Edits. Czech desktop registrations are different when it comes to the activation proportions, and Spanish mobile registrations are different when it comes to Suggested Edits. Note that none of these differences in activation proportions are statistically significant (that goes for overall by platform as well, and I didn't check the Suggested Edits proportions).

@KStoller-WMF : What would the next step be here? Wait for T337314 to have been deployed for two weeks and then check the numbers again?

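| Platform | Wiki | Group | Number of registrations | Percent activated | Percent suggested edits |
| --- | --- | --- | --- | --- | --- |
| Desktop | Arabic | New Impact | 2,436 | 15.1% | 6.0% |
| Desktop | Arabic | Old Impact | 2,447 | 15.6% | 7.6% |
| Desktop | Bangla | New Impact | 419 | 12.9% | 5.6% |
| Desktop | Bangla | Old Impact | 369 | 17.6% | 6.2% |
| Desktop | Czech | New Impact | 1,103 | 30.0% | 6.0% |
| Desktop | Czech | Old Impact | 1,038 | 28.0% | 9.3% |
| Desktop | Spanish | New Impact | 9,824 | 22.0% | 9.2% |
| Desktop | Spanish | Old Impact | 9,850 | 22.1% | 9.8% |
| Mobile | Arabic | New Impact | 7,344 | 10.9% | 14.2% |
| Mobile | Arabic | Old Impact | 7,118 | 11.2% | 15.9% |
| Mobile | Bangla | New Impact | 1,350 | 18.7% | 16.7% |
| Mobile | Bangla | Old Impact | 1,319 | 18.8% | 18.1% |
| Mobile | Czech | New Impact | 598 | 12.2% | 8.2% |
| Mobile | Czech | Old Impact | 570 | 15.4% | 18.2% |
| Mobile | Spanish | New Impact | 7,752 | 18.0% | 12.5% |
| Mobile | Spanish | Old Impact | 7,787 | 18.5% | 10.7% |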
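For readers who want to check a row themselves, a pooled two-proportion z-test is one common way to compare two activation rates. The sketch below is an assumption about method, not necessarily the test used for the statement above, and the activation counts are back-calculated from the rounded percentages in the Desktop Bangla rows, so they are approximate.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided pooled z-test for the difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Desktop Bangla: registrations from the table, activations reconstructed
# from rounded percentages (approximate, for illustration).
new_n, old_n = 419, 369
new_activated = round(0.129 * new_n)
old_activated = round(0.176 * old_n)

z, p_value = two_proportion_z_test(new_activated, new_n, old_activated, old_n)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

With these reconstructed counts, the p-value comes out above 0.05 even for this row, which has one of the larger gaps in the table. That is in line with the note that the activation differences are not statistically significant.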
KStoller-WMF closed this task as Resolved. Edited Jun 6 2023, 12:01 AM

Thank you, @nettrom_WMF! I think we can consider this task resolved for now.

@kostajh do you think this data indicates that the issue might be related to the topic selection / newcomer experience on desktop?

> I've taken a first look at this by investigating two proportions. First, the proportion of newcomers with constructive article activation. Second, the proportion of those activating through Suggested Edits. While that doesn't specifically answer if users are able to find Suggested Edits and/or activate it, it should give us an indication of whether we are running into issues.

Thanks @nettrom_WMF!

> @KStoller-WMF : What would the next step be here? Wait for T337314 to have been deployed for two weeks and then check the numbers again?

My 2c is that T337314 ([BUG] New impact empty module is loading without CTA and with mobile headers) is unlikely to have any significant effect on what we are seeing here. IIRC last time we looked at mobile-details mode, it was in use by ~1% of mobile views to Special:Homepage.