
Homepage: discovery measurement
Closed, Resolved · Public


The Growth team has a 2019Q2 goal of increasing the homepage discovery rate by 100%. According to our leading indicators analysis, without any new discovery features, the discovery rate (desktop only) in Czech Wikipedia is 30% and the discovery rate in Korean Wikipedia is 23%. Our goal is to increase that to 50-60%.

This task is about measuring the change in discovery rate, and about measuring the sources of discovery. Specifically:

  • After the first set of homepage discovery features has been deployed for about two weeks, we should look at how the discovery rate of newcomers who joined after those features were deployed compares to the discovery rate of newcomers from before those features were deployed. We currently estimate that date will be August 7. That first set of features includes:
    • T222848: traffic from email confirmation success page
    • T225328: link from Special:Contributions (desktop)
    • T227575: link from Special:Contributions (mobile)
    • T222852: discovery of homepage after account creation (desktop)
  • We will want to see what percent of newcomers first visit their homepage from each of these sources, as well as from the existing discovery paths of personal tools link and tabs from the User or User Talk page.

Event Timeline

Restricted Application changed the subtype of this task from "Task" to "Deadline". · Jul 17 2019, 8:44 PM

Thanks @MMiller_WMF, looking forward to checking the data! I was just wondering why discovery of the homepage after account creation (mobile, T224883, and no-JS, T225318) won't make it into the first round?

@Cntlsn -- those aren't in the list because the engineering hasn't been done on them. The components in the list are going to go out this week, and so we'll look at their impact as a cohort. Then we'll look again in a few more weeks when the remaining features to aid discovery have rolled out to see the additional impact from those.

Waiting three weeks before starting this task.

kzimmerman triaged this task as Medium priority. · Sep 11 2019, 9:49 PM
kzimmerman moved this task from Next Up to Doing on the Product-Analytics board.

I grabbed data from Czech and Korean Wikipedias for June, July, and August. Excluded from the analysis are auto-created accounts, known test accounts, and users who turned the Homepage on or off in their preferences. From this, I calculated the daily proportion of registrations that visited the Homepage within 48 hours of registration, and excluded the last two days of registrations so that every user had the full 48 hours to visit.

To make these results comparable with those previously reported in our leading indicators analysis, users who registered on mobile or visited the Homepage on mobile are also excluded. This is because the leading indicators analysis was only done on desktop registrations and users.
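The daily proportion described above can be sketched roughly as follows. This is a minimal illustration, not the query used for the analysis: the function name `daily_visit_rate`, the input shape (pairs of registration time and first Homepage visit time), and the `data_end` parameter are assumptions made for the example, and the exclusions listed above are assumed to have been applied already.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta

# Visits only count if they happen within 48 hours of registration.
VISIT_WINDOW = timedelta(hours=48)


def daily_visit_rate(registrations, data_end):
    """Proportion of each day's registrations that visited within 48 hours.

    registrations: iterable of (registered_at, first_visit_at) datetimes,
        where first_visit_at is None if the user never visited (hypothetical
        input shape, already filtered by the exclusions described above).
    data_end: datetime marking the end of the available data.
    """
    # Drop registrations from the last 48 hours of data so that every
    # remaining user had the full window in which to visit.
    cutoff = data_end - VISIT_WINDOW
    registered = defaultdict(int)
    visited = defaultdict(int)
    for registered_at, first_visit_at in registrations:
        if registered_at > cutoff:
            continue
        day = registered_at.date()
        registered[day] += 1
        if (first_visit_at is not None
                and timedelta(0) <= first_visit_at - registered_at <= VISIT_WINDOW):
            visited[day] += 1
    return {day: visited[day] / registered[day] for day in sorted(registered)}
```

The result maps each registration date to its visit proportion, which is the series plotted as the daily rate.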

From this, we can create the following plot of this proportion, with a 7-day moving average added in red to show the trend:

[Figure: homepage_48hour_visit_rate_with_ma.png]

The first dotted vertical line is the deployment of the link to the Homepage from Special:ConfirmEmail (12 July). The second dotted vertical line is the deployment of the link from Special:Contributions (26 July). These dates are based on when events from these links first show up in the Data Lake. The link from the WelcomeSurvey appears to have been deployed the day before the link from Special:Contributions.
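For reference, a 7-day moving average like the red trend line can be sketched as below. The thread does not say whether the plotted average is trailing or centered, so this example assumes a trailing window; `trailing_moving_average` is a hypothetical helper name.

```python
def trailing_moving_average(values, window=7):
    """Mean of the current value and the (window - 1) values before it.

    Assumption: a trailing window. Positions without a full window of
    history are returned as None rather than a partial average.
    """
    averages = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            # Drop the value that just fell out of the window.
            running_sum -= values[i - window]
        averages.append(running_sum / window if i >= window - 1 else None)
    return averages
```

Applied to the daily proportions in date order, the first six entries are `None` and each later entry averages the most recent seven days.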

Because of the staggered deployment and large window between them, I chose to use the two weeks prior to 12 July as the "before" period, and the two weeks after 26 July as the "after" period. A caveat with this method is that it does not take into consideration time effects, but based on the data we have I think this is the best approach that also doesn't complicate the analysis. Summing up all registrations and visits within 48 hours, we get the following table:

Wiki | Before % | After %

This translates to a 111.0% increase in Czech, and a 158.1% increase in Korean.
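The percentage increase is the relative change between the pooled "before" and "after" rates. A minimal sketch of that arithmetic follows; the counts in the example are purely illustrative and are not the figures from this analysis, and both function names are hypothetical.

```python
def pooled_rate(visits, registrations):
    """Visit rate over a whole period: total visits / total registrations."""
    return visits / registrations


def relative_increase_pct(before_rate, after_rate):
    """Relative change from before_rate to after_rate, in percent."""
    return (after_rate - before_rate) / before_rate * 100


# Illustrative counts only, not data from the Czech or Korean wikis.
before = pooled_rate(visits=200, registrations=1000)  # 20% visited
after = pooled_rate(visits=450, registrations=1000)   # 45% visited
print(round(relative_increase_pct(before, after), 1))  # prints 125.0
```

Under this definition, a 100% increase means the "after" rate is double the "before" rate.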

@nettrom_WMF -- thanks for pulling this together. I'm excited that we seem to have had the impact we were hoping for! But before we wrap this up, I think it is important to reconcile why the "Before %" values in your analysis (16.4% and 9.4%) are so much lower than the corresponding numbers from the leading indicators (30% and 23%).

Also -- you wrote "I also only count users who registered on the desktop site and visit the Homepage on the desktop site as well." -- that doesn't mean that mobile users aren't counted, does it? Could you rephrase that to state who is excluded?


@MMiller_WMF : thanks for asking about why the numbers are different. I went back and had another look at the leading indicators, and then found that I'd forgotten to exclude the control group from the denominator in the current analysis. I'll update the numbers in a minute (spoiler alert: we're still above 100% increase on both wikis).

I'll also rephrase the exclusion of mobile users. There are two reasons for excluding them: 1) the leading indicators analysis was only done on desktop, and 2) as far as I remember the mobile version wasn't deployed until around the same time as the link from Special:ConfirmEmail, so the "before" and "after" groups would then be radically different.

Forgot the second part of this analysis. In this case, I'm using data from the final deployment (26 July) through the following 4 weeks. All registrations are counted, and the visit_mobile flag below shows whether the first visit was on the mobile (True) or desktop (False) site. Again, visits have to occur within 48 hours of registration. Percentages are calculated within each group (wiki and desktop/mobile).
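One way to compute percentages within each group is sketched below. The exact grouping used in the analysis is not spelled out here, so this example assumes the groups are defined by wiki and registration platform, with `visit_mobile` set to `None` for users who did not visit within 48 hours; the input shape and the name `visit_percentages` are likewise assumptions.

```python
from collections import Counter


def visit_percentages(users):
    """Percentage of each group by platform of first Homepage visit.

    users: iterable of (wiki, registered_on_mobile, visit_mobile), where
        visit_mobile is True, False, or None for no visit within 48 hours
        (hypothetical input shape).
    Returns {(wiki, registered_on_mobile): {visit_mobile: percent}}.
    """
    group_sizes = Counter()
    cell_counts = Counter()
    for wiki, registered_on_mobile, visit_mobile in users:
        group = (wiki, registered_on_mobile)
        group_sizes[group] += 1
        cell_counts[(group, visit_mobile)] += 1

    result = {}
    for (group, visit_mobile), count in cell_counts.items():
        # Each percentage uses its own group's size as the denominator.
        result.setdefault(group, {})[visit_mobile] = 100 * count / group_sizes[group]
    return result
```

Because the denominator is the group's own size, the percentages within each wiki and platform group sum to 100.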


As far as I'm concerned my part of this is now done, so reassigning to @MMiller_WMF for review.

That's great! Thanks, @nettrom_WMF. These numbers look good, and show that we achieved our Q1 goal of increasing the homepage discovery rate by 100%.