
Account creation: social media campaign effectiveness analysis
Closed, Resolved · Public

Description

As with previous pilot work that involved landing pages (e.g. T286796), we want to produce funnel numbers for this campaign. There was a treatment landing page and a control landing page, so we want one set of numbers for each. These aggregate counts will be combined with the counts from the marketing vendor for the top parts of the funnel (ads served, clicks) to create a picture of the full funnel.

  • The campaign was only on eswiki.
  • The campaign ran from May 3 to May 31.
  • On May 18, the ad spend was changed from 90/10 (treatment/control) to 70/30.

The other tasks in the epic contain the URLs and parameters that indicate the treatment and control landing pages. These are the numbers we need, broken out by treatment and control.

  • Clickthroughs (approximated through page loads of the landing pages by distinct IP addresses; see the sketch after this list)
  • Account creations
  • Constructive activations
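For reference, the clickthrough counts could be pulled from wmf.webrequest roughly as below. This is a minimal sketch, not the production query: it assumes the wmfdata helper package is available, that the landing page is Special:CreateAccount on eswiki, and that the campaigns are identified by the social-latam-2022-A/B values mentioned in the comments below; the exact URLs and parameters are documented in the other tasks in this epic.

```
# Minimal sketch (not the production query) of approximating clickthroughs
# from wmf.webrequest. Assumes the wmfdata helper package and that the
# landing pages carry a social-latam-2022-A/B value in the query string;
# the exact URLs/parameters are in the other tasks in this epic.
import wmfdata

query = """
SELECT
    CASE WHEN uri_query LIKE '%social-latam-2022-A%' THEN 'A' ELSE 'B' END AS campaign,
    COUNT(1) AS landing_page_views,
    COUNT(DISTINCT client_ip) AS landing_page_users  -- distinct-IP approximation
FROM wmf.webrequest
WHERE webrequest_source = 'text'
    AND year = 2022 AND month = 5 AND day BETWEEN 3 AND 31
    AND uri_host IN ('es.wikipedia.org', 'es.m.wikipedia.org')  -- eswiki only
    AND uri_path LIKE '%Especial:Crear_una_cuenta%'             -- the landing page
    AND uri_query LIKE '%social-latam-2022-%'
    AND http_status = '200'
    AND agent_type = 'user'  -- drop requests labelled as spiders
GROUP BY CASE WHEN uri_query LIKE '%social-latam-2022-A%' THEN 'A' ELSE 'B' END
"""
clickthroughs = wmfdata.spark.run(query)
```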

Event Timeline

nettrom_WMF renamed this task from Account creation: analyze campaign effectiveness to Account creation: analyze social media campaign effectiveness. May 24 2022, 4:57 PM
nettrom_WMF renamed this task from Account creation: analyze social media campaign effectiveness to Account creation: social media campaign effectiveness analysis.
mpopov triaged this task as Medium priority. May 24 2022, 5:08 PM
mpopov moved this task from Triage to Current Quarter on the Product-Analytics board.

@EdErhart-WMF -- before we start this analysis, could you please add a comment with this information:

  • The exact dates of the campaign.
  • The exact dates of when any of the treatment/control ratios were changed, or if the campaign ramped up or down over time.

And could you look over the task description and note anything you think we need that isn't listed? Thank you!

Hey folks,

The campaign ran from 3 May to 31 May. I'm not sure of the exact times when things were switched on or off on those days.

I believe we changed the ratio from 90/10 treatment/control to 70/30 on 18 May, which I'll confirm ASAP, but the rest remained consistent.

The first pass of this analysis is now complete. We've gathered data from the Spanish Wikipedia from May 3–31 for the two campaigns social-latam-2022-A and social-latam-2022-B. In this analysis we focus on actions taken on the mobile platform. For more details about the data gathering methodology and why desktop platform actions are disregarded, see the notes below.

We get the following table of data from the mobile platform for these two campaigns:

| Campaign | Landing page views | Landing page users | Number of registrations | Registration % | Number of constructive activations | Constructive activation % |
| --- | --- | --- | --- | --- | --- | --- |
| A | 220,198 | 183,187 | 143 | 0.078% | 10 | 7.0% |
| B | 66,438 | 53,238 | 28 | 0.053% | 1 | 3.6% |

The difference in registration rate between the campaigns is not significant (X^2 = 3.358, df = 1, p = 0.067). The number of constructive activations is too low to draw any conclusions about a difference in activation rate, and so we're not even going to try.
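For anyone who wants to check that test, it can be reproduced from the table above with scipy. Note that chi2_contingency applies Yates' continuity correction to 2×2 tables by default, which is what yields the reported statistic.

```
from scipy.stats import chi2_contingency

# Per landing page user: [registered, did not register]
table = [
    [143, 183_187 - 143],  # campaign A
    [28, 53_238 - 28],     # campaign B
]
chi2, p, df, _ = chi2_contingency(table)  # Yates-corrected by default for 2x2
print(f"X^2 = {chi2:.3f}, df = {df}, p = {p:.3f}")  # X^2 = 3.358, df = 1, p = 0.067
```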

Methodological notes:

  • For views of the Special:CreateAccount page (Spanish: Especial:Crear una cuenta), we limit the data to views with agent_type = "user"; in other words, we disregard all views labelled as "spider" in the webrequest dataset.
  • We have some views and registrations on the desktop platform in the dataset, but they amount to a small fraction of the mobile traffic and yield too few registrations to support meaningful conclusions. Given the low volume, including them would not change the conclusions drawn from the mobile platform alone, which is why we disregard the desktop platform in this analysis.
  • Account registrations are identified using the ServerSideAccountCreation schema for registrations done through the campaign, with known test accounts, autocreated accounts, and API accounts (which are typically app accounts) removed.
  • We verify account registration timestamps using MediaWiki's database and restrict the analysis to accounts registered while the campaign was running.
  • Constructive activation is defined as having made at least one edit within 24 hours of registration, with that edit not getting reverted within 48 hours of being made (see the sketch after these notes).
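As an illustration of that definition, here's a small sketch (not the production pipeline) that labels accounts as constructively activated. The frame and its column names are hypothetical, and "at least one edit" is simplified to the account's first edit.

```
# Illustrative sketch of the constructive activation definition above.
# Column names are hypothetical, and "at least one edit" is simplified
# to the account's first edit.
import pandas as pd

accounts = pd.DataFrame({
    "user": ["a", "b", "c"],
    "registered": pd.to_datetime(["2022-05-04 10:00", "2022-05-05 09:00", "2022-05-06 12:00"]),
    "first_edit": pd.to_datetime(["2022-05-04 12:00", "2022-05-07 09:00", None]),
    "first_edit_reverted": pd.to_datetime([None, None, None]),  # NaT = never reverted
})

edited_within_24h = (
    accounts["first_edit"].notna()
    & (accounts["first_edit"] - accounts["registered"] <= pd.Timedelta(hours=24))
)
not_reverted_within_48h = (
    accounts["first_edit_reverted"].isna()
    | (accounts["first_edit_reverted"] - accounts["first_edit"] > pd.Timedelta(hours=48))
)
accounts["constructive_activation"] = edited_within_24h & not_reverted_within_48h
# Only user "a" qualifies: edited within 24h, and the edit was never reverted.
```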

@EdErhart-WMF - I'll mark this as resolved, but feel free to reopen this task if any further data or analysis related to this campaign is needed.

I'm reopening this task as we're looking into explanations for the difference between the counts reported by the social media provider and the counts in our analysis.

The roughly 286k page views we measured for the landing pages on mobile were roughly 50k lower than what we'd expect based on reports from our social media provider. I dug into various resources to understand possible sources of this difference, paying particular attention to how we use wmf.webrequest as our data source, since that also affects two other analyses that are currently ongoing (T308774 and T308784).

One identifiable source for the difference is that our desktop site redirects users who appear to be on a mobile device to our mobile site. For this particular campaign we find a large number of desktop requests with an HTTP 302 response. The vast majority of these have an accompanying request to the mobile site shortly afterwards with an HTTP 200 response, which is then counted correctly as a page view.
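The pairing of desktop 302s with follow-up mobile 200s can be approximated along these lines. This is a sketch under assumptions: `requests` is a frame of landing-page webrequest rows with `dt` parsed as a datetime, and the ten-minute matching window is illustrative rather than the exact threshold used.

```
# Rough sketch of the redirect check: for each desktop request answered with
# an HTTP 302, look for a mobile request from the same client IP shortly
# afterwards that got an HTTP 200. Assumes `requests` holds landing-page
# rows from wmf.webrequest with `dt` parsed as a datetime; the 10-minute
# window is illustrative.
import pandas as pd

desktop_302 = requests[
    (requests["uri_host"] == "es.wikipedia.org") & (requests["http_status"] == "302")
].sort_values("dt")
mobile_200 = requests[
    (requests["uri_host"] == "es.m.wikipedia.org") & (requests["http_status"] == "200")
].sort_values("dt")

paired = pd.merge_asof(
    desktop_302,
    mobile_200,
    on="dt",
    by="client_ip",
    direction="forward",
    tolerance=pd.Timedelta(minutes=10),
    suffixes=("_desktop", "_mobile"),
)
# Redirected requests with no matching mobile page view are the lost clickthroughs.
lost = paired[paired["uri_host_mobile"].isna()]
```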

Using a week's worth of data from May, I made a rough estimate of how many desktop responses didn't follow this pattern, finding roughly 5k cases. Scaled up to the four weeks the campaign lasted, that's roughly 21.5k, which adds 7.5% to the total. While I could produce a more precise estimate, I'm fairly certain it would still be a large enough proportion to support the same key takeaway: in future campaigns we want to make sure that users on mobile devices get a link directly to the mobile site, and vice versa for desktop users.

As for the remainder of the difference, we can attribute some of it to requests that never reached us in the first place. Digging further for additional explanations is out of scope, as the primary outcome of this quality assurance work is confirming that our page view definition for these campaign analyses is sound.