Account creation: baseline constructive activation
Closed, Resolved (Public)

Description

In advance of the upcoming Fundraising mailing, we want to set the key results we're hoping to achieve. One of them is going to be the percentage of donor-created accounts that achieve constructive activation (i.e. make a first unreverted edit). The idea is that when we generate accounts, we want to make sure those accounts are actually adding value to the wikis rather than never editing.

We might expect these donor-created accounts to constructively activate as often as, or more often than, our usual organic accounts, because these donors have already taken several steps that show commitment to Wikimedia.

Some specifications on how we might want to calculate this baseline:

  • It should cover both Spanish and Portuguese Wikipedias, averaged over a convenient recent time period, like May or June 2021.
  • We would want it to be the number amongst our treatment group (those getting Growth features) as opposed to the control group. That's because we want to know how donors who are getting Growth features compare to the rest of the population that is getting Growth features.
  • However we calculate it, we'll want to calculate it the same way after the Fundraising campaign, when we filter to just those accounts created through the campaign.
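
The specification above can be sketched as a small calculation. This is a minimal illustration, not the analysis notebook: the record fields (`wiki`, `group`, `made_unreverted_edit`) and the sample rows are hypothetical, and it assumes each account has already been flagged for whether it made a first unreverted edit.

```python
from collections import Counter

def constructive_activation_rates(accounts, group="treatment"):
    """Return per-wiki and pooled constructive activation rates for one group.

    `accounts` is an iterable of dicts with hypothetical keys:
    "wiki", "group", and "made_unreverted_edit".
    """
    registered = Counter()
    activated = Counter()
    for account in accounts:
        if account["group"] != group:
            continue  # only the group getting Growth features is counted
        registered[account["wiki"]] += 1
        if account["made_unreverted_edit"]:
            activated[account["wiki"]] += 1
    if not registered:
        raise ValueError(f"no accounts found in group {group!r}")
    rates = {wiki: activated[wiki] / registered[wiki] for wiki in registered}
    # Pooled rate: all activated accounts over all registered accounts.
    rates["overall"] = sum(activated.values()) / sum(registered.values())
    return rates

# Hypothetical sample rows, for illustration only.
sample = [
    {"wiki": "eswiki", "group": "treatment", "made_unreverted_edit": True},
    {"wiki": "eswiki", "group": "treatment", "made_unreverted_edit": False},
    {"wiki": "eswiki", "group": "treatment", "made_unreverted_edit": False},
    {"wiki": "eswiki", "group": "treatment", "made_unreverted_edit": False},
    {"wiki": "eswiki", "group": "control", "made_unreverted_edit": True},
    {"wiki": "ptwiki", "group": "treatment", "made_unreverted_edit": True},
    {"wiki": "ptwiki", "group": "treatment", "made_unreverted_edit": False},
]
rates = constructive_activation_rates(sample)
```

Running the same function on campaign-created accounts after the mailing would keep the two numbers comparable, which is the point of the last bullet.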

Event Timeline

The Growth features were deployed to the Spanish Wikipedia on 2021-04-21 (ref: T278235#7025299). I'd like to use MediaWiki history for this because it's generally accepted as the canonical dataset for user and editing data, and because that's what I've used for NEWTEA revisited so I have a data gathering pipeline that's easy to adapt. Based on the key results and other metrics in T286796, we don't really need editing data from August, so we can use the July snapshot of MediaWiki history (which will be available in early August).

Using the current snapshot of MediaWiki history would let us estimate this baseline for May 2021, because accounts registered on the last day of a month need a full 24 hours of data from the following month. Doing it for June would require next month's snapshot, unless we ignore the last day. While we could do that, I'm not sure it would provide us with any meaningful insights.
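
The snapshot constraint can be expressed as a simple date check. This is a sketch under assumptions: it treats a monthly snapshot as containing all data up to (but not including) the first instant of the following month, and it uses the 24-hour window mentioned in the comment. The function names are hypothetical.

```python
from datetime import datetime, timedelta

ACTIVATION_WINDOW = timedelta(hours=24)

def last_usable_registration(snapshot_end):
    """Latest registration time whose full activation window fits in the snapshot.

    `snapshot_end` is exclusive: the first instant NOT covered by the snapshot.
    """
    return snapshot_end - ACTIVATION_WINDOW

def month_is_fully_covered(year, month, snapshot_end):
    """True if every account registered in the month has a complete window."""
    if month == 12:
        next_month_start = datetime(year + 1, 1, 1)
    else:
        next_month_start = datetime(year, month + 1, 1)
    # An account registered just before next_month_start needs data
    # up to next_month_start + ACTIVATION_WINDOW.
    return next_month_start <= last_usable_registration(snapshot_end)

# Assumed: a snapshot holding data through the end of June 2021.
june_snapshot_end = datetime(2021, 7, 1)
may_ok = month_is_fully_covered(2021, 5, june_snapshot_end)
june_ok = month_is_fully_covered(2021, 6, june_snapshot_end)
cutoff = last_usable_registration(june_snapshot_end)
```

Under these assumptions, May is fully covered, June is not, and the cutoff falls at the start of June 30, which matches the option of ignoring the last day.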

@MMiller_WMF : I propose that we estimate this based on the May 2021 data that's currently available in MediaWiki history.

@nettrom_WMF -- thanks for explaining. I think basing this on May 2021 will be just fine.

This analysis is completed and the notebook has been pushed to GitHub.

Using data from May and June 2021 (because the most recent MediaWiki history snapshot also allowed us to include June data), we find an estimated baseline of 27.0% across both Spanish and Portuguese Wikipedias. Split by wiki the proportions are 24.7% for Spanish, and 31.4% for Portuguese.
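
Note that the combined figure is not the simple mean of the two per-wiki rates, which would be (24.7 + 31.4) / 2 = 28.05. It is consistent with pooling accounts across wikis, so the wiki with more registrations carries more weight. The sketch below backs out the implied share of Spanish accounts from the three reported percentages. The account counts themselves are not stated here, so the result is an approximation limited by rounding, and the function name is hypothetical.

```python
def implied_weight(pooled, rate_a, rate_b):
    """Share of group A implied by pooled = w * rate_a + (1 - w) * rate_b."""
    if rate_a == rate_b:
        raise ValueError("rates are equal, so the weight is not identifiable")
    return (rate_b - pooled) / (rate_b - rate_a)

# Percentages as reported in the comment above.
eswiki_rate, ptwiki_rate, pooled_rate = 24.7, 31.4, 27.0

eswiki_share = implied_weight(pooled_rate, eswiki_rate, ptwiki_rate)
# Sanity check: recombining the per-wiki rates reproduces the pooled rate.
recombined = eswiki_share * eswiki_rate + (1 - eswiki_share) * ptwiki_rate
```

With these inputs the implied Spanish share is roughly two thirds of the accounts in the baseline.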

Hi @nettrom_WMF - would it be possible to break out the baseline activation rates by those who got to the account creation page from an editing context vs. not? I'm thinking that those who arrived there from the prompt during a logged-out edit would be far more likely to activate, and that may be a reason for the higher figure on ptwiki.

I thought we might be able to do that, but I don't think the current setup on ptwiki allows it. From what I understand, editing is limited by modifying the interface so that the "Edit" tab links to a specific registration page. That page doesn't link back to the original article, so when the account is created and logged through EventLogging we can't tell that the user wanted to edit it. In contrast, the standard warning to non-logged-in users on desktop (e.g. en:MediaWiki:Anoneditwarning) and the overlay on mobile both set query parameters that return the user to the article and open the editor. On ptwiki, the user instead has to create their account (or log in), then click "Edit" again on the article.

If the ptwiki community changed their custom "Edit" tab so it sets query parameters the same way the others do (or set a custom campaign query parameter), then we could track it like the other wikis.
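
To illustrate the kind of signal this relies on, here is a rough sketch of classifying a signup URL by its query parameters. The parameter names (`returnto`, `returntoquery`, `action`, `veaction`) and the example URLs are assumptions for illustration. The actual analysis used fields logged through EventLogging, which may differ.

```python
from urllib.parse import parse_qs, urlsplit

def looks_like_editing_context(signup_url):
    """Guess whether a signup URL would send the user back into an editor.

    Assumes the return target's own query string is carried in a
    `returntoquery` parameter, e.g. "action=edit" or "veaction=edit".
    """
    params = parse_qs(urlsplit(signup_url).query)
    return_query = params.get("returntoquery", [""])[0]
    inner = parse_qs(return_query)
    return "edit" in inner.get("action", []) or "edit" in inner.get("veaction", [])

# Hypothetical URLs on a placeholder domain.
with_context = (
    "https://example.org/w/index.php?title=Special:CreateAccount"
    "&returnto=Some_Article&returntoquery=action%3Dedit"
)
without_context = "https://example.org/w/index.php?title=Special:CreateAccount"

from_edit = looks_like_editing_context(with_context)
not_from_edit = looks_like_editing_context(without_context)
```

A registration page that drops these parameters, as described for ptwiki, would always fall into the second case regardless of what the user intended.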

Out of curiosity, I checked the proportions of self-made, non-API registrations that appeared to come from an editing context on Portuguese and Spanish in June. For Portuguese it's 1,162 out of 9,178 registrations (12.7%, so some slip through the cracks), while for Spanish it's 5,611 out of 17,041 (32.9%). Given that it's about 1/3 on Spanish, I suspect that even though they might activate at a higher rate it won't change the overall proportion that much.
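
The proportions above, and the intuition about the overall rate, can be checked with a short calculation. The registration counts come from the comment. The activation rates passed to `blended_rate` are purely hypothetical placeholders, since no rates by entry point are reported here, and serve only to show how the mixture behaves.

```python
def blended_rate(editing_share, rate_editing, rate_other):
    """Overall rate when a share of accounts activates at a different rate."""
    return editing_share * rate_editing + (1 - editing_share) * rate_other

# June registrations from an editing context, as quoted in the comment.
ptwiki_share = 1162 / 9178
eswiki_share = 5611 / 17041

ptwiki_pct = round(ptwiki_share * 100, 1)  # 12.7
eswiki_pct = round(eswiki_share * 100, 1)  # 32.9

# HYPOTHETICAL rates: editing-context accounts activate at 40%, others at 20%.
example_overall = blended_rate(eswiki_share, 0.40, 0.20)
```

Because the overall rate is a weighted average, the editing-context group can only pull it up in proportion to its share of registrations.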

I won't dig further into this at the moment, but it's been useful to learn that if we want to dig into this later based on the results, that's not possible on ptwiki.

Thanks for digging further into this, @nettrom_WMF. It's useful to find out that ptwiki has this separate page for catching anon edit sources, and that it can affect our ability to track this. @MMiller_WMF, this may be relevant to note in discussions around how the IP Masking project could in future require us to change how we measure entry points to the newcomer experience.

And as for the Spanish rate, it is surprisingly not as high as I expected! I agree that splitting out that group probably won't change the overall proportion that much, especially with the balancing factor you talked about: the higher motivation of donors who are interested in editing.

MMiller_WMF renamed this task from "Donors to newcomers: baseline constructive activation" to "Account creation: baseline constructive activation". Feb 5 2022, 2:36 AM