Welcome survey: change experiment to include control group
Closed, Resolved (Public)

Description

Given the high abandonment rates of the welcome survey in Vietnamese Wikipedia (see T216668), we may want to change the experiment running in that wiki from the current state (half of newcomers get Var A and half get Var C) to a new state (half of newcomers get Var A and half get control).

We thought the evidence from Czech and Korean Wikipedias showed that the surveys did not cause abandonment problems, so all that was left to do was decide which survey variant to use going forward. Since the abandonment rate on desktop in Vietnamese Wikipedia is 46%, we may now want to compare it against a control group to see whether the survey is causing it.

The new experiment would be Var A vs Control rather than Var C vs Control because we have seen that Var A outperforms Var C on every metric, in all wikis and on both platforms.

Whether we do this task depends on whether the Vietnamese community has insight into this issue (T216668), which we hope to learn during the next week.

Event Timeline

@nettrom_WMF -- EditorJourney was deployed in Vietnamese on Jan 16 and the welcome survey was deployed on Jan 24. Could we use the period in between to get a sense of the abandonment rate of users who don't get a survey (even though it's not a real A/B test because of the temporal change)?

This is a small configuration change. The last opportunity to do it this week is today (Feb 21) at 11 PST.

Change 491975 had a related patch set uploaded (by Sbisson; owner: Sbisson):
[operations/mediawiki-config@master] Welcome survey: add a control group to viwiki

https://gerrit.wikimedia.org/r/491975

@nettrom_WMF -- EditorJourney was deployed in Vietnamese on Jan 16 and the welcome survey was deployed on Jan 24. Could we use the period in between to get a sense of the abandonment rate of users who don't get a survey (even though it's not a real A/B test because of the temporal change)?

Yeah, I think we can use that week of data to get an indication of what the rate is. I would still like to run the A/B test of Var A vs Control to get a better reading on it, though.

Just recording that we decided that @nettrom_WMF is going to run that quick analysis. We'll decide next week whether/when to switch the experiment over.

Should this be on the current workboard, @MMiller_WMF?

I've completed a quick analysis of the abandonment rate using the eight days of data we have between the deployment of EditorJourney (on 2019-01-16) and the Welcome Survey (on 2019-01-24). In this analysis, I used the same approach that I used for the control group during the initial Welcome Survey experiment: does the user have more than three events[1] logged in the EditorJourney data? Split by whether the account was created on the mobile or desktop site, the result for Vietnamese is:

| Is mobile? | Abandoned? | N | % |
| --- | --- | --- | --- |
| No | No | 387 | 81.5 |
| No | Yes | 88 | 18.5 |
| Yes | No | 254 | 84.9 |
| Yes | Yes | 45 | 15.1 |

With regard to the rate on mobile, there doesn't seem to be any reason for alarm: it's similar to what I found in the Czech and Korean Wikipedias' control groups during the first experiment (13.1% and 15.8%, respectively). On desktop, the rate is much higher than what I found for the other two wikis: more than three times that of Czech (5.9%), and more than twice that of Korean (7.8%). However, it is still much lower than what we've seen during the Welcome Survey experiment.
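For anyone who wants to re-derive these percentages and ratios, here is a minimal Python sketch. The counts are copied from the table above and the Czech and Korean desktop rates from this comment; the function and variable names are illustrative and not part of the original analysis.

```python
# Counts copied from the table above: (is_mobile, abandoned) -> number of accounts.
counts = {
    (False, False): 387,
    (False, True): 88,
    (True, False): 254,
    (True, True): 45,
}


def abandonment_rate(counts, is_mobile):
    """Percentage of accounts on one platform that were classified as abandoned."""
    abandoned = counts[(is_mobile, True)]
    total = abandoned + counts[(is_mobile, False)]
    return 100 * abandoned / total


desktop = abandonment_rate(counts, is_mobile=False)
mobile = abandonment_rate(counts, is_mobile=True)

# Desktop control-group rates quoted in this comment for the first experiment.
reference_desktop = {"Czech": 5.9, "Korean": 7.8}

print(f"desktop: {desktop:.1f}%  mobile: {mobile:.1f}%")
for wiki, rate in reference_desktop.items():
    print(f"desktop rate is {desktop / rate:.1f}x the {wiki} control rate")
```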

I'm still of the opinion that an A/B test of survey vs control for Vietnamese will provide us with good information. It also seems clear that we would want to run that first whenever we deploy on a new Wikipedia since it's difficult to judge reactions to the survey.

Footnotes:

  1. Three events are logged during account creation: the account creation itself, the user being logged in through CentralLogin, and the return to the account creation context.
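The abandonment criterion can be sketched in Python as follows. This is only an illustration of the rule described above (more than three events means the user did not abandon); the function names and the sample event counts are hypothetical and do not come from the EditorJourney data.

```python
# From the footnote: account creation by itself produces three events.
EVENTS_LOGGED_BY_ACCOUNT_CREATION = 3


def is_abandoned(num_events):
    """A user counts as abandoned when nothing beyond the creation events was logged."""
    return num_events <= EVENTS_LOGGED_BY_ACCOUNT_CREATION


def abandonment_summary(event_counts):
    """Return (abandoned, total, percent abandoned) for a list of per-user event counts."""
    total = len(event_counts)
    abandoned = sum(1 for n in event_counts if is_abandoned(n))
    percent = 100 * abandoned / total if total else 0.0
    return abandoned, total, percent


# Hypothetical per-user event counts, for illustration only (not real data).
sample = [3, 3, 7, 12, 4, 3, 25, 9]
print(abandonment_summary(sample))
```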

@SBisson -- @nettrom_WMF's analysis shows that we should switch the experiment to Var A vs. Control. We're ready for you to do that as soon as convenient for you. Please let us know the timestamp of the switch.

We'll also be interested to hear what the community says to @Trizek-WMF (T216668), but we might as well start the experiment.

This is scheduled for tomorrow (Feb 27) at 9AM PST.

@nettrom_WMF -- could you please check today that the right experiment conditions are being logged?

I can also make some accounts and check.

Change 491975 merged by jenkins-bot:
[operations/mediawiki-config@master] Welcome survey: add a control group to viwiki

https://gerrit.wikimedia.org/r/491975

Mentioned in SAL (#wikimedia-operations) [2019-02-27T17:37:49Z] <niharika29@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Welcome survey: add a control group to viwiki T216669 (duration: 00m 54s)

viwiki now gives Variation A to 50% of new local accounts and no survey to the other 50%.

I just created two accounts in Vietnamese. One got Variation A and one got no survey, so I think the assignment is likely working.

@nettrom_WMF -- could you please verify in the data?

It's been about 7 hours since this went live, and using data from the replicated database, I get the following overview:

| num_control | num_survey | num_registrations | prop_control | prop_survey |
| --- | --- | --- | --- | --- |
| 15 | 17 | 32 | 46.9 | 53.1 |

Looks good to me!
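As a rough sanity check on whether a 15/17 split is compatible with 50/50 assignment, one option is an exact two-sided binomial test. This check was not part of the analysis in this comment; the sketch below is an assumption about how one might do it, using only the Python standard library, and the function name is illustrative.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided p-value: total probability of outcomes no more likely than k."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small tolerance so outcomes tied with the observed one are included.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


# Counts from the overview table above.
num_control, num_survey = 15, 17
p_value = binomial_two_sided_p(num_control, num_control + num_survey)
print(f"p = {p_value:.3f}")
```

A large p-value here only means that the observed split gives no evidence against even assignment; with 32 registrations the check has little power to detect a small imbalance.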

For documentation, here's the SQL query:

SELECT
    SUM(IF(up_value LIKE "%group___NONE%", 1, 0)) AS num_control,
    SUM(IF(up_value LIKE "%group___exp2_target_specialpage%", 1, 0)) AS num_survey,
    COUNT(1) AS num_registrations,
    ROUND(100*SUM(IF(up_value LIKE "%group___NONE%", 1, 0))/COUNT(1), 1) AS prop_control,
    ROUND(100*SUM(IF(up_value LIKE "%group___exp2_target_specialpage%", 1, 0))/COUNT(1), 1) AS prop_survey
FROM user
JOIN user_properties
    ON user_id = up_user
JOIN logging
    ON user_id = log_user
WHERE user_registration > "20190227173749" -- From T216669#4988771
    AND up_property = "welcomesurvey-responses"
    AND log_type = "newusers"
    AND log_action = "create";
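
A note on the LIKE patterns in the query above: in SQL, `%` matches any run of characters and `_` matches exactly one character, so `group___NONE` matches "group", then any three characters, then "NONE". The Python sketch below illustrates this by translating the patterns to regular expressions. The stored values it tests against are hypothetical; the real format of `up_value` is an assumption here, and the translation ignores LIKE escape characters and collation rules.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into a regular expression (simplified)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # % matches any run of characters
        elif ch == "_":
            parts.append(".")  # _ matches exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


control = like_to_regex("%group___NONE%")
survey = like_to_regex("%group___exp2_target_specialpage%")

# Hypothetical stored values; the real up_value format is an assumption.
control_value = '{"_group":"NONE"}'
survey_value = '{"_group":"exp2_target_specialpage"}'

print(bool(control.fullmatch(control_value)), bool(survey.fullmatch(survey_value)))
```

Under that assumed format, the three underscores after "group" stand in for the characters `":"` between the key and the value.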

Great. Thanks @SBisson and @nettrom_WMF. We can check on the abandonment rates next week.