
Donations/Membership Data Analysis
Closed, ResolvedPublic

Authored By
GoranSMilovanovic
May 21 2018, 9:13 AM
Referenced Files
F22416590: Fundraising_06172018.nb.html
Jun 21 2018, 12:00 AM
F22380701: CampaignEffect.png
Jun 19 2018, 11:47 PM
F22342123: image.png
Jun 18 2018, 3:28 PM
F22342121: image.png
Jun 18 2018, 3:28 PM
F22331034: Fundraising_06172018.nb.html
Jun 17 2018, 11:56 PM
Tokens
"Love" token, awarded by gabriel-wmde.

Description

Data analysis for the Donations/Membership data sets.

Reference doc: Documentation Export Donation Data

Event Timeline

@Jan_Dittrich and @GoranSMilovanovic notes:

--- The donations dataset:

https://phabricator.wikimedia.org/T194744

  • Setup: there are two pages, the "old" one and the "new" one
  • Goal: A/B test "old" vs. "new"
  • Dependent variable: donationsData$amount
  • Factor: in the donationsData$keyword, anything ending in "-ctrl" or "var"
  • The (re)coding schema for the donationsData$keyword factor is: https://phabricator.wikimedia.org/T194744#4207413
  • rescale to yearly data by donationsData$interval (see the Membership dataset; a rescaling sketch follows after that list);
  • NOTE: a "0" value for donationsData$interval means once per year; treat it as such.
  • Another dependent variable to look at: donationsData$opt_in
  • DATA CLEANING, based on donationsData$status (see the R sketch after this list):
  • status counts: B = 280802, N = 37409, X = 57334, Z = 63851
  • X == uncompleted - filter this out
  • B == booked, we got the money
  • N == booked, we got the money
  • Z == promise that they would send the money - filter this out
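
A minimal R sketch of this clean-up and recoding step (this is not the actual analysis code: the toy data frame and the derived "page" column are made up, and the "-ctrl" = old / "var" = new mapping is only an assumption here - the authoritative mapping, including its exceptions, is the coding scheme linked above).

```
# Toy donations data; column names follow the task description above.
donationsData <- data.frame(
  status  = c("B", "N", "X", "Z", "B"),
  keyword = c("org-05-ctrl", "org-05-var", "mob-01-ctrl", "wpde-var", "org-06-ctrl"),
  amount  = c(25, 50, 10, 5, 100),
  stringsAsFactors = FALSE
)

# Keep only booked donations (B, N); drop uncompleted (X) and promises (Z).
donationsData <- donationsData[donationsData$status %in% c("B", "N"), ]

# Derive the experimental factor from the keyword suffix
# (assumption: "-ctrl" marks the old page, "var" the new page).
donationsData$page <- ifelse(grepl("-ctrl$", donationsData$keyword), "old",
                      ifelse(grepl("var$", donationsData$keyword), "new", NA))
donationsData$page <- factor(donationsData$page, levels = c("old", "new"))
table(donationsData$page, useNA = "ifany")
```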

--- The membership dataset:

  • the same thing, but the dependent variable is membership_fee
  • membership_fee_interval: the yearly fee can be paid once a year or every X months; the number in the data field signifies X
  • 1 = booking every month
  • 3 = booking every three months
  • 6 = booking every six months
  • 12 = booking every twelve months
  • rescale all the membership fee observations to a one-year period! (see the sketch after this list)
  • in the end, we get to analyze yearly data
  • Factor: the same as for the donations dataset
  • The Donations data set is now clean, re-scaled to yearly amounts, and under analysis.
  • First results for the Donations data set, using parametric tests (t-tests): the 'old' vs. 'new' comparison shows no effect in any of the campaigns.
  • Moreover, TOST tests of equivalence (@Jan_Dittrich - thanks) show that no effects are significantly larger than the minimal assumed effects.
  • However, non-parametric tests must still be conducted, given that (most probably) none of the distributions here have finite variances in the first place.
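
A minimal sketch of the rescaling to yearly amounts, under the reading given above (the interval fields give the number of months between bookings; "0" in the donations data means once per year). The function names and example values are illustrative only.

```
# Rescale a per-booking amount to a yearly amount.
rescale_donation <- function(amount, interval) {
  # interval: months between recurring payments; 0 = one-off, treated as once per year
  ifelse(interval == 0, amount, amount * (12 / interval))
}

rescale_membership <- function(fee, fee_interval) {
  # fee_interval: 1, 3, 6 or 12 months between bookings
  fee * (12 / fee_interval)
}

rescale_donation(amount = 10, interval = c(0, 1, 3))        # 10, 120, 40
rescale_membership(fee = 5, fee_interval = c(1, 3, 6, 12))  # 60, 20, 10, 5
```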

@kai.nissen @Jan_Dittrich @Tobi_WMDE_SW

  • We can conclude with certainty that there was no old page vs. new page effect in any of the campaigns in the Donations data set.
  • Thus far, I have conducted
    • (a) t-tests (independent and Welch),
    • (b) tests of power-law behavior in the distributions of donation amounts, to see whether the sampling distributions can be assumed to be normal (75% of the time the answer is yes, so t-tests should do fine),
    • (c) the non-parametric Mann-Whitney U test, to confirm the findings,
    • (d) the t-tests and the Mann-Whitney tests repeated after removing extreme outliers (> 3*IQR) from the tail of the distributions (i.e. very large donations), and
    • (e) the TOST equivalence tests, never observing any evidence for effects significantly larger than minimal.
  • Converging methodologies provide the same result, unequivocally, for all campaigns: no effect. (A sketch of these tests follows below.)
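
A sketch of these tests on made-up data (this is not the notebook code; the "> 3*IQR" rule is read here as "more than 3*IQR above the third quartile", and the TOST equivalence bound of +/- 1 is purely illustrative).

```
set.seed(42)
toy <- data.frame(
  amount = c(rlnorm(200, 3, 1), rlnorm(180, 3, 1)),
  page   = factor(rep(c("old", "new"), c(200, 180)))
)

# (a) independent and Welch t-tests
t.test(amount ~ page, data = toy, var.equal = TRUE)   # classic independent t-test
t.test(amount ~ page, data = toy, var.equal = FALSE)  # Welch correction

# (c) non-parametric Mann-Whitney U test
wilcox.test(amount ~ page, data = toy)

# (d) repeat after removing extreme upper-tail outliers (> 3*IQR above Q3)
cutoff  <- quantile(toy$amount, 0.75) + 3 * IQR(toy$amount)
trimmed <- toy[toy$amount <= cutoff, ]
t.test(amount ~ page, data = trimmed, var.equal = FALSE)
wilcox.test(amount ~ page, data = trimmed)

# (e) TOST equivalence test written out as two one-sided t-tests
delta <- 1  # hypothetical smallest effect of interest on the raw scale
old <- toy$amount[toy$page == "old"]
new <- toy$amount[toy$page == "new"]
t.test(old, new, mu = -delta, alternative = "greater")  # H0: diff <= -delta
t.test(old, new, mu =  delta, alternative = "less")     # H0: diff >=  delta
# equivalence is claimed only if both one-sided tests are significant
```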

Next steps:

  • the Membership data set,
  • and then I will get back to the opt_in variable in the Donations to see what happens there.
  • The final report, encompassing all analytical procedures, will be provided once all analyses are completed.

@kai.nissen @Jan_Dittrich @Tobi_WMDE_SW

  • The analysis of the Membership data set relied on non-parametric tests only (the Mann-Whitney U test), because
  • after the data clean-up we end up with really small sample sizes per campaign
  • (in which case we would need to verify a normal distribution in the samples themselves to ensure the validity of t-tests, and I don't think we would have any luck doing so; see the sketch below).
  • The outcome of the A/B tests for the Membership data set: no effect in any of the campaigns.
  • N.B. Of course I have also run the Welch t-tests, mostly for completeness, since I am fairly sure about the nature of the data sets. And, as you can imagine... no effect.
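
For illustration, a minimal sketch of the small-sample situation described above - checking normality within each group before trusting a t-test, and using the Mann-Whitney U test otherwise (toy data, illustrative column names).

```
set.seed(7)
members <- data.frame(
  fee  = rexp(30, rate = 0.1),                     # hypothetical yearly fees
  page = factor(rep(c("old", "new"), each = 15))   # small per-campaign samples
)

# Shapiro-Wilk normality check per group
tapply(members$fee, members$page, function(x) shapiro.test(x)$p.value)

# Non-parametric comparison used for the Membership data set
wilcox.test(fee ~ page, data = members)

# Welch t-test run alongside, as mentioned above
t.test(fee ~ page, data = members, var.equal = FALSE)
```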

@kai.nissen @Jan_Dittrich @Tobi_WMDE_SW

  • Finally, the analysis of the opt_in variable in the Donations data set:
  • binary logistic regressions were run, one model per campaign, with one categorical predictor (the A/B condition, i.e. old vs. new page) and opt_in as the dependent variable (a sketch follows below);
  • only for mob05-ba-171218 do we obtain no effect; for all other campaigns, the new page significantly lowers the odds of opting in.
  • Now reporting in detail; an Rmarkdown notebook will be shared by tomorrow evening.
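
A minimal sketch of one such per-campaign model, on toy data (the variable names and the simulated opt-in rate are made up).

```
set.seed(1)
don <- data.frame(
  opt_in = rbinom(400, 1, 0.4),
  page   = factor(rep(c("old", "new"), 200), levels = c("old", "new"))
)

fit <- glm(opt_in ~ page, data = don, family = binomial())
summary(fit)

# exponentiated coefficients (odds ratios) with Wald 95% confidence intervals
exp(cbind(OR = coef(fit), confint.default(fit)))
```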

@kai.nissen @Jan_Dittrich @Tobi_WMDE_SW

Here's a detailed technical report. Please let me know if you need any additional sections (I didn't bother to visualize the differences between group means for the t-tests, as I find that trivial; if you wish to take a look at the distributions per campaign - just let me know).

So the outcome is: it is likely that there were no changes, except for the opt-in.

As a reminder, this is what the old opt-in looked like:

image.png (243×570 px, 19 KB)

and this is the new one:

image.png (302×459 px, 19 KB)

@Jan_Dittrich Yes, that would be the conclusion.

A question for you:

  • all my analyses were conducted cross-campaign, in the sense of analyzing each campaign's data separately;
  • but is it the case that the opt-in page was always the same in every campaign? Your posting of only two different solutions somehow implies that they were always the same.
  • Why is this important: because if the opt-in pages (old/new) were always the same, then we need a campaign-wise analysis, in the sense of analyzing all data together, and not separately for each campaign.
  • BTW: the donations and membership pages, old/new... they were not the same across the campaigns, right? Usually a new campaign means that something different is happening compared to what was attempted previously?

If this needs a rework in terms of analyzing campaign-wise in the sense given above, we can have it in no time (it's the same R code with one filter removed per analysis).

is it the case that the opt-in page was always the same in every campaign? Your posting of only two different solutions somehow implies that they were always the same?

I think so: there were two opt-in pages (old design, new design) which were used with several different banners - correct, @kai.nissen, @tmletzko?

So, if I get it right, the controlled variable would be new design/old design, the outcome would be the number of opt-ins/opt-outs, and the different campaigns would be a factor that varies but is not itself of interest.

Hi @GoranSMilovanovic and @Jan_Dittrich. I've got some comments and questions on the analysis.

  • And then to your question: throughout all campaigns there were only two types of landing pages, old vs. new; hence two types of opt-in forms: old vs. new.
  • However, there were important differences between the campaigns: different banners leading to the landing page (I would say less important to consider) and different webpages / different devices where the banners were displayed. The campaigns were shown on de.wikipedia.org (desktop), en.wikipedia.org (desktop), wikipedia.de (desktop) and de.m.wikipedia (one campaign for all mobile devices except iPad and one campaign for iPad). So the big difference, I would say, is the different devices, e.g. the screen sizes being used.
  • And that's why we usually analyse each campaign by itself. Finally, the new landing page - which was especially designed for the mobile experience - could perform better on mobile but worse on desktop. Still, it is very interesting to see that you couldn't find a difference in performance throughout all campaigns.
  • There is one thing about the tracking codes: as I wrote in https://phabricator.wikimedia.org/T194744, in campaign "38-ba-171223" the old landing page has a "var" tracking code. I just want to check if you accounted for that.
  • And one more general question. @Jan_Dittrich Besides the analysis of the performance of the old vs. new LP, I thought the analysis would also focus on an exploratory data analysis to find possible correlations in the data set. Did you talk about that, too?

Thank you for your comments!

  • "And then to your question: Throughout all campaigns there were only two types of landing pages, old vs. new; hence two types of opt-in forms: old vs. new."

That implies that we also need a campaign-wise analysis (all the data from all campaigns pooled together) in addition to the cross-campaign analysis (what we have now: each campaign analyzed separately). To be delivered this evening.

-" There is one thing about the tracking codes: As I wrote in https://phabricator.wikimedia.org/T194744 in campaign "38-ba-171223" the old landing page has a "var" tracking code. I just want to check if you accounted for that."

I think so, yes - the coding scheme that was provided mentions that exception, and it was used directly in R to recode the data (a sketch of such a recoding is below).
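
For illustration, a hypothetical version of such a recoding in R, with the "38-ba-171223" exception handled explicitly (the keyword patterns and the default "-ctrl" = old / "var" = new rule are assumptions; the coding scheme in T194744 remains authoritative).

```
recode_page <- function(campaign, keyword) {
  ifelse(campaign == "38-ba-171223",
         # exception: in this campaign the *old* landing page carries a "var" code
         ifelse(grepl("var$", keyword), "old", "new"),
         # default rule: "-ctrl" -> old page, "var" -> new page
         ifelse(grepl("-ctrl$", keyword), "old",
                ifelse(grepl("var$", keyword), "new", NA)))
}

recode_page(c("05-ba-171215", "38-ba-171223"),
            c("org-05-171215-ctrl", "org-38-171223-var"))
# expected: "old" "old"
```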

  • "Besides the analysis of the performance of the old vs. new lp I thought the analysis would also focus on an explorative data analysis to find possible correlations in the data set."

Please be a bit more specific about the "correlations in the data set" that you would like to investigate. The exploratory data analysis - in terms of looking at the distributions, cross-tabulations and similar - will be easy to complete.

Again, thank you for your comments and the additional info that you have provided.

@GoranSMilovanovic Ah, I thought the analysis was campaign-wise. Ok, looking forward to seeing the results.

  • "Please be a bit more specific about the "correlations in the data set" that you would like to investigate."

I would like to ping @Jan_Dittrich here, who suggested doing such an analysis. The background was to find clues for understanding why the opt-in rate and the rate of donations with addresses decreased with the new landing page, to further understand user behavior on the landing page in general, and to better understand the factors influencing the performance / success of a landing page.

@Tobias_Schumann_WMDE @Jan_Dittrich @kai.nissen

Donations Data Set

Campaign-wise analysis (i.e. all the data on donation amounts and opting in analyzed together, regardless of the particular campaign they originated from):

  • independent t-test: the experimental factor is old/new page, the dependent variable is the donation amount - no effect.
  • Welch t-test (because of unequal sample sizes): the experimental factor is old/new page, the dependent variable is the donation amount - no effect.
  • Independent ANOVA (Analysis of Variance) w. (a) old/new page and (b) campaign as experimental factors, and the donation amount as the dependent variable: (1) no main effect of old/new page, (2) a significant main effect of campaign (illustrated), (3) no interaction.

CampaignEffect.png (432×700 px, 27 KB)

  • Binary Logistic Regression w. (a) old/new page and (b) campaign as experimental factors, and opt_in as the dependent variable: (1) the old page increases the probability of opt_in relative to the new page; (2) almost all campaigns that have brought about a higher donation amount on average also have a higher probability of inducing opt_in. (A sketch of these pooled models follows below.)
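
A minimal sketch of these pooled models, on made-up data (column and level names are illustrative; this is not the notebook code).

```
set.seed(3)
pooled <- data.frame(
  amount   = rlnorm(600, 3, 1),
  opt_in   = rbinom(600, 1, 0.4),
  page     = factor(sample(c("old", "new"), 600, replace = TRUE)),
  campaign = factor(sample(paste0("c", 1:5), 600, replace = TRUE))
)

# Two-factor independent ANOVA: old/new page, campaign, and their interaction
summary(aov(amount ~ page * campaign, data = pooled))

# Binary logistic regression with both factors as predictors of opt_in
summary(glm(opt_in ~ page + campaign, data = pooled, family = binomial()))
```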

@Tobias_Schumann_WMDE @Jan_Dittrich @kai.nissen

Membership Data Set, campaign-wise analyses

  • Non-parametric test (Mann-Whitney U) implies no differences between the old and new page in membership fee.
  • ANOVA (independent 2x2 design, experimental factors new/old page and campaign): no effects reach statistical significance (i.e. no differences between new and old, and no significant effect of campaign; no interaction).

In other words, the results for both data sets do not differ if we switch from cross-campaign (i.e. every campaign analyzed separately) to campaign-wise (i.e. data from all campaigns pooled together) analysis. No effect of old vs. new page. The only useful finding is the effect of campaign on the donation amount reported in T195242#4301115.

  • Visualizations/EDA pending, to be delivered.

@Tobias_Schumann_WMDE @Jan_Dittrich @kai.nissen

The Exploratory Data Analysis visualizations are found in the Exploratory Data Analysis sections for the Donations and the Membership data sets respectively.

I have focused here on visualizing the distributions - per the experimental old vs. new factor, as well as per campaign. In my judgment, the EDAs are truly illustrative with respect to the results of the previously reported statistical hypothesis tests, which revealed no effects of the experimental factors across the campaigns - as well as campaign-wise.

In my judgment, you should not focus on quantitative results beyond this point. Of course, there are always some additional analyses that could be performed. However, after what I have seen thus far, I really do not think that anything beyond what we already have could provide any additional insight.

You have mentioned some "correlations" in the data sets that could be interesting to see. If you could be a bit more specific about what you mean by "correlations" in the context of the present analysis, I could probably provide them as well. But I think that by comparing the boxplots - or simply by looking at the distributions of amounts (be it donations or membership fees) across the campaigns - you will see why we have observed a consistent no-effect finding.

I find the following comments from @Tobias_Schumann_WMDE very helpful as guidance for the analytical work:

However, there were important differences between the campaigns: different banners leading to the landing page (I would say less important to consider) and different webpages / different devices where the banners were displayed. The campaigns were shown on de.wikipedia.org (desktop), en.wikipedia.org (desktop), wikipedia.de (desktop) and de.m.wikipedia (one campaign for all mobile devices except iPad and one campaign for iPad). So the big difference, I would say, is the different devices, e.g. the screen sizes being used [...] And that's why we usually analyse each campaign by itself. Finally, the new landing page - which was especially designed for the mobile experience - could perform better on mobile but worse on desktop. Still, it is very interesting to see that you couldn't find a difference in performance throughout all campaigns.

but you know, as I do, that strict analytical procedures rely on observable data and well-defined experimental designs - while the data sets that were provided contain no information on the type of device. In that respect, having richer data sets in the future could perhaps help us understand the effects of our campaigns - or their failures - better. Having a failed campaign is not necessarily bad, as long as we can learn something from it.

Please let me know if there are any questions that you might have with respect to what has been provided thus far, or if you think that any of the reported findings call for additional interpretation.
