
Daily Reporting: New Editors Thank You campaign 2019/2020
Closed, Resolved · Public

Description

Reporting should start on January 1st, 2020. The report must include all activity from the beginning of the campaign, whose estimated start is also January 1st, 2020.

We will reuse the landing page we used in our last campaigns:

landing page link
https://de.wikipedia.org/wiki/Wikipedia:Wikimedia_Deutschland/LerneWikipedia

campaign tag
There will be two banners with two different messages.
One message will address recruiting new editors. Note: only this banner is relevant for tracking.

The link to the New Editors Landing Page should contain the following campaign tag:
Banner: ?campaign=WMDE_2019_2020_thx (banner text to learn wikipedia)

Banner Name	Campaign Tag
WMDE_2019_2020_thx_dskt_ctrl	WMDE_2019_2020_thx
WMDE_2019_2020_thx_ipad_ctrl	WMDE_2019_2020_thx
WMDE_2019_2020_thx_mob_ctrl	WMDE_2019_2020_thx

landing page including the campaign tag
Banner: https://de.wikipedia.org/wiki/Wikipedia:Wikimedia_Deutschland/LerneWikipedia?campaign=WMDE_2019_2020_thx
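For reference, the tag is just a query parameter appended to the landing-page URL; a minimal sketch of building and recovering it (the helper names are hypothetical, not part of the campaign code):

```python
from urllib.parse import parse_qs, urlencode, urlparse

LANDING_PAGE = "https://de.wikipedia.org/wiki/Wikipedia:Wikimedia_Deutschland/LerneWikipedia"

def tagged_url(campaign_tag):
    # Build the landing-page URL carrying the campaign tag (hypothetical helper).
    return LANDING_PAGE + "?" + urlencode({"campaign": campaign_tag})

def extract_tag(url):
    # Recover the campaign tag from a request URL, e.g. when analyzing logs.
    return parse_qs(urlparse(url).query).get("campaign", [None])[0]

url = tagged_url("WMDE_2019_2020_thx")
print(extract_tag(url))  # WMDE_2019_2020_thx
```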

Campaign start: 01.01.2020

Note: the tracking ticket is T240361.

Event Timeline

GoranSMilovanovic renamed this task from Campaign report to Daily Reporting: New Editors Thank You campaign 2019/2020. Dec 10 2019, 10:47 PM
GoranSMilovanovic updated the task description.

Hey Goran, as agreed in our call today, it would be great if you could post the daily numbers here starting January 2nd, in addition to sending them via e-mail (just to keep all parties involved, namely @Verena, @tmletzko, and @kai.nissen, on the same page). Thanks!

@kai.nissen @Janina_Ottma_WMDE @Verena @tmletzko

The analytics code for this campaign is in place.

We can begin testing anytime. Please let me know when some test data have been produced for:

a. campaign banners
b. campaign pageviews, and
c. campaign user registrations.

Thank you.

@kai.nissen @Janina_Ottma_WMDE @Verena @tmletzko

The Daily Reporting Spreadsheet for this campaign is ready.

It is shared with those of you whose emails I know, so please share the doc with any other colleagues who need to have access. Thanks.

Update for 2020/01/03 is ready.
No new user registrations on January 3.

Update for 2020/01/04 is ready.
No new user registrations on January 4.

Updates for 2020/01/05, 2020/01/06, and 2020/01/07 are now included.

Updates for 2020/01/07 and 2020/01/08 included; no new user registrations since January 5.
@Janina_Ottma_WMDE When does this campaign end?

Update for 2020/01/10 is in the Spreadsheet; one new user registration on January 10.
@Janina_Ottma_WMDE Please let me know how long we will run this campaign. Thank you!

@Janina_Ottma_WMDE Please let me know how long we will run this campaign. Thank you!

Thanks a lot for the latest updates! As far as I know, Fundraising planned to run the campaign until approximately 2020/01/20. Could you please confirm if this is still the plan, @tmletzko?

As you can see on Meta (CentralNotice), the campaign is set to run until 01-21. We had to take down the iPad campaign yesterday. Not sure if that one will be reactivated.

@Janina_Ottma_WMDE @tmletzko Thank you!

In the meantime, the 2020/01/12 update was produced. No new user registrations.

@Janina_Ottma_WMDE The 2020/01/13 update is ready, no new user registrations.

@Janina_Ottma_WMDE The 2020/01/14 and 2020/01/15 updates are ready, no new user registrations.

@Janina_Ottma_WMDE The 2020/01/16 update is complete; two new user registrations.

@Janina_Ottma_WMDE The 2020/01/17 update is complete; no new user registrations.

@Janina_Ottma_WMDE The 2020/01/18 and 2020/01/19 updates are ready; no new user registrations.

From T240361: the campaign was disabled yesterday at 18:00 -> on to the Final Report now.

@Ragesoss @Janina_Ottma_WMDE

We used the Training Modules for the WMDE Thank You 2019/2020 campaign and need the data set.
The campaign ran between January 1 and January 20, 2020.

@Janina_Ottma_WMDE should be able to let us know what training modules were used (this info is not currently available to me; I've checked the campaign tracking ticket T240361 and the Campaign Concept Document). Please advise.

@Ragesoss When we learn what training modules were used in this campaign, I hope you will be able to share the data sets with me via email (goran.milovanovic_ext@wikimedia.de)? Thank you.

@Janina_Ottma_WMDE @kai.nissen

Regarding the banner expansion clicks and banner closing rates, I've checked the WMDEBannerEvents schema as suggested in T240361#5744403; both available tables are empty. From stat1004:

mysql --defaults-file=/etc/mysql/conf.d/analytics-research-client.cnf -h analytics-slave.eqiad.wmnet -A
use log;
show tables like '%WMDEBannerEvents%';
+------------------------------------+
| Tables_in_log (%WMDEBannerEvents%) |
+------------------------------------+
| WMDEBannerEvents_18193948          |
| WMDEBannerEvents_18437830          |
+------------------------------------+
select count(*) from WMDEBannerEvents_18193948;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
select count(*) from WMDEBannerEvents_18437830;
+----------+
| count(*) |
+----------+
|        0 |
+----------+

Please advise.

@Ragesoss When we learn what training modules were used in this campaign, I hope you will be able to share the data sets with me via email (goran.milovanovic_ext@wikimedia.de)? Thank you.

Yep, give me a ping once you know the training modules and I will send you the data like before.

Hi @kai.nissen, could you give us an info on that, please?

@Janina_Ottma_WMDE should be able to let us know what training modules were used (this info is not currently available to me; I've checked the campaign tracking ticket T240361 and the Campaign Concept Document). Please advise.

Hello @Ragesoss, thanks for providing the data on how many people began and completed the following training modules:

  1. Editiergrundlagen: https://outreachdashboard.wmflabs.org/training/wikipedia-editieren/editieren-basiswissen
  2. Wikipedia Grundlagen: https://outreachdashboard.wmflabs.org/training/wikipedia-editieren/wikipedia-basiswissen
  3. Artikel und Quellen bewerten: https://outreachdashboard.wmflabs.org/training/wikipedia-editieren/artikel-bewerten
  4. Quellcode und Diskutieren in der Wikipedia: https://outreachdashboard.wmflabs.org/training/wikipedia-editieren/diskutieren-basiswissen

Is it possible to get the stats for both logged-in and non-logged-in users?

thanks a lot!

@GoranSMilovanovic @Janina_Ottma_WMDE
The data can be retrieved from stat1007 using Hive, e.g.:

hive -e 'USE event; SELECT day, event.bannerName, event.bannerAction, COUNT(*) FROM wmdebannerevents WHERE year = 2020 AND month = 1 AND (event.bannerName LIKE "WMDE%" OR event.bannerName LIKE "WPDE%") GROUP BY day, event.bannerName, event.bannerAction;' | sed 's/[\t]/,/g' > thankyou_2020.csv

BUT: apparently, we had a mix-up of campaign tags when redeploying the banners on Jan 2. The process of storing the campaign tag in a cookie and retrieving it when registering an account broke, because the tag did not start with "WMDE_". Taking this into account, we should be able to determine the number of banner clicks/page visitors on Wikipedia:LerneWikipedia, but we do not know the number of registrations that followed a desktop banner click.

@kai.nissen @GoranSMilovanovic sorry, but this is not clear to me at this point - is there any way to restore the missing data?

@Janina_Ottma_WMDE From what I see in the data, I can confirm what @kai.nissen is saying in the following way: the only desktop registrations we have occurred on the first day of the campaign. I will re-check the data sets, but I am pretty sure that this is the case. I should also be able to deliver the Campaign Report tonight, and then we can discuss it whenever you are ready.

@Janina_Ottma_WMDE The data cannot be restored, because there never was any data. Mixing up the campaign tags for the banner on Jan 2nd meant the value was not stored in a cookie and was thus lost when people registered an account. We stopped "Adam's Hack" from working, because it relies on the prefix "WMDE_", which was changed to "WPDE_".
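The failure mode described here can be illustrated with a toy version of the prefix check (a sketch with hypothetical names; the actual cookie logic lives in the banner code):

```python
def store_campaign_cookie(tag, cookie_jar):
    # Assumed logic: the registration-attribution hack only persisted tags
    # with the expected prefix, so "WPDE_"-prefixed tags were silently dropped.
    if tag.startswith("WMDE_"):
        cookie_jar["campaign"] = tag

cookies = {}
store_campaign_cookie("WMDE_2019_2020_thx_dskt_ctrl", cookies)
print(cookies)  # {'campaign': 'WMDE_2019_2020_thx_dskt_ctrl'}

cookies = {}
store_campaign_cookie("WPDE_2019_2020_thx_dskt_ctrl", cookies)
print(cookies)  # {} - tag lost, so the registration cannot be attributed
```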

@Janina_Ottma_WMDE @kai.nissen

Because of the WMDE_ vs. WPDE_ mix-up, I will have to re-run all data collection procedures.
Reporting back ASAP. @Janina_Ottma_WMDE The Campaign Report will be delivered tomorrow - re-running data collection will take some time.

@GoranSMilovanovic thank you for the update. Just to confirm: is https://docs.google.com/spreadsheets/d/1ibDHIS0v5kNyjlJEY2DmoylmG4l2VTZxBRkCWF6u1-I/edit?ts=5e0e5b44#gid=0 the finished report? Where do I find the data concerning the training modules?

@Janina_Ottma_WMDE Here is the Interim Campaign Report for the WMDE Thank You 2020 Campaign.

It "interim" because I would like to discuss the Report with you and see if there is anything unclear or if anything needs to be added before we call it "final".
In general, you are interested in what happens in Section 1 and onward. Section 0 is just code for data collection and aggregation.

@Janina_Ottma_WMDE: sorry - I forgot to submit the report file here ^^ T240351#5833762.
It is an R Markdown Notebook - essentially an HTML file - with all the data, cross-tabulations, and visualizations included. Just download it and open it in your browser.
We use the Google Spreadsheets only for daily reporting during the campaign.
When you read the Report, please get back in touch with any questions that you might have. The A/B tests are not included (thus: Interim Report), for example, because many are possible, so we need to discuss exactly which tests we want to run (i.e. which tests make sense).
It would be best to have a 1:1 Google Hangouts session to discuss the Report, especially if this is your first WMDE Banner Campaign. Thanks!

no problem, @GoranSMilovanovic - I already wondered where the promised diagrams went ;) Let me take a look at this today so I can get back to you with all open questions. Will schedule a call this week.

@Janina_Ottma_WMDE @WMDE-leszek

As time passes I am getting more and more focused on other projects, and some day it might be difficult (and more time-consuming) for me to get back into this campaign and provide additional analysis.
@Janina_Ottma_WMDE - what are the chances to meet and discuss what else needs to be done in the near future?

@GoranSMilovanovic Hi Goran, Janina is out of office until the end of February. Can it wait until then? Otherwise: could you quickly give me an update or a link on what the open questions are?

@Christine_Domgoergen_WMDE The end of February is fine, thank you for getting back to me on this one.

@GoranSMilovanovic
Hi Goran,
I will take over the wrap-up of the campaign from Janina. I am checking all the documents and the final report and trying to understand all the details, but some questions have come up. Could you help me on those? Thank you!

  1. In the report I see two banner names, one ending in _ctrl as defined in this task and one ending in _var --> what is the difference, and where does the second banner come from, do you know? I can find nothing about this in the tracking concept.
  2. In the concept I found the information that there are two banners, one to attract new members (responsibility of the fundraising department) and one to attract new editors (our responsibility). As I understand it, the banner impressions in the report only refer to our banner (tagged with "thx"), correct?
  3. I need to calculate the closing and extension rates of the banner from the banner actions table - do you have the data from 1.14 in a spreadsheet?
  4. User edits, 3.1: do I understand correctly that in total four users made edits, one user in each edit class?
  5. User edits: until which date did you check the user edits after the end of the campaign?
  6. The report is still named "interim report" - are you still waiting for more data, or is there another reason for the naming?

Thank you!

Edit: Found the answer for number 1. :-)

@GoranSMilovanovic
Hi Goran,
one other question has come up about the page views. I was wondering if the numbers in the spreadsheet are correct. For example, on some days there are no page views from desktop (e.g. the 5th of January), which does not seem very probable, and the report states otherwise if I read the numbers correctly. Could you update the numbers in the spreadsheet as well? I need them to calculate the results of the A/B test. Thank you!

@Christine_Domgoergen_WMDE @WMDE-leszek

I am currently working on T239200, T239199, and there are things that have a set deadline for tomorrow (e.g. updating the Wikidata Languages Landscape system). As soon as I can switch focus to this campaign I will get back to you.

Hi @Christine_Domgoergen_WMDE

regarding T240351#5911537:

In the concept I found the information that there are two banners, one to attract new members (responsibility of the fundraising department) and one to attract new editors (our responsibility). As I understand it, the banner impressions in the report only refer to our banner (tagged with "thx"), correct?

Yes.

I need to calculate the closing and extension rates of the banner from the banner actions table - do you have the data from 1.14 in a spreadsheet?

I will deliver the spreadsheet during the day; however, I can calculate the rates for you if you prefer.

User edits, 3.1: do I understand correctly that in total four users made edits, one user in each edit class?

Yes.

user edits: until which date did you check the user edits after the end of the campaign?

The user edits were considered from 2020/01/01 until the time the interim report was produced, 2020/01/27 (see T240351#5833762).

The report is still named "interim report" - are you still waiting for more data, or is there another reason for the naming?

It is still named "interim" because I have waited to hear from @Janina_Ottma_WMDE if there is anything else to be done here. We were about to discuss what A/B tests are needed, for example.

@Christine_Domgoergen_WMDE Regarding T240351#5919056:

I was wondering if the numbers in the spreadsheet are correct. For example, on some days there are no page views from desktop (e.g. the 5th of January), which does not seem very probable, and the report states otherwise if I read the numbers correctly. Could you update the numbers in the spreadsheet as well? I need them to calculate the results of the A/B test. Thank you!

Could be. I will check it out during the day and then get back to you. The spreadsheets are always used for daily reporting during the campaign; what is in the report should be taken as the de facto information on the campaign. However, if the spreadsheets are what you need, no problem, just give me some time to generate them for you.

Regarding the banners, and since you were not involved in this campaign (at least not that I know of), please also take this into consideration: T240351#5820231

Getting back to you soon.

@Christine_Domgoergen_WMDE

I need to calculate the closing and extension rate of the banner from the banner actions table - do you have the data from 1.14 in a spreadsheet?

Here it goes:

@Christine_Domgoergen_WMDE Also, I have noticed that some charts are not properly sorted along the horizontal (date) axis.

Here is an updated version of the report with the charts properly sorted. No modifications to the data sets were made.

@GoranSMilovanovic

Great, thank you for the updates.

I will deliver the spreadsheet during the day; however, I can calculate the rates for you if you prefer.

Could you really do this? This would be very helpful indeed.

Could be. I will check it out during the day and then get back to you. The spreadsheets are always used for daily reporting during the campaign; what is in the report should be taken as the de facto information on the campaign. However, if the spreadsheets are what you need, no problem, just give me some time to generate them for you.

Perfect.

@Christine_Domgoergen_WMDE

In the updated version of the report (shared here), the banner action rates are included in section 1.1.4, immediately following the full data set for banner actions.

@Christine_Domgoergen_WMDE

The reporting spreadsheets have been tidied up and are now fully consistent with the data in the Report.

@GoranSMilovanovic
Great, thank you!

In the updated version of the report (shared here), the banner action rates are included in section 1.1.4, immediately following the full data set for banner actions.

Two questions: 1. I assume the closing rate is not a percentage yet and I need to derive the percentage from it, correct? That would be 58.7% and 27.1%, right? 2. And I was wondering if you could also quickly calculate the closing and expansion rates by platform and banner version from the numbers in charts 1.1.4 and 1.1.5? This would be great.

Also, I have now read about the mix-up of the campaign tags. Please let me know if I understand correctly: from January 2nd onward we have no data on user registrations from desktop, correct? So there might be more registrations from desktop after day one, but we do not know about them. This does not concern mobile or iPad, or the tracking of page views.

Concerning Chart 1.2.3, Pageviews Overview: totals per Tag/Page: what is the difference between the tags with _link_ and without? Can I just sum them up for the calculation of conversion rates (impressions --> page views) for the different banner versions?

An addition concerning Chart 1.2.3: the tag thx_link_dskt_ctrl is missing - is there a reason? And when I sum up the numbers, 2 page views are missing; I get a total of 2213 instead of 2215 - maybe there is a connection? Thank you for checking!

@GoranSMilovanovic

Hi Goran,
I got to the calculation of the numbers for the A/B-Test and have three more questions:

  1. Can we see how many of the registrations came from which banner version?
  2. Can we see which banner version the users with edits came from?
  3. Can we see how the closing rate after the extension of the banner varies between the two?

@Christine_Domgoergen_WMDE

Regarding T240351#5936973:

Two questions: 1. I assume the closing rate is not a percentage yet and I need to derive the percentage from it, correct? That would be 58.7% and 27.1%, right?

The rates are now expressed as percentages (see the updated Report below).

  1. And I was wondering if you could also quickly calculate the closing and expansion rates by platform and banner version from the numbers in charts 1.1.4 and 1.1.5? This would be great.

Of course. You will find the banner action rates now included in tables immediately below charts 1.1.4 and 1.1.5, respectively.
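For transparency, such rates can be computed from grouped banner-action counts along these lines (a sketch; the grouping keys, action names, and numbers below are illustrative, not the actual campaign data):

```python
# Toy banner-action counts per (platform, version); the numbers are made up.
actions = {
    ("dskt", "ctrl"): {"impression": 1000, "close": 300, "expand": 120},
    ("dskt", "var"):  {"impression": 900,  "close": 250, "expand": 150},
}

def action_rate(counts, action):
    # Rate of a banner action as a percentage of impressions for that group.
    return 100.0 * counts[action] / counts["impression"]

for group, counts in sorted(actions.items()):
    print(group,
          "close:", round(action_rate(counts, "close"), 1),
          "expand:", round(action_rate(counts, "expand"), 1))
```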

Also, I have now read about the mix-up of the campaign tags. Please let me know if I understand correctly: from January 2nd onward we have no data on user registrations from desktop, correct? So there might be more registrations from desktop after day one, but we do not know about them. This does not concern mobile or iPad, or the tracking of page views.
Well, according to what @kai.nissen says in T240351#5820231

Taking this into account, we should be able to determine the number of banner clicks/page visitors on Wikipedia:LerneWikipedia, but we do not know the number of registrations that followed a desktop banner click.

I would say that your understanding is correct. I do not know when this January 2nd bug was fixed - but @kai.nissen might know.

Concerning Chart 1.2.3, Pageviews Overview: totals per Tag/Page: what is the difference between the tags with _link_ and without? Can I just sum them up for the calculation of conversion rates (impressions --> page views) for the different banner versions?

If I understand the concept of this campaign correctly, there were "two places" for users to click: (a) the banner itself, or (b) some link inside the banner. Please check with your campaign team; they should be able to explain more precisely.

Regarding T240351#5937010:

An addition concerning Chart 1.2.3: the tag thx_link_dskt_ctrl is missing - is there a reason? And when I sum up the numbers, 2 page views are missing; I get a total of 2213 instead of 2215 - maybe there is a connection? Thank you for checking!

You are correct in your observation that we have no pageviews recorded for thx_link_dskt_ctrl, but we do have some for WPDE_thx_link_dskt_ctrl - which is, in my understanding, the "wrong" campaign tag (see T240351#5826946) used on January 2nd.

And when I sum up the numbers, 2 page views are missing; I get a total of 2213 instead of 2215 - maybe there is a connection? Thank you for checking!

I am not sure that the two are connected in any way, because when I sum the numbers (Pageviews tab in the spreadsheet) I get 2215. Regarding the Page Views per Tag tab in the spreadsheet, which was not created by me: please check your numbers carefully, make sure that you have entered all the numbers correctly, and if it still does not sum up to 2215 then I will have to check what happened in the code. But it would be very strange if the numbers did not sum up correctly - the per-tag pageview data are obtained from the same data set as the overall pageviews reported.
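The consistency check in question (per-day totals vs. per-tag totals built from the same pageview rows) amounts to comparing two marginal sums; a sketch with toy records:

```python
from collections import Counter

# Toy pageview records (day, tag); both aggregates come from the same rows,
# so their grand totals must agree.
rows = [("2020-01-01", "thx_dskt_ctrl"),
        ("2020-01-01", "thx_mob_ctrl"),
        ("2020-01-02", "thx_dskt_ctrl")]

per_day = Counter(day for day, _ in rows)
per_tag = Counter(tag for _, tag in rows)

print(sum(per_day.values()) == sum(per_tag.values()) == len(rows))  # True
```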

Updated report:

@Christine_Domgoergen_WMDE

Regarding T240351#5937402:

I got to the calculation of the numbers for the A/B-Test and have three more questions:

Let me remind you that we once standardized the A/B tests on Bayesian procedures for your team, in accord with what is used at WMF - that would be the Autumn Banner Campaign 2018, if I remember correctly. I hope the A/B tests that you have obtained for this campaign work the same way - some day someone might wish to compare the effectiveness of our campaigns over time.

can we see how many of the registrations came from which banner version?

That would be Chart 2.1 Registrations per tag and day - as well as the table immediately following that chart - in the Report.

Can we see which banner version the users with edits came from?

This is now included under 3.1 User edits: distribution - the second table in this section.

can we see how the closing rate after the extension of banner varies between the two?

I am not sure I understand the question as it is currently formulated. What exactly would you like to learn?

Updated report:

@tmletzko
Hi Till,
can you help us out here? Does the tag with _link_ refer to the link in the text or the link in the button? (e.g. thx_dskt_ctrl vs. thx_link_dskt_ctrl)

Concerning Chart 1.2.3, Pageviews Overview: totals per Tag/Page: what is the difference between the tags with _link_ and without? Can I just sum them up for the calculation of conversion rates (impressions --> page views) for the different banner versions?

If I understand the concept of this campaign correctly, there were "two places" for users to click: (a) the banner itself, or (b) some link inside the banner. Please check with your campaign team; they should be able to explain more precisely.

@GoranSMilovanovic
Hi Goran,
thank you! This is all super helpful.

I am not sure that the two are connected in any way, because when I sum the numbers (Pageviews tab in the spreadsheet) I get 2215. Regarding the Page Views per Tag tab in the spreadsheet, which was not created by me: please check your numbers carefully, make sure that you have entered all the numbers correctly, and if it still does not sum up to 2215 then I will have to check what happened in the code. But it would be very strange if the numbers did not sum up correctly - the per-tag pageview data are obtained from the same data set as the overall pageviews reported.

Yes, in the spreadsheet it is 2215, but when I sum up the numbers from Chart 1.2.3 I only get 2213 (I added an extra tab in the spreadsheet here).

Let me remind you that we once standardized the A/B tests on Bayesian procedures for your team, in accord with what is used at WMF - that would be the Autumn Banner Campaign 2018, if I remember correctly. I hope the A/B tests that you have obtained for this campaign work the same way - some day someone might wish to compare the effectiveness of our campaigns over time.

What do you mean by that? Is there documentation about the standards, do you know?

I am not sure I understand the question as it is currently formulated. What exactly would you like to learn?

I would like to know what the closing rate of the different banner versions (ctrl vs. var) is after the banner was extended. So after the users saw the different banner texts, did they react differently in regard to closing the banner? Is it even possible to do a cross analysis of that? Please do not invest a lot of time in this; if it is a lot of work, we won't need it :-)

@Christine_Domgoergen_WMDE

What do you mean by that? Is there documentation about the standards, do you know?

No documentation - we had only one campaign where A/B tests were requested. However, they were performed in R following my analysis of what WMF does in Python, and I took the same (very much standard) statistical approach in order to keep us aligned (Bayesian A/B tests from the Binomial distribution).
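For anyone re-implementing this approach, a minimal sketch of a Binomial A/B test with Beta posteriors (uniform Beta(1, 1) priors assumed; the counts below are illustrative, not campaign data - the original analysis was done in R):

```python
import random

def prob_b_beats_a(succ_a, n_a, succ_b, n_b, draws=20_000, seed=42):
    # Under a Beta(1, 1) prior, the posterior for a Binomial rate is
    # Beta(successes + 1, failures + 1); compare the groups by sampling.
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        pa = rng.betavariate(succ_a + 1, n_a - succ_a + 1)
        pb = rng.betavariate(succ_b + 1, n_b - succ_b + 1)
        wins += pb > pa
    return wins / draws

# e.g. registrations per banner impression, ctrl vs. var (made-up numbers):
print(prob_b_beats_a(12, 5000, 21, 5000))
```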

However, here is the report for that one campaign - that would be Autumn 2017 - with A/B tests on user registrations and user edits, if you wish to take a look:

I would like to know what the closing rate of the different banner versions (ctrl vs. var) is after the banner was extended.

Let me check if I understand correctly: can the banner be closed only after it is extended, or not? This is important because if the answer is "yes", then our data sets can be used to extract the data for the "cross analysis" that you need, but if the answer is "no" then our data cannot be used for this type of analysis (because then we cannot establish a temporal relation, expand_banner -> close_banner, from the data set).

@GoranSMilovanovic

Let me check if I understand correctly: can the banner be closed only after it is extended, or not? This is important because if the answer is "yes", then our data sets can be used to extract the data for the "cross analysis" that you need, but if the answer is "no" then our data cannot be used for this type of analysis (because then we cannot establish a temporal relation, expand_banner -> close_banner, from the data set).

No, the banner can be closed right away or after it is extended. See screenshot.

Okay, then there is no task for you :-)

Thank you for the report, I will have a look at it and get back to you, if I have further questions.

@GoranSMilovanovic
Hi Goran,
I know it's just a little thing, but could you have a quick look at this:

Yes, in the spreadsheet it is 2215, but when I sum up the numbers from Chart 1.2.3 I only get 2213 (I added an extra tab in the spreadsheet here).

@Christine_Domgoergen_WMDE Found it: there were two pageviews marked by the following tags:

?&piwik_campaign=WMDE_2019_2020_thx_dskt_btn_ctrl&piwik_kwd=ty01-ba-200101
?&piwik_campaign=WMDE_2019_2020_thx_dskt_btn_var&piwik_kwd=ty01-ba-200101

These two were counted in the pageviews-per-day data set (the one that sums to 2215), but not in the per-tag/per-page data set (the one that sums to 2213).
The reason they were not counted in the latter data set: I assumed that we don't want any pageviews from Piwik counted - because it's probably us - so I filtered them out:

dataSet <- filter(dataSet,
                  # match the literal string "?&piwik_"; fixed = TRUE avoids
                  # the leading "?" being interpreted as a regex quantifier
                  !grepl("?&piwik_", dataSet$Tag, fixed = TRUE))

These two pageviews were observed on 2020/01/02, so the pageviews per day data set needs to be corrected in the spreadsheet: done.
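The same literal-substring filter can be sketched outside R as well; with a toy tag list (shortened for illustration) it reproduces the 2-pageview difference:

```python
tags = [
    "?campaign=WMDE_2019_2020_thx_dskt_ctrl",
    "?campaign=WMDE_2019_2020_thx_mob_ctrl",
    "?&piwik_campaign=WMDE_2019_2020_thx_dskt_btn_ctrl&piwik_kwd=ty01-ba-200101",
    "?&piwik_campaign=WMDE_2019_2020_thx_dskt_btn_var&piwik_kwd=ty01-ba-200101",
]

# Drop pageviews carrying a Piwik campaign tag - most likely internal tests.
kept = [t for t in tags if "?&piwik_" not in t]
print(len(tags), len(kept))  # 4 2
```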

@GoranSMilovanovic
Perfect, thank you for the information! There is nothing else needed, so I will close the task now :-)