
Thank-You-Campaign 2018: Tracking for Training-Modules possible? How?
Closed, Resolved | Public

Description

@GoranSMilovanovic @Addshore

Hi guys,

We plan to create German versions of Wikipedia tutorials that already exist in English, namely the 3 modules (Basics) in this dashboard: https://outreachdashboard.wmflabs.org/training/editing-wikipedia

We would like to track the following:

  • Who did which module (user ID)
  • Exit points of modules per user ID
  • Time frame in which the users completed the steps

Could you check whether and how this is possible?

For a better understanding, here is the user journey that we created yesterday:

User Journey.jpg (720×960 px, 64 KB)

Here you can also find more technical information on how to create training modules. Maybe that can help you: https://meta.wikimedia.org/wiki/Training_modules/Documentation

Event Timeline

It might be better to first contact the maintainers: https://meta.wikimedia.org/wiki/Programs_%26_Events_Dashboard#Technical_development
It would likely take us a lot of time to find the code, dig through it, learn how it works, etc., when the functionality you ask for might already exist.

@Addshore

Thank you for your quick reply. Tobi told me that you do not have enough resources for this task, so I will unsubscribe you from it.

@Stefan_Schneider_WMDE Let me take a closer look at the task quickly, and I'll let you know what I think ASAP.

@Stefan_Schneider_WMDE It essentially boils down to what @Addshore has already noted above:

  • then I see no problem to deliver the data and provide any useful analytics in relation to this that we might think of.

However, we do not know much about the https://outreachdashboard.wmflabs.org/ system, for example, whether they implement any tracking at all.
The best approach would be to get in touch with them and ask a simple question:

  • Do you collect any data on the registered users who complete the tutorial, and if yes, where do you store the data and how can we access them?

Hi @Ragesoss

Maybe you can help us here. As described in T181305#3788308:

do you collect any data on the registered users who complete the tutorial, and if yes, where do you store the data and how can we access them?

Could you give us more information on that? Are there tables where this data is stored, and if so, how can we access them?
This may also relate to the TrainingModulesUsers table you mentioned in the Opt-In issue.

@Stefan_Schneider_WMDE: Yes, the data on logged-in users who complete (or partially complete) training modules is stored in the TrainingModulesUsers table. Currently, there's no way to access it for an arbitrary user, so such a feature would need to be added to the Dashboard codebase. I've created an issue to track it here: https://github.com/WikiEducationFoundation/WikiEduDashboard/issues/1530

Implementing a JSON API to provide that data would be pretty easy for a dev who knows some Ruby, if you just want it in machine-readable form.
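For illustration only, here is a minimal sketch of what a machine-readable export of that kind could look like, using plain Ruby and the standard library. The `ProgressRecord` struct, its field names, and the `progress_to_json` helper are assumptions made up for this example; they are not the Dashboard's actual models or API, and the username is invented.

```ruby
require 'json'

# Hypothetical stand-in for one row of training progress
# (not the Dashboard's real TrainingModulesUsers model).
ProgressRecord = Struct.new(:username, :training_module,
                            :last_slide_completed, :completed_at)

# Serialize a list of progress records to a JSON array of objects.
def progress_to_json(records)
  JSON.generate(records.map(&:to_h))
end

records = [
  ProgressRecord.new('ExampleUser', 'editieren-basiswissen',
                     'das-wars', '2017-12-07 09:58:19')
]

puts progress_to_json(records)
```

In the real application, the records would presumably come from a database query and be served by a controller; that part is left out here because it depends on the Dashboard codebase.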

@Ragesoss: Thank you for the quick help. Could you provide more information about the TrainingModulesUsers table? @GoranSMilovanovic is our Data Analyst on this topic and maybe he could find his own ways to work with this data. What would need to happen for us to have access to the data? Where can we find this table?

@GoranSMilovanovic: What else do you need?

@Stefan_Schneider_WMDE At the moment, I need to learn where the TrainingModulesUsers table lives. It's not listed as a Research Schema (or not yet): https://meta.wikimedia.org/wiki/Research:Schemas

That table is not in a publicly-accessible wmflabs database. It's part of the database of the app itself (running on outreachdashboard.wmflabs.org but only accessible by the app).

@Ragesoss Does my NDA with Wikimedia help in that respect?

Is the resource accessible from wmflabs, i.e. should I assume that it can be found on tools.labsdb?

If both answers are yes, I guess it would be polite to ask if your team agrees that we use that table for analytic purposes in WMDE?

It's not accessible except from the toolforge server it runs on. It's not on tools.labsdb; it's just a MySQL database on a single toolforge cloud instance.

@Ragesoss In other words, we cannot access it one way or the other.

If my colleagues at WMDE start using the Programs & Events Dashboard in their campaigns, may I then ask your team to export some particular subsets from that database for campaign evaluation purposes? I would be responsible for analytics and for liaison with your team in that matter.

@Stefan_Schneider_WMDE

@GoranSMilovanovic sure, I think I can get that data for you without much trouble.

@Ragesoss Thanks a lot.

@Stefan_Schneider_WMDE When the time comes, and before the campaign begins, please let me know what we want to learn from the Training Modules, so that I can get in touch with them and see what their schemata look like and what data they store.

@GoranSMilovanovic: The following information would be of use:

  • Who did which module (user ID)
  • Exit points of modules per user ID
  • Time frame in which the users completed the steps

@Stefan_Schneider_WMDE Ok, so this is similar to what we do with Guided Tours then.

@Ragesoss Would it be possible to obtain the following data from your databases when our campaign starts running:

  • who did which module (user ID);
  • exit points of modules per user ID;
  • time frame in which the users completed the steps?

@GoranSMilovanovic: I can provide data like this:

username, training_module, last_slide_completed, module_completion_date
Stefan Schneider (WMDE), editieren-basiswissen, das-wars, 2017-12-07 09:58:19

We don't currently store any additional timestamp info beyond that.
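As a rough sketch of how an export in this format could be read back and summarized, the Ruby snippet below parses the four columns and separates finished from unfinished modules. The sample rows, usernames, and the slide name `hypothetical-slide` are invented for illustration, and treating an empty `module_completion_date` as "not completed" is an assumption, not something confirmed about the real data.

```ruby
require 'csv'
require 'time'

# Made-up sample in the format shown above (note the spaces after the commas).
SAMPLE = <<~DATA
  username, training_module, last_slide_completed, module_completion_date
  ExampleUserA, editieren-basiswissen, das-wars, 2017-12-07 09:58:19
  ExampleUserB, editieren-basiswissen, hypothetical-slide,
DATA

# Parse the export, stripping the padding around headers and values.
def parse_export(text)
  strip = ->(value) { value&.strip }
  CSV.parse(text, headers: true, header_converters: strip, converters: strip)
     .map(&:to_h)
end

rows = parse_export(SAMPLE)

# Assumption: a missing completion date means the module was not finished.
finished, unfinished = rows.partition { |row| row['module_completion_date'] }

finished.each do |row|
  done_at = Time.parse(row['module_completion_date'])
  puts "#{row['username']} completed #{row['training_module']} on #{done_at.strftime('%Y-%m-%d')}"
end

unfinished.each do |row|
  puts "#{row['username']} left #{row['training_module']} at slide #{row['last_slide_completed']}"
end
```

With only one timestamp per row, the time spent inside a module cannot be derived from this data; the last completed slide is the only exit-point information available.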

@Ragesoss Well, we will take what we can get, I guess. This sounds fine to me with respect to what @Stefan_Schneider_WMDE has asked for. Stay in touch.

@GoranSMilovanovic @Ragesoss
That sounds awesome! Could Goran have a sample beforehand, so he can test it? It would be very helpful to know that everything works before we start to communicate outcomes.
Thanks in advance!

@GoranSMilovanovic: I would like to test the data from the TrainingModulesUsers table beforehand. This week would be awesome. What do you think about that?

@Stefan_Schneider_WMDE I agree.

@Ragesoss Would it be possible to obtain a sample from your table so that we can prepare the R code for our campaign analytics? Thanks a lot!

@GoranSMilovanovic here's the data I just pulled from it.

And the script I used:

require 'csv'

# IDs of the training modules to export
module_ids = [40001, 40002, 40003]

# Header row, followed by one row per user and module
csv_data = [['username', 'training_module', 'last_slide_completed', 'module_completion_date']]

module_ids.each do |m_id|
  tm = TrainingModule.find(m_id)
  # All progress records for this module
  tmus = TrainingModulesUsers.where(training_module_id: m_id)
  tmus.each do |tmu|
    csv_data << [tmu.user.username, tm.slug, tmu.last_slide_completed, tmu.completed_at]
  end
end

# Write the collected rows to a CSV file
CSV.open('/home/ragesoss/wmde_training_data.csv', 'wb') do |csv|
  csv_data.each do |line|
    csv << line
  end
end

@Ragesoss Got it. Thanks a lot. I will keep in touch regarding the timing of the CSV exports when our campaign starts. Given that you will be my data holder in this case, I will do whatever I can to keep the burden on you as small as possible.

@GoranSMilovanovic: Thx for your support! I guess everything is working :) That's awesome!

@Stefan_Schneider_WMDE I hope you've browsed the sample data... From my viewpoint, it seems to be exactly what we need.

@Ragesoss @Stefan_Schneider_WMDE

One important thing: the data set that you have shared here includes user names. I don't think we should do that on Phabricator; however, I guess no problems arise here simply because you have shared test data only.

@Ragesoss I will ask for a data export from your training modules application database only once, at the end of the campaign. That will probably take place on January 2nd, but I still need to learn from my team the exact start and end dates of the campaign (it turns out these do not depend solely on their decisions). I will need the user names in order to match against the Guided Tour schemas, because two Guided Tours will be offered to our users from the training module. I have signed an NDA and I suggest that you send the data to me via e-mail: goran.milovanovic_ext@wikimedia.de. If the data set turns out to be too large for e-mail, we will figure out later where I could pick it up.

Thanks. Stay in touch.

So, if I understand correctly we won't be doing any actual guided tours this time?

Yes, there will be guided tours (see T182797), but they are included in the training modules and are not supposed to start automatically; instead, users will launch them by clicking on a link.

@Ragesoss Could I please ask for a data export from your Training Modules database on 15 January? We need the data from the beginning of 2018 until 15 January. I have signed an NDA and I suggest that you send the file (in case it is not too large) to goran.milovanovic_ext@wikimedia.de; otherwise, let me know what other ways of delivery you could use.

The three values of the slug field that you can use to filter the data for us at WMDE are:

  • wikipedia-basiswissen,
  • artikel-bewerten, and
  • editieren-basiswissen.

@Stefan_Schneider_WMDE This choice is based on the following source page; please confirm that this is what we need.

Thank you!
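A minimal sketch of the filtering described above, in plain Ruby, in case it helps to make the request concrete. The row layout (hashes with `:username`, `:training_module`, `:completed_at`), the helper name `select_campaign_rows`, and the sample rows are all hypothetical. Treating the window as inclusive of 15 January, and dropping rows without a completion timestamp because they cannot be dated, are assumptions of this sketch.

```ruby
require 'time'

# The three WMDE module slugs named in this thread.
WMDE_SLUGS = %w[wikipedia-basiswissen artikel-bewerten editieren-basiswissen].freeze

# Keep rows for the WMDE modules whose completion time lies within [from, to].
# Assumption: rows without a completion timestamp cannot be dated and are skipped.
def select_campaign_rows(rows, from:, to:)
  rows.select do |row|
    next false unless WMDE_SLUGS.include?(row[:training_module])
    next false if row[:completed_at].nil?

    Time.parse(row[:completed_at]).between?(from, to)
  end
end

# Invented sample rows for illustration.
rows = [
  { username: 'ExampleUserA', training_module: 'editieren-basiswissen', completed_at: '2018-01-05 10:00:00' },
  { username: 'ExampleUserB', training_module: 'some-other-module', completed_at: '2018-01-06 10:00:00' },
  { username: 'ExampleUserC', training_module: 'artikel-bewerten', completed_at: '2017-12-20 10:00:00' }
]

window_start = Time.parse('2018-01-01 00:00:00')
window_end = Time.parse('2018-01-15 23:59:59')

select_campaign_rows(rows, from: window_start, to: window_end).each do |row|
  puts row[:username]
end
```

If exit points of unfinished modules are also needed, the timestamp check would have to be relaxed, since those rows carry no date to filter on.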

I've emailed the data from P&E Dashboard to @GoranSMilovanovic .