Page MenuHomePhabricator

wmit-* account creation campaigns totals
Closed, ResolvedPublic

Description

WMIT has been testing some campaigns which send people to URLs like https://it.wikipedia.org/w/index.php?title=Speciale:Entra&returnto=Progetto%3ABiografia%2FDonne%2FScienza&type=signup&campaign=wmit-scienziate

As far as I understand, all the signups with a &campaign= parameter get logged, so it would be nice if we could get totals of registrations for all the wmit-* campaigns between September 2015 and December 2015.

The numbers are small: around 27k clicks and probably a maximum of 2k actual signups according to some control statistics. We'd like to have the true number to see how realistic our estimates are and for inclusion in our mid-year report.

Event Timeline

FedericoLeva-WMIT raised the priority of this task from to Needs Triage.
FedericoLeva-WMIT updated the task description. (Show Details)
Restricted Application added subscribers: StudiesWorld, Aklapper. · View Herald TranscriptJan 7 2016, 1:14 PM
Milimetric closed this task as Declined.Jan 28 2016, 6:24 PM
Milimetric claimed this task.
Milimetric added a subscriber: Milimetric.

We don't keep the raw data that far back. If you'd like to do this sort of query, you'd have to have access to the cluster and be ready to start running it before we delete the data.

We don't keep the raw data that far back. If you'd like to do this sort of query, you'd have to have access to the cluster and be ready to start running it before we delete the data.

Thanks for the answer. What do you mean by "raw data"? Is the ServerSideAccountCreation table purged regularly in a way that makes counting impossible? https://meta.wikimedia.org/wiki/Schema_talk:ServerSideAccountCreation says

Auto-purge just eventCapsule PII after 90 days, keep the rest indefinitely

which I'm not sure how to interpret.

We don't keep the raw data that far back. If you'd like to do this sort of query, you'd have to have access to the cluster and be ready to start running it before we delete the data.

Thanks for the answer. What do you mean by "raw data"? Is the ServerSideAccountCreation table purged regularly in a way that makes counting impossible? https://meta.wikimedia.org/wiki/Schema_talk:ServerSideAccountCreation says

Oh, I had forgotten about ServerSideAccountCreation. It is purged but it probably still has the data you're looking for. Do you have access to the analytics-store database? People usually log into it from stat1003. For info on that, see - https://wikitech.wikimedia.org/wiki/Analytics/Data_access#Access_to_WMF_machines

Auto-purge just eventCapsule PII after 90 days, keep the rest indefinitely

which I'm not sure how to interpret.

The event capsule consists of client IPs, user agents, basically sensitive data that we don't want to keep for a long period of time. So after 90 days, it's purged. But the rest of the data is kept and it seemed to me that's all you needed for your report. If you've never run a query against Event Logging data, I can help - just ping me in IRC after you have the proper access.

Ironholds set Security to None.Feb 2 2016, 4:09 PM
Ironholds removed a subscriber: Ironholds.

It's not currently planned for me to ask access to this data. We'd just need someone to run a SELECT COUNT(*) with the specified conditions, is that difficult?

mmm, I wouldn't normally do it, not my job kind of thing, because usually it turns into more than a simple select. But since you asked nicely ;)

mysql:research@analytics-store.eqiad.wmnet [log]>

 select event_campaign, count(*)
   from ServerSideAccountCreation_5487345
  where event_campaign like 'wmit-%'
    and timestamp between '20150901000000' and '20160101000000'
  group by event_campaign;

+----------------+----------+
| event_campaign | count(*) |
+----------------+----------+
| wmit-anpi      |        3 |
| wmit-facebook  |       22 |
+----------------+----------+
2 rows in set (10.64 sec)
Milimetric changed the task status from Declined to Resolved.Feb 2 2016, 4:39 PM
JAllemandou moved this task from Next Up to Done on the Analytics-Kanban board.Feb 8 2016, 6:41 PM