|Status|Assigned|Task|
|---|---|---|
|Open|Amire80|T78351 there should be a comparison of clicks count on interlanguage links on different platforms|
|Resolved|Amire80|T132425 Prepare infrastructure to measure interlanguage links clicks on desktop|
|Resolved|Amire80|T122479 schedule a daily run of ContentTranslation analytics scripts|
|Resolved|MoritzMuehlenhoff|T122524 Add amire80 to analytics-privatedata-users group|
Mentioned In
- rOPUPfa13bc765e57: Add amire80 to analytics-privatedata-users group
- T135704: Add kartik to analytics-privatedata-users
- T127808: access for nikerabbit to researchers
- T122479: schedule a daily run of ContentTranslation analytics scripts
Mentioned Here
- T122479: schedule a daily run of ContentTranslation analytics scripts
Most of these queries will run against the wikishared database. When I run them on terbium, they take a few minutes.
One of them runs against the database of each Wikipedia and checks which articles with the contenttranslation revision tag were deleted the day before. On terbium this takes about five minutes for all the languages.
You can see the shell scripts here: https://phabricator.wikimedia.org/diffusion/ECTX/browse/master/scripts/daily-stats/
You need three things to get access:
- a signed NDA, which you should already have, otherwise you wouldn't have terbium access,
- an OK from your direct supervisor, given as a comment here,
- an OK from the service owner, which would probably be @Nuria (for server access) and me (for database access).
After that, the access request will be reviewed and, usually, accepted.
The access request is now in review; a minimum of 3 days is required for security review. That would usually mean getting a decision by the 5th of January, but I apologize on behalf of the team if there is any delay, as many ops will be traveling or on vacation these days.
User @Amire80 has now been added to the group "statistics-users", since I merged the pending patch to do that.
But the server stat1002 does not have that user:
id: amire80: no such user
It seems that other groups are required to get access to stat1002, then. Do you know which?
P.S. Trying to figure out which ones make sense. Looking at https://wikitech.wikimedia.org/wiki/Analytics/Data_access#Access_Groups now (thanks Robh for the link)
amire80 has been added to the group statistics-users on stat1003
[stat1003:~] $ id amire80
uid=2076(amire80) gid=500(wikidev) groups=500(wikidev),726(statistics-users)
and the docs say:
Access to stat1003 for number crunching and connecting to the SQL research slaves.
Is "connecting to SQL research slaves" what you want? Then this ticket is resolved, the title should just say stat1003 instead of stat1002, and you'd use that.
Or is it really stat1002 and one of the other groups described on https://wikitech.wikimedia.org/wiki/Analytics/Data_access#Access_Groups?
@jcrespo I guess it depends on the MySQL grants you said you have to add anyway?
@Ottomata Was statistics-users and stat1003 correct for this use case?
Please clarify how you will run these queries. If MySQL, then you only need access to the 'researchers' group, which will get you access to stat1003 and the research user password.
If you need access to Hive/Hadoop and private webrequest data, then you'll need to be in the analytics-privatedata-users group. This will get you access to stat1002 and stat1004.
@Amire80 needs to run Hive queries to count the number of times users navigate across Wikipedia language projects.
Something like this:
SELECT
  hour,
  day,
  month,
  year,
  REGEXP_EXTRACT(parse_url(referer,'HOST'), '([a-z]*)(.m)?.wikipedia.org', 1) AS prev,
  normalized_host.project AS curr
FROM wmf.webrequest
WHERE
  -- select a relevant timespan to query over
  year = 2016
  AND month IN (5)
  AND day IN (1)
  AND hour IN (1,2)
  -- only consider wikipedia article requests from users
  AND webrequest_source = 'text'
  AND is_pageview
  AND agent_type = 'user'
  AND normalized_host.project_class = 'wikipedia'
  -- only consider wikipedia article referers (this is an approximation)
  AND parse_url(referer,'HOST') RLIKE 'wikipedia.org'
  AND parse_url(referer,'PATH') RLIKE '^/wiki/'
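To make it easier to see what the prev column in the Hive query above would contain, here is a rough local sketch in Python. It is not part of the analytics scripts. The function names referer_language and count_transitions are made up for illustration, urllib.parse only approximates Hive's parse_url, and skipping rows where the referer language equals the current project is an extra assumption that the query itself does not make.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Same pattern as in the Hive query; the dots are left unescaped there too,
# so this is an approximation, exactly as the query comment says.
REFERER_HOST_RE = re.compile(r'([a-z]*)(.m)?.wikipedia.org')


def referer_language(referer):
    """Return the language subdomain of a Wikipedia article referer, or None.

    Hypothetical helper mirroring the two RLIKE filters and the
    REGEXP_EXTRACT call of the query.
    """
    parts = urlparse(referer)
    host = parts.hostname or ''
    # Mirrors: parse_url(referer,'HOST') RLIKE 'wikipedia.org'
    if not re.search(r'wikipedia.org', host):
        return None
    # Mirrors: parse_url(referer,'PATH') RLIKE '^/wiki/'
    if not re.search(r'^/wiki/', parts.path):
        return None
    match = REFERER_HOST_RE.search(host)
    return match.group(1) if match else None


def count_transitions(rows):
    """Count (prev, curr) pairs over an iterable of (referer, curr) rows.

    Assumption: only cross-language navigation is of interest, so rows
    where prev == curr are skipped.
    """
    counts = Counter()
    for referer, curr in rows:
        prev = referer_language(referer)
        if prev is not None and prev != curr:
            counts[(prev, curr)] += 1
    return counts
```

For example, count_transitions([('https://he.m.wikipedia.org/wiki/Foo', 'en')]) counts one navigation from he to en, because the optional (.m) group absorbs the mobile subdomain.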