
Document how to rerun map stats
Closed, Resolved, Public

Description

https://people.wikimedia.org/~bearloga/reports/maps-usage.html is a year old, and we'd like updated stats. I'd be happy to do the work myself, if I have instructions to go on. I'm very familiar with SQL and the MW DB schema, but not at all with Python, R and their universes.

Event Timeline

mpopov moved this task from Backlog to Next Up on the Product-Analytics board.
mpopov moved this task from Next Up to Doing on the Product-Analytics board.

@Catrope: I updated the repository with instructions: https://github.com/wikimedia-research/Discovery-Interactive-Adhoc-Usage#re-run-instructions

Please let me know if anything is unclear or if you run into any issues. Hopefully everything goes well! The usage report might take 1-2 hours to generate (it computes the latest stats directly from the databases), so only generate it if you can maintain an SSH connection to the analytics cluster for that long.

This approach gives you finer control over which wikis' databases are queried, at the expense of a longer report generation time. Feel free to file an issue on that GH repo to switch to the prevalence datasets I mentioned in T193694#4180303.

I successfully regenerated the stats this way, thanks!

Vvjjkkii renamed this task from Document how to rerun map stats to pndaaaaaaa. Jul 1 2018, 1:12 AM
Vvjjkkii reopened this task as Open.
Vvjjkkii removed mpopov as the assignee of this task.
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii added a subscriber: mpopov.
CommunityTechBot renamed this task from pndaaaaaaa to Document how to rerun map stats. Jul 2 2018, 4:30 PM
CommunityTechBot closed this task as Resolved.
CommunityTechBot assigned this task to mpopov.
CommunityTechBot raised the priority of this task from High to Needs Triage.
CommunityTechBot updated the task description.
CommunityTechBot removed a subscriber: mpopov.