Add per-country maps usage graphs to the maps dashboard
Closed, Resolved · Public · 6 Story Points

Description

To better understand the maps userbase, we should add another graph that shows per-country maps usage. We could analyse both unique users (unique IP + user agent combinations) and total tiles served, but unique users should be enough for now.

Goal 1: show top 9 countries plus 1 for "other" as lines.
Goal 2: allow drilling down into any country and selecting which countries to show.

We should probably collect data for both, but only implement goal 1 for now.
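A minimal sketch of how Goal 1 might be computed, assuming the data arrives as (date, country, users) rows. The function name `top_n_plus_other`, the row layout and the sample numbers are hypothetical and not taken from the dashboard code. Countries are ranked by their summed daily counts, which is used only for ordering and is not a count of unique users over the whole period.

```python
from collections import defaultdict


def top_n_plus_other(rows, n=9, other_label="Other"):
    """Build per-day series for the n largest countries plus one "Other" line.

    rows: iterable of (date, country, users) tuples (hypothetical layout).
    Returns {date: {country_or_other: users}}.
    """
    rows = list(rows)

    # Rank countries by their total across all days (ties broken by name).
    totals = defaultdict(int)
    for _, country, users in rows:
        totals[country] += users
    top = set(sorted(totals, key=lambda c: (-totals[c], c))[:n])

    # Fold every country outside the top n into a single bucket per day.
    series = defaultdict(lambda: defaultdict(int))
    for day, country, users in rows:
        label = country if country in top else other_label
        series[day][label] += users
    return {day: dict(values) for day, values in series.items()}


# Illustrative, made-up numbers.
sample = [
    ("2016-02-01", "US", 50),
    ("2016-02-01", "DE", 30),
    ("2016-02-01", "FR", 5),
    ("2016-02-01", "IT", 3),
]
result = top_n_plus_other(sample, n=2)
```

With `n=2`, the sample keeps the two largest countries as separate lines and folds the remaining two into "Other".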

Event Timeline

Yurik created this task. Nov 23 2015, 9:52 PM
Yurik raised the priority of this task from to Needs Triage.
Yurik updated the task description.
Yurik added projects: Maps, Discovery.
Yurik added subscribers: Yurik, Ironholds, Tfinc, MaxSem.
Restricted Application added subscribers: StudiesWorld, Aklapper. Nov 23 2015, 9:52 PM
Ironholds set Security to None. Nov 23 2015, 9:53 PM
Ironholds added a subscriber: Deskana.

IP and user agent are nothing like unique users, though; they're massively inaccurate.

You should collect data for neither and implement neither until it's been prioritised by @Deskana, who I have added to this ticket since he's the product manager for the feature.

Yurik added a comment. Nov 23 2015, 9:59 PM

@Ironholds, yes, well aware of that - our "users" graph on the dashboard is not as exact as we would like it to be, but it is a good enough approximation until we have something better. As for data - I do not "collect" anything. You do that as part of your dashboard scripts.

Deskana renamed this task from Add per-country maps usage graphs to Add per-country maps usage graphs to the maps dashboard. Dec 29 2015, 10:36 PM
Deskana triaged this task as Low priority.
Deskana moved this task from Needs triage to Analysis on the Discovery board.

@Deskana do we want to do this? Should Deborah make the call?

Now that we have figured out how to do this in a straightforward manner (see T123347), it should be fairly easy to knock this one out. Pulling into the sprint. It's fairly low priority at the minute, though.

Deskana moved this task from Analysis to On Sprint Board on the Discovery board. Feb 2 2016, 9:20 PM
Yurik moved this task from All map-related tasks to Analytics on the Maps board. Feb 7 2016, 10:10 PM
Ironholds set the point value for this task to 3.
Ironholds moved this task from Backlog to In progress on the Discovery-Analysis (Current work) board.
Yurik added a comment. Feb 19 2016, 2:03 PM

Would it be possible to expose this data to a Grafana dashboard?

I think we covered the Grafana question.

Looking at this, it's going to be slightly more complicated than initially expected; the EL schema doesn't expose country data, so we're gonna have to rely on Hive. Not a problem, but it will require some more work.

Change 272797 had a related patch set uploaded (by OliverKeyes):
Add data collection scripts for maps users-per-country

https://gerrit.wikimedia.org/r/272797

Change 272797 merged by Bearloga:
Add data collection scripts for maps users-per-country

https://gerrit.wikimedia.org/r/272797

We're only collecting for Goal 1 at the moment since Goal 2 has privacy implications.

Change 273511 had a related patch set uploaded (by OliverKeyes):
Fix geodata retrieval code for Maps

https://gerrit.wikimedia.org/r/273511

Change 273511 merged by Bearloga:
Fix geodata retrieval code for Maps

https://gerrit.wikimedia.org/r/273511

Ironholds changed the point value for this task from 3 to 5. Feb 26 2016, 9:56 PM

Change 273526 had a related patch set uploaded (by OliverKeyes):
[WIP] Add geographic visualisation code

https://gerrit.wikimedia.org/r/273526

mpopov added a subscriber: mpopov. Feb 29 2016, 5:51 PM

@Deskana @Yurik We've noticed that having daily tile usage data going all the way back to September is impacting the performance of the Maps dashboard, at least for the first user who opens the dashboard on a given day. Do y'all need that kind of granularity?

I'm thinking maybe we could have a 30-day rolling window for daily data and have a separate view that's just monthly median values?

Alternatively, we could get rid of aggregation by tile size and just have daily tiles served, users, and median tiles per user. Do y'all use the tiles by zoom level view (http://discovery.wmflabs.org/maps/#tiles_total_by_zoom)?
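A rough sketch of the 30-day rolling window and monthly median idea from this comment, assuming daily values are held in a dict keyed by `datetime.date`. The function names `rolling_window` and `monthly_medians` and the synthetic data are hypothetical and are not the dashboard's actual code.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import median


def rolling_window(daily, days=30, end=None):
    """Keep only the entries from the last `days` days, ending at `end`.

    daily: {date: value}. If `end` is not given, the latest date present is used.
    """
    if not daily:
        return {}
    end = end or max(daily)
    start = end - timedelta(days=days - 1)
    return {d: v for d, v in daily.items() if start <= d <= end}


def monthly_medians(daily):
    """Collapse daily values into one median per (year, month)."""
    by_month = defaultdict(list)
    for d, v in daily.items():
        by_month[(d.year, d.month)].append(v)
    return {month: median(values) for month, values in sorted(by_month.items())}


# Synthetic data: 60 consecutive days starting 2016-01-01, value = day index.
daily = {date(2016, 1, 1) + timedelta(days=i): i for i in range(60)}
window = rolling_window(daily)      # data for the daily graph
medians = monthly_medians(daily)    # data for the separate monthly view
```

The daily graph would then read only `window`, while the full history stays available for the monthly view or for later analysis.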

Yurik added a comment. Feb 29 2016, 6:14 PM

@mpopov, I would like to keep that data for future use, but it does not need to be shown on the graph. So if you can keep it in the database and tell me how to access it, you can make the graph a sliding window.

mpopov added a comment. Edited Feb 29 2016, 6:38 PM

@Yurik okay, thanks

@Ironholds Okay, do you want to have a non-visualized dataset that has all the daily data and a visualized dataset that just has a rolling 30-day window? (For geo data and tile usage data.) In both cases, we can have two writing calls: conditional_write (for all data) and conditional_rewrite (for windowed data). Thoughts?

Works for me.

Ironholds changed the point value for this task from 5 to 6.

Change 273998 had a related patch set uploaded (by OliverKeyes):
Add code to generate a rolling window of data

https://gerrit.wikimedia.org/r/273998

Change 273998 merged by Bearloga:
Add code to generate a rolling window of data

https://gerrit.wikimedia.org/r/273998

Change 273526 merged by Bearloga:
Add geographic visualisation code

https://gerrit.wikimedia.org/r/273526

Yurik added a comment. Mar 1 2016, 7:25 PM

Where can I see it?

mpopov added a comment. Edited Mar 2 2016, 7:07 PM

@Yurik http://discovery-beta.wmflabs.org/maps/#geo_breakdown

@Ironholds Looks like the geo data needs backfilling for Feb 12-28. Also there needs to be a missing data check for the geo data.
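One way a missing data check could look, as a sketch only: it compares the dates present in a dataset against every day in an expected range. The function name `missing_dates` and the example dates are illustrative assumptions, not the check that was actually implemented.

```python
from datetime import date, timedelta


def missing_dates(observed, start, end):
    """Return the days in [start, end] that have no entry in `observed`."""
    seen = set(observed)
    n_days = (end - start).days + 1
    expected = (start + timedelta(days=i) for i in range(n_days))
    return [d for d in expected if d not in seen]


# Illustrative input: data present for Feb 1-11 and Feb 29 only.
observed = [date(2016, 2, day) for day in range(1, 12)] + [date(2016, 2, 29)]
gaps = missing_dates(observed, date(2016, 2, 1), date(2016, 2, 29))
```

A non-empty result would mark the days that need backfilling.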

mpopov added a comment. Mar 3 2016, 8:47 PM

This update will be deployed to production (discovery.wmflabs.org) in the next deployment "train" (we prefer to bundle up multiple dashboard deployments).

Deskana closed this task as Resolved. Mar 31 2016, 10:27 PM