Set up graphs and dumps for ExtensionDistributor download statistics {frog} [3 pts]
Closed, Resolved · Public

Description

Hello analytics people! With rEXDIa7f5e0df967e (Record downloads with EventLogging), ExtensionDistributor is now sending download statistics to EventLogging. We'd like to set up graphs and dumps of the data. Help/guidance/docs appreciated :)

The new dashboard is here: http://extdist-reportcard.wmflabs.org/

BTW here is the documentation for setting up a dashboard: https://wikitech.wikimedia.org/wiki/Analytics/Dashboards

Event Timeline

Legoktm created this task. Jun 2 2015, 11:51 PM
Legoktm raised the priority of this task from to Low.
Legoktm updated the task description.
Restricted Application added a subscriber: Aklapper. Jun 2 2015, 11:51 PM

Quick query:

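-- Monthly download counts per event type, name, and version.
-- LEFT(timestamp, 6) yields YYYYMM, assuming 14-digit YYYYMMDDHHMMSS timestamps.
SELECT
	LEFT(timestamp, 6) AS month,
	event_type,
	event_name,
	event_version,
	COUNT(uuid) AS count
FROM
	ExtDistDownloads_12369387
GROUP BY
	month,
	event_type,
	event_name,
	event_version
ORDER BY
	month,
	event_type,
	event_name,
	event_version;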

Needs:

  • Putting in a repo
  • Running in cron
  • Publishing
  • Charting.
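The first three needs above (a repeatable query, a periodic run, and a published file) could be sketched roughly as follows. This is only an illustration, not the job that was eventually deployed: it uses an in-memory SQLite table as a stand-in for the real database, so `substr(timestamp, 1, 6)` replaces `LEFT(timestamp, 6)`, and the sample rows and the function names `build_sample_db` and `monthly_counts_csv` are invented for the example.

```python
import csv
import io
import sqlite3

# Same aggregation as the quick query above, adapted for SQLite,
# which has no LEFT(); substr(x, 1, 6) is the equivalent.
QUERY = """
SELECT
    substr(timestamp, 1, 6) AS month,
    event_type,
    event_name,
    event_version,
    COUNT(uuid) AS count
FROM ExtDistDownloads_12369387
GROUP BY month, event_type, event_name, event_version
ORDER BY month, event_type, event_name, event_version
"""


def build_sample_db():
    """Create an in-memory table with a few made-up download events."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ExtDistDownloads_12369387 ("
        "uuid TEXT, timestamp TEXT, event_type TEXT, "
        "event_name TEXT, event_version TEXT)"
    )
    # Hypothetical rows; the real field values may look different.
    rows = [
        ("a1", "20150601120000", "extensions", "Cite", "REL1_25"),
        ("a2", "20150615080000", "extensions", "Cite", "REL1_25"),
        ("a3", "20150702090000", "skins", "Vector", "master"),
    ]
    conn.executemany(
        "INSERT INTO ExtDistDownloads_12369387 VALUES (?, ?, ?, ?, ?)", rows
    )
    return conn


def monthly_counts_csv(conn):
    """Run the aggregation and return the result as CSV text with a header."""
    cursor = conn.execute(QUERY)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
    return out.getvalue()


if __name__ == "__main__":
    print(monthly_counts_csv(build_sample_db()), end="")
```

A cron entry would then only need to run a script like this and write the returned text to a publicly served file.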

Can an existing domain be reused for the charts? (/me wonders why we still don't have a shared limn instance.)

Or perhaps an extension that makes a special page on the MW wiki?

madhuvishy set Security to None.

It seems like the graphs on http://edit-reportcard.wmflabs.org/ are outdated, so we can remove those, but would you want to add this to a new tab on that dashboard? If not, please let us know what you'd like to see, how often you're going to be looking at the data, and anything else that would help us know how to set this up.

@Legoktm - could you answer Milimetric's question? Thanks!

For now I think the most important thing we'd like is to get a dump of the aggregated data (similar to the query James posted) published, and then we can figure out how to graph it.

To help with both the graph creation and generating results for that query on a periodic basis, I need to know whether you will be sharing a repository with the editing team [1] or you'd like your own repository. If you'd like your own repository, let me know what you'd like to name it. It has to be limn-[something]-data, but you can pick the [something].

[1] https://github.com/wikimedia/analytics-limn-edit-data/

Yeah, we should probably keep it separate. How about limn-extdist-data?

Ok, cool, I'll add this to our board and get to it. Probably not today but Monday.

Milimetric renamed this task from Set up graphs and dumps for ExtensionDistributor download statistics to Set up graphs and dumps for ExtensionDistributor download statistics {frog} [3 pts]. Jun 26 2015, 9:29 PM
Milimetric edited projects, added Analytics-Kanban; removed Analytics-Backlog.
Peachey88 renamed this task from Set up graphs and dumps for ExtensionDistributor download statistics {frog} [3 pts] to Set up graphs and dumps for ExtensionDistributor download statistics. Jun 27 2015, 12:37 AM

Please don't rename this card, it's assigned to us and we use some conventions to make sense of our wall.

Milimetric renamed this task from Set up graphs and dumps for ExtensionDistributor download statistics to Set up graphs and dumps for ExtensionDistributor download statistics {frog} [3 pts]. Jun 29 2015, 11:45 PM
Milimetric claimed this task.

Change 221799 had a related patch set uploaded (by Milimetric):
Add downloads query and config

https://gerrit.wikimedia.org/r/221799

Change 221799 merged by Milimetric:
Add downloads query and config

https://gerrit.wikimedia.org/r/221799

Change 221801 had a related patch set uploaded (by Milimetric):
Add a new limn datafile generator: extdist

https://gerrit.wikimedia.org/r/221801

I've done everything on my end. There's a problem with wikitech, so I can't set up the proxy to get you http://extdist-reportcard.wmflabs.org yet, but once that's fixed I'll do that. Ops also needs to merge the puppet change I made above. After that everything should work. I set the start date as 7/1 just to be nice and clean. Let me know if you have any questions.

Change 221801 merged by Ottomata:
Add a new limn datafile generator: extdist

https://gerrit.wikimedia.org/r/221801

kevinator closed this task as Resolved. Jul 1 2015, 3:45 PM
kevinator updated the task description.
kevinator added a subscriber: kevinator.

I double-checked and your dashboard is updating as of last night: http://extdist-reportcard.wmflabs.org/

It will have a single data point until next month, since it looked like you wanted monthly numbers. I had to limit the query to the top 10 skins and top 10 extensions for now, since the graphs couldn't handle all the extensions crammed together. Think that over and let me know if you'd like to see something different.
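For anyone wondering what a "top 10 per type" limit amounts to, here is a minimal sketch of the idea in Python. It is not the query from the merged change. The function name `top_n_per_type`, the `(event_type, event_name, count)` row shape, and the sample values in the usage note are assumptions made for illustration.

```python
from collections import defaultdict


def top_n_per_type(rows, n=10):
    """Return the n most-downloaded names for each event type.

    rows: iterable of (event_type, event_name, count) tuples.
    Result: {event_type: [(event_name, total_count), ...]}, highest first.
    Ties are broken alphabetically by name so the output is stable.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for event_type, event_name, count in rows:
        totals[event_type][event_name] += count

    return {
        event_type: sorted(
            names.items(), key=lambda item: (-item[1], item[0])
        )[:n]
        for event_type, names in totals.items()
    }
```

Called with rows such as `("extensions", "Cite", 5)` and `("skins", "Vector", 4)`, it returns one ranked list per type, and anything outside the top `n` is simply dropped from the chart data.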

Also, James's query will continue to run but there's no way to show it in a dashboard. You can find the output here: http://datasets.wikimedia.org/limn-public-data/extdist/datafiles/downloads.csv

If you want to share your dashboard with the world: https://meta.wikimedia.org/wiki/Research:Data/Dashboards

Why not rename the domain to something more generic that can host dashboards on other topics in the future? That way people don't have to worry about their substantial contribution to the thermal death of the universe whenever they try to add a single graph somewhere.

I'm happy to rename it, that's just the name that I was given above. And as much as I love naming debates I don't know enough about this area of the platform to have a solid opinion.

> Please don't rename this card, it's assigned to us and we use some conventions to make sense of our wall.

Where are these conventions documented? I'm curious what {frog} means.

If you look at the first column on our main board (https://phabricator.wikimedia.org/tag/analytics-kanban/), you can see one pseudo-task for each {animal}. The task title includes the title of the project that the {animal} corresponds to. Essentially it's an easy way to visually inspect our board and check that:

  • we're not trying to work on too many projects at the same time
  • we're working on the highest priority projects (the projects are in priority order in the leftmost Projects column)

Ideally, Phab would just have tags but we make do this way.

> Also, James's query will continue to run but there's no way to show it in a dashboard. You can find the output here: http://datasets.wikimedia.org/limn-public-data/extdist/datafiles/downloads.csv

This CSV only has one row in it...is that supposed to happen?

Milimetric reopened this task as Open. Jul 28 2015, 4:18 PM

Something went really wrong with this... I'm seeing Japanese characters in the CSV files now... Re-opening.

Milimetric moved this task from Done to In Progress on the Analytics-Kanban board. Jul 28 2015, 4:18 PM
Milimetric moved this task from In Progress to Done on the Analytics-Kanban board. Jul 28 2015, 7:21 PM
kevinator closed this task as Resolved. Jul 30 2015, 3:23 PM