
Create download statistics dashboard for tarballs
Open, LowPublic

Description

We should have a dashboard that tracks tarball downloads for MediaWiki by version over time. This would be a small step towards identifying the size of our 3rd party user community. It would be nice if the same dashboard tracked extension and skin tarball downloads as well.

Event Timeline

bd808 raised the priority of this task from to Needs Triage.
bd808 updated the task description. (Show Details)
bd808 added a project: User-bd808.
bd808 subscribed.
bd808 triaged this task as Low priority. Feb 26 2016, 4:07 AM
bd808 assigned this task to Legoktm.

This is exactly what I was hoping for oh so many years ago. :)

[Screenshot attachment: Screen Shot 2017-11-17 at 14.07.00.png]

bd808 removed Legoktm as the assignee of this task.

I incorrectly assumed that the Extension Distributor dashboard also covered the mediawiki-core tarballs. We still need to track those somewhere. We do, however, have a nice example to follow for the kind of data we would like to capture and report.

I guess we need to get the data out of the Varnish web request logs: something that parses the URLs, aggregates the stats, and maybe pushes them to statsd?
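
As a rough illustration of that idea (not anything deployed), the sketch below pulls a MediaWiki version out of a tarball request path and bumps a statsd counter over UDP. The URL layout, metric names, and statsd endpoint are all assumptions here.

```python
#!/usr/bin/env python3
# Sketch only: map a core tarball request path to a statsd counter.
# The path layout, metric prefix, and statsd host are illustrative assumptions.
import re
import socket

STATSD_HOST = ("statsd.example.org", 8125)  # hypothetical endpoint
TARBALL_RE = re.compile(r"^/mediawiki/(\d+\.\d+)/mediawiki-(\d+\.\d+\.\d+)\.tar\.gz$")


def metric_for(path):
    """Return a statsd counter name for a core tarball path, or None."""
    m = TARBALL_RE.match(path)
    if not m:
        return None
    branch, version = m.groups()
    # e.g. mediawiki.tarball.downloads.1_27.1_27_0
    return "mediawiki.tarball.downloads.%s.%s" % (
        branch.replace(".", "_"), version.replace(".", "_"))


def send_counter(name, count=1):
    """Fire-and-forget statsd counter over UDP ("name:count|c")."""
    payload = ("%s:%d|c" % (name, count)).encode("utf-8")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(payload, STATSD_HOST)


if __name__ == "__main__":
    name = metric_for("/mediawiki/1.27/mediawiki-1.27.0.tar.gz")
    if name:
        send_counter(name)
```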

I just did a quick check, and it doesn't look (to me) like requests to releases.wikimedia.org land in Hadoop.

The two options are:

  1. We do something evil where we parse the raw text request logs and send data to Graphite from that (see the sketch after this list). I do something similar for the Wikidata dump download numbers, which get rsynced to an analytics server: https://github.com/wikimedia/analytics-wmde-scripts/blob/master/src/wikidata/dumpDownloads.php
  2. We get the request logs sent into Hadoop and query them using the analytics infrastructure and refinery, e.g. https://github.com/wikimedia/analytics-refinery/commit/b1086e1134cd24c900e968fb7ce4c931563261ec and https://github.com/wikimedia/analytics-refinery-source/commit/2bba1b4b4a7ce6116c62d5ac244f5357aa92da33
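
For option 1, a minimal sketch of the "parse raw logs, push to Graphite" approach might look like the following. It is analogous in spirit to the dumpDownloads.php script linked above, but it is not a drop-in replacement: the log path, log line format, metric names, and Graphite host are assumptions for illustration.

```python
#!/usr/bin/env python3
# Sketch only: count successful core tarball downloads per version in a raw
# access log and push the totals to Graphite via the plaintext protocol
# ("metric value timestamp\n" over TCP, conventionally port 2003).
import re
import socket
import time
from collections import Counter

LOG_FILE = "/var/log/releases/access.log"   # hypothetical path
GRAPHITE = ("graphite.example.org", 2003)   # hypothetical endpoint
LINE_RE = re.compile(
    r'"GET /mediawiki/\d+\.\d+/mediawiki-(?P<version>\d+\.\d+\.\d+)\.tar\.gz'
    r' HTTP/[\d.]+" (?P<status>\d{3}) ')


def count_downloads(path):
    """Tally 200 responses for core tarballs, keyed by version string."""
    counts = Counter()
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            m = LINE_RE.search(line)
            if m and m.group("status") == "200":
                counts[m.group("version")] += 1
    return counts


def send_to_graphite(counts):
    """Send one datapoint per version using the Graphite plaintext protocol."""
    now = int(time.time())
    lines = [
        "mediawiki.tarball.downloads.%s %d %d\n"
        % (version.replace(".", "_"), n, now)
        for version, n in counts.items()
    ]
    with socket.create_connection(GRAPHITE) as sock:
        sock.sendall("".join(lines).encode("utf-8"))


if __name__ == "__main__":
    send_to_graphite(count_downloads(LOG_FILE))
```

Run periodically (e.g. from cron) against rotated logs, something like this would give Graphite a per-version download time series that a dashboard could chart; option 2 would instead leave the aggregation to Hive/refinery jobs on the webrequest data.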