
How to get display statistics of the content published on Commons
Closed, Duplicate · Public

Description

Posting on behalf of WMDE's Public Policy and Legal Team.

WMDE's Public Policy and Legal Team is working on a cooperation with publicly funded broadcasters in Germany to get high-quality content (interviews, images, data visualizations) published to Wikimedia Commons as Public Domain.
Broadcasters are interested in data on the usage of the content they're providing.

Hence, we would like to be able (using a tool, query, etc.) to get statistics on the usage of the content on Wikimedia Commons, i.e. how many times a particular video, audio file, or picture has been displayed, including a "preview" embedded in a Wikipedia article (i.e. not only how many times the file page on Commons has been opened).
We are not aware of a way to get such data, so we would appreciate help with this.
Is there a tool or site we could use to get the data we're looking for? Is there an API we could use for this? Any other thoughts?

We are not interested in any tracking data, just when and how often the content is displayed.

We're happy to provide more details on our needs. Please either comment here or send an email to bernd.fiedler [at] wikimedia [dot] de

Event Timeline

Restricted Application added a subscriber: Aklapper. · Aug 3 2018, 2:13 PM

Basic usage stats for media are available via our mediacounts dataset (https://dumps.wikimedia.org/other/mediacounts/). We have a task to add this to the pageview API so it's easier to get the data and visualize it: https://phabricator.wikimedia.org/T88775
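In the meantime, a rough sketch of how request counts for a few files could be pulled out of a mediacounts dump. This assumes the files are tab-separated, with the file's base name (path) in the first column and the total number of requests in the third; please check the README that ships with the dataset before relying on that layout. The function name, the column constants and the sample rows are made up for illustration.

```python
from typing import Dict, Iterable

# Assumed column layout of a mediacounts TSV row (verify against the dataset README):
BASE_NAME_COL = 0        # e.g. "/wikipedia/commons/a/ab/Example.jpg"
TOTAL_REQUESTS_COL = 2   # total number of requests for that file


def count_requests(lines: Iterable[str], wanted: Iterable[str]) -> Dict[str, int]:
    """Sum the total-requests column for each wanted base name."""
    totals = {name: 0 for name in wanted}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= TOTAL_REQUESTS_COL:
            continue  # skip rows that are too short to hold a count
        name = fields[BASE_NAME_COL]
        if name not in totals:
            continue
        try:
            totals[name] += int(fields[TOTAL_REQUESTS_COL])
        except ValueError:
            continue  # skip rows whose count is not an integer
    return totals


# Made-up sample rows standing in for a real dump file.
SAMPLE = [
    "/wikipedia/commons/a/ab/Example.jpg\t2048\t10\n",
    "/wikipedia/commons/c/cd/Other.ogv\t4096\t3\n",
    "/wikipedia/commons/a/ab/Example.jpg\t1024\t5\n",
    "malformed line\n",
]

if __name__ == "__main__":
    print(count_requests(SAMPLE, ["/wikipedia/commons/a/ab/Example.jpg"]))
```

For a real, bz2-compressed dump file, the same function can be fed a handle from `bz2.open(path, "rt", encoding="utf-8")`, where `path` is wherever the file was downloaded to.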

Milimetric moved this task from Incoming to Radar on the Analytics board. · Aug 9 2018, 3:47 PM

Thanks, @Milimetric!

My colleagues are looking for some kind of API that can be queried more "interactively", so the dump solution would sadly not be an option here. In any case, thank you for pointing out where the data is.

Do you maybe know when T88775 might be expected to be tackled? Does not seem to be the highest priority item for Analytics, does it?

No, @WMDE-leszek, we have a lot of other stuff we need to do first. It is a relatively small task, so I'd be happy to mentor someone doing it. I'll also try to grab it when I'm ahead on my other work, but that somehow never seems to happen :/

Tgr added a subscriber: Tgr. · Edited · Jan 20 2019, 10:33 PM

Duplicate of T210313: Statistics for views of individual Wikimedia images? As mentioned there, there's an unofficial API.

Yeah, going to close as a duplicate of that. The main obstacle now is expanding storage on the API cluster to fit the mediacounts dataset.