
Provide weekly top pageviews stats
Open, Normal · Public

Description

Revised request of https://phabricator.wikimedia.org/T133176 based on a discussion on IRC.

The hope is to have the topviews endpoint allow a week parameter for the given year, where you would then leave month and day blank.

This request stems from the fact that The Signpost, a highly visible publication for the English Wikipedia, is still using the less accurate data dumps for their weekly traffic report.

I have created Topviews Analysis, which lets you enter an arbitrary date range (max one month), with the caveat that the numbers you see are sums of the top views for each individual day. This means that if a given page was not in the top 1000 on every day within that date range, its total number of pageviews will be off.
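To illustrate the caveat, here is a minimal sketch with made-up numbers. The function names are hypothetical, and a cutoff of 2 stands in for the real cutoff of 1000. It compares the sum of per-day top lists against the true totals; it is not how Topviews Analysis itself is implemented.

```python
from collections import Counter

def top_n(views, n):
    """Return the n most-viewed pages of a single day as a dict."""
    return dict(Counter(views).most_common(n))

def sum_daily_tops(daily_views, n):
    """Approximate multi-day totals by summing each day's top-n list."""
    total = Counter()
    for day in daily_views:
        total.update(top_n(day, n))
    return total

def true_totals(daily_views):
    """Exact multi-day totals, using every page on every day."""
    total = Counter()
    for day in daily_views:
        total.update(day)
    return total

# Made-up pageview counts for three days (not real data).
days = [
    {"A": 100, "B": 90, "C": 80},
    {"A": 100, "C": 95, "B": 90},
    {"B": 100, "C": 99, "A": 10},
]

approx = sum_daily_tops(days, n=2)
exact = true_totals(days)
print(approx.most_common())
print(exact.most_common())
```

In this toy data each page falls out of the daily top list once, so every summed total is too low, and the page ranked first by the summed lists differs from the page ranked first by the exact totals.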

Also, mind you, we wouldn't object to a limit parameter, should that help with performance. We are generally only interested in the top 100 pages or so. I understand this top views data is precomputed in Cassandra, so perhaps a limit option wouldn't help, but I'm throwing it out there just in case.

Thanks!

Event Timeline

Restricted Application added a subscriber: Aklapper. · Apr 25 2016, 6:14 PM
Nuria added a subscriber: Nuria. · Apr 27 2016, 4:38 PM

This would require adding a new job to the Cassandra loading. More than that, it would be a big change in how we present data via the API. Thus far we only present data daily and monthly.

@Nuria any chance we could triage this or give an estimate as to feasibility/likelihood of actually happening? The Signpost is now relying on Topviews and it hurts to know we're giving them sub-par data :(

I realize modifying the API endpoint is tricky... e.g. /pageviews/top/:wiki/:access/:year/:week/:month/:day or some variation is just confusing. Perhaps a new endpoint, /pageviews/top-weekly/en.wikipedia/all-access/:year/:week?
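As a rough sketch of what the separate-endpoint idea could look like from a client's side: the `top-weekly` route below is the hypothetical one proposed in this task and does not exist, and interpreting `:week` as an ISO 8601 week number is an assumption. The `top` route with `:year/:month/:day` is the existing daily endpoint, shown for the seven days the week would cover.

```python
from datetime import date, timedelta

BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews"

def weekly_url(project, access, year, week):
    """Build a URL for the HYPOTHETICAL top-weekly endpoint (not implemented)."""
    return f"{BASE}/top-weekly/{project}/{access}/{year}/{week}"

def daily_urls(project, access, year, week):
    """Build the seven existing daily top URLs covering one week.

    Assumes ISO 8601 week numbering (weeks start on Monday).
    Requires Python 3.8+ for date.fromisocalendar.
    """
    monday = date.fromisocalendar(year, week, 1)
    days = [monday + timedelta(days=i) for i in range(7)]
    return [f"{BASE}/top/{project}/{access}/{d:%Y/%m/%d}" for d in days]

weekly = weekly_url("en.wikipedia", "all-access", 2016, 17)
dailies = daily_urls("en.wikipedia", "all-access", 2016, 17)
print(weekly)
for url in dailies:
    print(url)
```

Keeping the weekly route separate avoids overloading the `:month/:day` positions of the existing path, which is the confusion described above.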

Nuria added a comment. · Aug 12 2016, 6:08 PM

@MusikAnimal: I think we will not be working on this item this quarter or next, as we are focusing on edit data. Note that while we have several sources of pageview data, the analytics team does not provide any edit data feeds.

Next quarter our efforts around the Pageview API will be centered on capacity planning and on counting pageviews for wikis that we have not counted to date: https://phabricator.wikimedia.org/T130249

Once these two issues are tackled we can expand our feature set for the Pageview API.

@MusikAnimal The Signpost actually switched away from a weekly publication schedule recently. Is the traffic report going to stick to the weekly format for the foreseeable future?

Also, regarding "sub-par" data: How likely is it that a given page is in the top 10 for the week, but was not in the top 1000 of the seven individual days?

MusikAnimal added a comment. · Edited · Aug 14 2016, 4:32 AM

> @MusikAnimal The Signpost actually switched away from a weekly publication schedule recently. Is the traffic report going to stick to the weekly format for the foreseeable future?

You are correct! I believe they are still using a weekly format because that is how the data has historically been made available at User:West.andrew.g/Popular pages.

> Also, regarding "sub-par" data: How likely is it that a given page is in the top 10 for the week, but was not in the top 1000 of the seven individual days?

For the top 10 it is usually a trivial difference, but they are publishing the Top 25. Even then the ranking is usually right, but some pages can be off by tens of thousands of views.

All things considered, I've decided to do away with the hack I made, which was a bad idea to begin with. Monthly and daily data are quite sufficient for most people, and from recent discussions it sounds like West.andrew.g is going to have his weekly report working off the new data source soon. Apologies for suggesting weekly stats are anything urgent. I still think it would be a nice addition, but there is certainly no rush :)

Milimetric triaged this task as Normal priority. · Nov 14 2016, 4:40 PM
Milimetric moved this task from Modern Event Platform to Dashiki on the Analytics board.

This was requested 5 minutes ago on IRC (wikipedia-fr). The goal is to easily update the weekly on-wiki magazine.

Feldo added a subscriber: Feldo. · Nov 13 2017, 5:02 PM
fdans moved this task from Backlog (Later) to Dashiki on the Analytics board. · Jan 8 2018, 5:13 PM
Milimetric moved this task from Dashiki to Incoming on the Analytics board. · Apr 2 2018, 3:32 PM
Nuria moved this task from Incoming to Blocked on the Analytics board. · Apr 5 2018, 5:06 PM
Nuria moved this task from Blocked to Deprioritized on the Analytics board.