Investigation: Pageview Stats tool
Closed, Resolved | Public | 3 Estimated Story Points

Description

Investigation card to learn what's involved in a top 10 wish. More info coming soon.

https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey/Miscellaneous#Pageview_Stats_tool

Let's talk to community members and Analytics and come up with specific requirements for a pageview stats tool.

Question: Would creating an "official" implementation of http://stats.grok.se/ be adequate for most use cases or are other features needed? For example, mobile vs desktop views, comparing articles, aggregating by week/month, etc.


The original survey proposal states:

Wikipedia uses the old stats.grok.se that should be patched to be used correctly from the other wikis. Several bugs have been highlighted long time ago, but no one took them in charge.
On the other hand recently has been developed wikiviewstats that is a more complete and flexible, graphic tool. Unfortunately, it has been stopped, and no one was able to take it back on track.
I suppose that should be quicker to fix the above issues instead of writing from scratch a brand new stats tool able to monitor the accesses of any articles (fundamental to understand the visitor's interests), however any of the two choices would be a good improvement.

Several of the proposal comments ask for an "official, WMF-maintained tool".

A look at stats.grok.se
Possible solutions to this problem
  1. Patching stats.grok.se and/or wikiviewstats:
    • Pros: Already established and widely liked tools.
    • Cons: Overhead of working with legacy code; almost everything would need to be changed (down to the API being called for data). The new preferred way of doing these stats is dashiki, as it makes it easy to embed the graphs in other tools. Analytics is planning to rebuild stats.wikimedia.org and would likely want to embed these pageview stats there.
  2. Having an extension with a Special page:
    • Pros: On-wiki data, as preferred by a lot of the community.
    • Cons: It would limit us to displaying stats on a per-wiki basis, and there is extra overhead in having the extension deployed on every wiki.
  3. Creating a new tool on Labs:
    • Pros: Ability to experiment with UI and features; ability to use dashiki.
    • Cons: Will need to work from scratch.

After weighing all pros and cons and discussions with Analytics folks, creating a new tool feels like the best decision.

Basic features for this tool
  1. Uses dashiki to display graphs.
  2. Custom date-range filter.
  3. Ability to view stats by day, week, month, year
Nice to have features
  1. Internationalization for the tool (preferably using TranslateWiki)
  2. Ability to switch between different kinds of graphs
  3. Ability to get Top X viewed pages by namespace on a wiki
  4. Ability to compare different pages (over different wikis or same)
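
To make the date-range and day/week/month/year items a bit more concrete, here is a minimal Python sketch of how daily counts could be filtered and rolled up. The function names (`filter_range`, `aggregate`), the ISO-week bucketing, and the sample numbers are all assumptions for illustration; nothing here is decided for the actual tool.

```python
from collections import defaultdict
from datetime import date


def filter_range(daily, start, end):
    """Keep only the days in the inclusive range [start, end]."""
    return {day: views for day, views in daily.items() if start <= day <= end}


def aggregate(daily, period):
    """Sum daily view counts into 'day', 'week', 'month' or 'year' buckets."""
    totals = defaultdict(int)
    for day, views in daily.items():
        if period == "day":
            key = day.isoformat()
        elif period == "week":
            # ISO weeks start on Monday and may belong to the previous ISO year.
            iso_year, iso_week, _ = day.isocalendar()
            key = f"{iso_year}-W{iso_week:02d}"
        elif period == "month":
            key = f"{day.year}-{day.month:02d}"
        elif period == "year":
            key = str(day.year)
        else:
            raise ValueError(f"unknown period: {period}")
        totals[key] += views
    return dict(totals)


# Made-up counts, for illustration only.
daily = {
    date(2015, 12, 30): 100,
    date(2015, 12, 31): 120,
    date(2016, 1, 1): 90,
    date(2016, 1, 4): 110,
}
print(aggregate(daily, "month"))
january = filter_range(daily, date(2016, 1, 1), date(2016, 1, 31))
print(aggregate(january, "week"))
```

One thing this surfaces: with ISO weeks, 1 January 2016 falls into the last week of 2015, so the week and year views would not always add up the same way. Whether weeks should be ISO weeks at all is an open question.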

From discussion with CE folks:

  1. Ability to compare view vs edit stats
  2. Ability to see cumulative stats for a page for all the languages it exists in
  3. Ability to see cumulative stats for a page and its subpages
  4. Ability to see page views by category (http://tools.wmflabs.org/glamtools/treeviews/)
  5. Top 10/100 most-viewed/most-edited articles and similar fun stats (like this and this)
  6. Compatibility with PagePile (http://tools.wmflabs.org/pagepile/)
  7. Ability to differentiate stats between WMF staffers and other users.
  8. Ability to view redirect traffic stats separate from the article traffic stats.
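
As a rough illustration of two of the requests above (cumulative stats for a page and its subpages, and a Top X list), here is a small Python sketch. It assumes the view counts are already available as a plain title-to-views mapping; the helper names and the sample numbers are made up, and how the counts would actually be fetched is left open.

```python
def with_subpages(views_by_title, title):
    """Total views for a page plus every page titled '<title>/...'."""
    prefix = title + "/"
    return sum(
        views
        for page, views in views_by_title.items()
        if page == title or page.startswith(prefix)
    )


def top_n(views_by_title, n):
    """Return the n most-viewed (title, views) pairs, ties broken by title."""
    ranked = sorted(views_by_title.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Made-up counts, for illustration only.
views = {
    "Main_Page": 9000,
    "Help:Contents": 500,
    "Help:Contents/Editing": 120,
    "Help:Contents/Links": 80,
    "Help:Contents_old": 40,
}
print(with_subpages(views, "Help:Contents"))
print(top_n(views, 2))
```

Matching on the "title/" prefix rather than on the bare title keeps unrelated pages such as "Help:Contents_old" out of the total.
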
For inspiration
  1. https://analytics.wmflabs.org/demo/pageview-api/
  2. Wikiviewstats (now defunct)
  3. http://stats.grok.se

I will keep updating this as I have follow-up conversations with CE folks.

Notes from meeting with Jan: https://etherpad.wikimedia.org/p/pageview-stats-tool

Event Timeline

DannyH raised the priority of this task to Medium.
DannyH updated the task description.
DannyH moved this task to Older: Team Work on the Community-Tech board.
DannyH added a subscriber: DannyH.

Please take a look at the newly launched pageview API:
https://wikitech.wikimedia.org/wiki/Analytics/AQS/Pageview_API

There are clients for it in JavaScript, Python, and R.

It should provide all info you need to resolve this ticket.
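
For anyone who wants to try it without one of the clients, here is a minimal Python sketch that only builds a per-article request URL (it does not send the request). The base URL and the path order follow the public REST API documentation as I understand it and are not stated in this task, so treat them, and the helper name `per_article_url`, as assumptions to verify against the wikitech page above.

```python
from urllib.parse import quote

# Assumed base URL of the public Wikimedia REST API; not stated in this task.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(project, article, start, end,
                    access="all-access", agent="all-agents",
                    granularity="daily"):
    """Build a per-article pageviews URL; start and end are YYYYMMDD strings."""
    # Titles use underscores instead of spaces and must be percent-encoded,
    # including any slash in the title itself.
    title = quote(article.replace(" ", "_"), safe="")
    return "/".join([BASE, project, access, agent, title, granularity, start, end])


print(per_article_url("en.wikipedia", "Albert Einstein", "20151101", "20151130"))
```

The `access` segment is where a mobile vs desktop split, as asked about in the description, would presumably be selected.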

kaldari updated the task description.

Basic features for this tool
  1. Uses dashiki to display graphs.
  2. Custom date-range filter.
  3. Ability to compare different pages (over different wikis or same)
  4. Ability to view stats by day, week, month, year
Nice to have features
  1. i18n for the tool
  2. Ability to switch between different kinds of graphs
  3. Ability to filter by namespace
UI inspirations
  1. https://analytics.wmflabs.org/demo/pageview-api/
  2. Wikiviewstats (now defunct)
  3. http://stats.grok.se

I will keep updating this as I have follow-up conversations with CE folks.

UI inspirations

Please make sure to involve design when doing UI work. Ideally, for a solid tool, we would work with a designer to 1) provide mocks for use cases, 2) iterate on mocks (cheaper than iterating on code), and 3) proceed to implementation.

Definitely. I'm trying to narrow down the use-cases first and then get in touch with Design.

Ability to filter by namespace

Not sure what this means exactly. Shouldn't it just display the stats for any page, regardless of namespace?

Ability to compare different pages (over different wikis or same)

This seems like more of a "nice to have" feature.

Ability to filter by namespace

Not sure what this means exactly. Shouldn't it just display the stats for any page, regardless of namespace?

This basically means to view top x by namespace. I should have clarified that.

Ability to compare different pages (over different wikis or same)

This seems like more of a "nice to have" feature.

Okay. I'll move it. I have something like https://analytics.wmflabs.org/demo/pageview-api/ in mind when I say compare different pages.

@NiharikaKohli: When you update the investigation notes above, please also change i18n to "multi-language support" or "internationalization" so that the students working with WMSE will know what it means.

I think this investigation is pretty much done. Any more comments, @kaldari, @DannyH?

@NiharikaKohli Closing the investigation ticket, well done. :)

I put the findings on the project page:

https://meta.wikimedia.org/wiki/Community_Tech/Pageview_stats_tool