
Collect analytics data such as pageview
Closed, Resolved, Public



Pageview information is not available in the database, so this issue has been separated from #9. Here, pageviews means the number of times pages that include certain Scribunto modules have been viewed. Pageviews can therefore serve as a metric for how frequently a Scribunto module is used.
Similarly, other analytics information about the modules can be collected from the API and the database.


  • Get pageviews of all pages that transclude a module
  • Save pageviews for previous days (as many as are available)
    • Change code to use the REST API, which has pageview counts for a longer period
    • Add pageviews for each subsequent day
  • Test fetching (both daily and overall)
  • Add cron job (to fetch pageview for the latest day)
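The fetching steps in the checklist can be sketched roughly as follows. This is a minimal illustration, not the script used for this task: it assumes the public Wikimedia per-article pageviews REST endpoint, and the function names, the `all-access`/`user` filters, and the User-Agent string are hypothetical choices made for the example.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the public Wikimedia per-article pageviews REST API.
REST_BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def build_pageviews_url(project, title, start, end):
    """Build a daily per-article pageviews URL (dates as YYYYMMDD strings)."""
    # Titles use underscores instead of spaces and must be fully URL-encoded.
    article = urllib.parse.quote(title.replace(" ", "_"), safe="")
    return f"{REST_BASE}/{project}/all-access/user/{article}/daily/{start}/{end}"


def parse_daily_views(payload):
    """Map each day (YYYYMMDD) to its view count from a decoded response."""
    # Timestamps come back as YYYYMMDDHH; keep only the date part.
    return {item["timestamp"][:8]: item["views"] for item in payload.get("items", [])}


def fetch_daily_views(project, title, start, end):
    """Fetch and parse daily views for one page (performs a network request)."""
    request = urllib.request.Request(
        build_pageviews_url(project, title, start, end),
        # Hypothetical User-Agent; replace with real contact details.
        headers={"User-Agent": "module-pageviews-sketch/0.1 (hypothetical contact)"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_daily_views(json.load(response))


def total_views_per_module(module_pages, views_by_page):
    """Sum views over every page that transcludes each module."""
    return {
        module: sum(sum(views_by_page.get(page, {}).values()) for page in pages)
        for module, pages in module_pages.items()
    }
```

A daily cron job could call `fetch_daily_views` with the same date for `start` and `end` and append the result to the stored counts, while the initial backfill would use a wide date range per page.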

Both the REST and PHP APIs were tested. Although the PHP API provides data for only the last 60 days (REST gives pageviews from ~2015), it seemed to be more reliable and more robust to multiple calls. The REST API, by contrast, started returning 404s very quickly, so getting pageviews for *that many* pages would take forever.
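One common way to cope with an API that starts failing under many calls is to retry with exponential backoff. The sketch below is an assumption about how that could look, not what was done in this task; `fetch_with_retry` and its parameters are hypothetical. The task does not say whether the 404s were caused by throttling or by missing data, so retrying may not help in the second case.

```python
import time
import urllib.error


def fetch_with_retry(fetch, *args, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(*args), retrying on HTTP errors with exponential backoff.

    Returns None if every attempt fails, so a long batch run can continue.
    """
    for attempt in range(retries):
        try:
            return fetch(*args)
        except urllib.error.HTTPError:
            # Wait 1x, 2x, 4x ... base_delay before the next attempt,
            # but do not sleep after the final failure.
            if attempt < retries - 1:
                sleep(base_delay * 2 ** attempt)
    return None
```

The `sleep` argument is injectable only so the helper can be exercised without real waiting; in a batch run the default `time.sleep` would be used.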

Test run

So far, the pageviews script seems to be running without errors, but it is taking over a week to fetch all the data. The timing will be updated when the script finishes running.


Draft notebook for analysis: Aishas Notebook III

Event Timeline

tanny411 triaged this task as High priority. Jan 7 2021, 9:22 AM
tanny411 created this task.
tanny411 moved this task from To triage to Data Science work on the Abstract Wikipedia team board.

@tanny411: Hi! This task was assigned to you a while ago. Could you maybe share an update? Do you still plan to work on this task, or do you need any help?

If this task has been resolved in the meantime: Please update the task status (via Add Action...Change Status in the dropdown menu).
If this task is not resolved and only if you do not plan to work on this task anymore: Please consider removing yourself as assignee (via Add Action...Assign / Claim in the dropdown menu): That would allow others to work on this (in theory), as others won't think that someone is already working on this. Thanks! :)