
Adding top counts for wiki projects (ex: WikiProject:Medicine) to pageview API
Open, Normal · Public · 21 Story Points

Description

This task covers the permanent addition of WikiProject data to the pageview API.

Event Timeline

Nuria created this task. Jul 21 2016, 5:14 PM
Restricted Application added a subscriber: Aklapper. Jul 21 2016, 5:14 PM
Nuria added a comment. Edited · Jul 21 2016, 5:15 PM

Example:

User wants top pages for WikiProject:Medicine.

This describes the non-ad-hoc way to get these counts:

Add WikiProject:Medicine as a project modifier to en.wikipedia.org queries on the pageview API. We still need to think through how this would work on projects like Commons and Wikidata, but basically a query could ask for /en.wikipedia.org|WikiProject:Medicine/ and get the top 1000 pages and per-project totals. We can get data for other languages by following Wikidata inter-language links. We can treat enwiki as authoritative for this purpose, even though that is not strictly true and we will miss some articles as a result. It's better than nothing, and we can relax the assumption when we have better data.
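
As a rough illustration of the request shape proposed above, the sketch below builds a top-pages URL with the WikiProject modifier attached to the project segment. The `top/{project}/{access}/{year}/{month}/{day}` layout follows the existing pageview API top endpoint. The `|WikiProject:Medicine` modifier is only the proposal from this task and is not something the API accepts today; the function name `top_url` and the choice to percent-encode the modifier are assumptions.

```python
from urllib.parse import quote

BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/top"


def top_url(project, year, month, day, wikiproject=None, access="all-access"):
    """Build a top-pages URL; `wikiproject` adds the HYPOTHETICAL modifier."""
    segment = project
    if wikiproject is not None:
        # Proposed (not implemented) syntax: en.wikipedia.org|WikiProject:Medicine
        segment = f"{project}|WikiProject:{wikiproject}"
    # Percent-encode '|' and ':' so the modifier stays inside one path segment.
    encoded = quote(segment, safe="")
    return f"{BASE}/{encoded}/{access}/{year:04d}/{month:02d}/{day:02d}"


if __name__ == "__main__":
    print(top_url("en.wikipedia.org", 2016, 7, 21))
    print(top_url("en.wikipedia.org", 2016, 7, 21, wikiproject="Medicine"))
```

Keeping the modifier inside the project segment would leave the rest of the path unchanged, which is presumably why the task frames it as a "project modifier" rather than a new endpoint.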

Nuria renamed this task from "Adding top counts for wiki projects to pageview API" to "Adding top counts for wiki projects (ex: WikiProject:Medicine) to pageview API". Jul 21 2016, 5:15 PM
Nuria set the point value for this task to 21.
Milimetric triaged this task as Normal priority. Aug 1 2016, 4:43 PM
Milimetric updated the task description.
Milimetric moved this task from Incoming to Backlog (Later) on the Analytics board.
Nuria added a comment. Aug 1 2016, 4:44 PM

We need to tag pages with their WikiProject in pageview hourly so that the information is available when we load data into the pageview API.
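
A minimal sketch of what that tagging step could produce, assuming hourly rows of `(page_title, view_count)` and a lookup from page title to the WikiProjects it belongs to. Both input shapes and the function name are assumptions made for illustration; the real pageview hourly data has more dimensions and would be processed in the cluster, not in plain Python.

```python
from collections import Counter, defaultdict


def top_pages_by_wikiproject(hourly_rows, page_projects, n=1000):
    """Aggregate views per WikiProject and return totals plus the top-n pages.

    hourly_rows:   iterable of (page_title, view_count) pairs (assumed shape)
    page_projects: dict mapping page_title -> list of WikiProject names
    """
    views = defaultdict(Counter)
    for title, count in hourly_rows:
        # A page can belong to several WikiProjects; count it in each one.
        for project in page_projects.get(title, ()):
            views[project][title] += count
    return {
        project: {"total": sum(counter.values()), "top": counter.most_common(n)}
        for project, counter in views.items()
    }


if __name__ == "__main__":
    rows = [("Aspirin", 10), ("Asthma", 5), ("Aspirin", 3), ("Paris", 7)]
    membership = {"Aspirin": ["Medicine", "Chemistry"], "Asthma": ["Medicine"]}
    print(top_pages_by_wikiproject(rows, membership, n=2))
```

Pages with no WikiProject are simply skipped, so per-project totals do not add up to the wiki-wide total, and a page in several projects is counted once in each.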

FYI, on English Wikipedia at least, it is possible to query what pages belong to a WikiProject via the PageAssessments API (or directly from the page_assessments table in the enwiki database). For example, to get all the pages that belong to WikiProject Medicine:
https://en.wikipedia.org/w/api.php?action=query&list=projectpages&wppprojects=Medicine

This should be faster and more accurate than trying to use category membership or template links. Eventually this will be deployed on other wikis as well.

More info at https://www.mediawiki.org/wiki/Extension:PageAssessments.
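
The query above can be scripted roughly as follows. This is a sketch, not a tested client: the URL parameters are the ones shown in the example, continuation follows the standard MediaWiki `continue` convention, and the response shape (`query.projects.<name>` as a list of objects with a `title`) is an assumption that should be checked against the extension documentation. The `fetch` callable is a hypothetical helper injected by the caller so the code can run without network access.

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def projectpages_url(project, extra=None):
    """Build the PageAssessments list=projectpages query URL for one project."""
    params = {
        "action": "query",
        "list": "projectpages",
        "wppprojects": project,
        "format": "json",
    }
    if extra:
        params.update(extra)  # continuation parameters from a previous response
    return API + "?" + urlencode(params)


def iter_project_titles(project, fetch):
    """Yield page titles for a project, following API continuation.

    fetch: callable taking a URL and returning the decoded JSON as a dict.
    """
    cont = {}
    while True:
        data = fetch(projectpages_url(project, cont))
        # ASSUMED response shape: {"query": {"projects": {project: [{"title": ...}]}}}
        pages = data.get("query", {}).get("projects", {}).get(project, [])
        for page in pages:
            yield page["title"]
        cont = data.get("continue")
        if not cont:
            break
```

In real use, `fetch` could wrap `urllib.request.urlopen` plus `json.load`, with a descriptive User-Agent header.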

kaldari reopened this task as Open. Apr 8 2017, 1:27 AM

This isn't quite resolved, although it may be a moot point now. The new popular page reports use the pageview API for the page view numbers, but still rely on the PageAssessments API for finding out which pages belong to which projects. This bug suggests adding a WikiProject parameter directly into the pageviews API, so that using the PageAssessments API isn't necessary.
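
For context, the two-step approach described here (membership from PageAssessments, view counts from the pageview API) might look like the sketch below. It is not the actual code behind the popular page reports. The per-article URL layout and the `items[].views` response field follow the existing pageview API; the function names, the default date range, and the injected `fetch` callable are assumptions for illustration.

```python
from urllib.parse import quote

PER_ARTICLE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(project, title, start, end, access="all-access", agent="user"):
    """Build a daily per-article pageviews URL (dates as YYYYMMDD strings)."""
    # Titles use underscores instead of spaces and must be percent-encoded.
    article = quote(title.replace(" ", "_"), safe="")
    return f"{PER_ARTICLE}/{project}/{access}/{agent}/{article}/daily/{start}/{end}"


def total_views(response):
    """Sum the 'views' field over all items in a per-article response."""
    return sum(item["views"] for item in response.get("items", []))


def rank_project_pages(titles, fetch, project="en.wikipedia.org",
                       start="20170301", end="20170331"):
    """Return (title, views) pairs sorted by views, highest first.

    titles: page titles belonging to one WikiProject (e.g. from PageAssessments)
    fetch:  callable taking a URL and returning the decoded JSON as a dict
    """
    totals = {
        title: total_views(fetch(per_article_url(project, title, start, end)))
        for title in titles
    }
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)
```

This needs one request per member page, which is the overhead a WikiProject parameter on the pageview API would remove.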

fdans moved this task from Backlog (Later) to Dashiki on the Analytics board. Nov 2 2017, 3:50 PM
Milimetric moved this task from Dashiki to Incoming on the Analytics board. Apr 2 2018, 3:33 PM
Nuria lowered the priority of this task from Normal to Low. Apr 5 2018, 4:53 PM
Nuria added a project: Pageviews-API.
Nuria moved this task from Incoming to Backlog (Later) on the Analytics board.
Milimetric raised the priority of this task from Low to Normal. Oct 11 2018, 4:58 PM
Milimetric moved this task from Backlog (Later) to Analytics Query Service on the Analytics board.

@Shizhao Do you have any ideas on how to handle the problems that kaldari raised? If not, shouldn't you unassign yourself now?

Nuria removed Shizhao as the assignee of this task. Oct 15 2018, 6:10 PM