Currently the pageview API only tracks what we call "knowledge" pageviews. We have an outstanding request
to track requests to other systems and wikis (like "outreach").
Description
Details
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | Nuria | T130249 Count pageviews for all wikis behind varnish
Duplicate | | None | T132313 Add support for outreachwiki to pageviews API
Declined | | Henrik | T111662 stats.grok.se doesn't offer mediawiki.org page view stats
Resolved | | None | T126367 Provide data on top pageviews across all projects daily
Event Timeline
So you're only tracking details about some wikis, not others? You're not going to be able to get information about views on things like wikitech?
Clarification: the pageview API only serves counts for "knowledge" wikis, as defined in the pageview definition: https://meta.wikimedia.org/wiki/Research:Page_view
This task is about being able to count requests on other systems. All requests that hit varnish are countable, if that makes sense.
Would this then also include all of the chapters wikis, outreach etc?
Yes, it would. We do not have an ETA for this item, though. We were hoping to get to it in the next three months, but we are not sure we can get there.
@Nuria do we have a sense of when this will happen? There are a fair number of dependencies on having this data available.
@Sadads: not for at least 3 months. We are focusing on edit data after having worked on pageview data for a while.
As I said before (and I understand this is less convenient), we have data on the cluster for all wikis for the last 60 days at all times, so our work in this regard should not block you: you can get the data (in a less convenient fashion) right now.
Per our conversation with research (cc @DarTar and @Erik_Zachte), we are going to add "non-knowledge" wikis to our pageview pipeline, for two reasons:
- The magnitude of their pageviews is really small, so they will not affect regular stats. Also, we still have a whitelist mechanism, so wikis that want to be excluded can be.
- Excluding these wikis creates more issues than it solves.
Thanks @Nuria, that's good to know: community programs and events could
really use the data coming off these wikis.
Let's start with outreachwiki, nl.wikimedia, ru.wikimedia, be.wikimedia, strategy.
- Refactor the code on pageview definition that restricts counting to certain urls
- Add wikis to the whitelist.
Monitor propagation to the pageview API. To be clear, we cannot count pageviews retroactively, as we do not have past data, but we can count them going forward.
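As a rough illustration of the two steps above (a URL/host check plus a whitelist), here is a minimal Python sketch. It is not the actual pipeline code: the `HOST_RE` pattern, the `is_countable` helper, and the exact whitelist identifiers (including `strategy.wikimedia`) are assumptions made for the example.

```python
import re

# Hypothetical whitelist. The wikis are the ones named in the comment above,
# but the identifier format used by the real pipeline is an assumption.
WHITELIST = {
    "outreach.wikimedia",
    "nl.wikimedia",
    "ru.wikimedia",
    "be.wikimedia",
    "strategy.wikimedia",
}

# Illustrative host pattern: any single-label project under wikimedia.org.
HOST_RE = re.compile(r"^(?P<project>[a-z\-]+\.wikimedia)\.org$")


def is_countable(host: str) -> bool:
    """Return True if the host matches the pattern and is whitelisted."""
    match = HOST_RE.match(host.lower())
    return bool(match) and match.group("project") in WHITELIST
```

With this sketch, a wiki starts being counted simply by adding its project name to `WHITELIST`; hosts that match the pattern but are not listed (wikitech, for instance) are still skipped.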
Good to know. That delays one of my projects then, which is fine: it looked
like it might have been too early in next quarter anyway. Alex
Change 316838 had a related patch set uploaded (by Nuria):
Adding oureach wikipedia to Pageview whitelist
Change 316845 had a related patch set uploaded (by Nuria):
Enhancing regex to support pageviews to non-knowledge wikis
Change 316845 merged by jenkins-bot:
Enhancing regex to support pageviews to non-knowledge wikis
Change 319084 had a related patch set uploaded (by Joal):
Update jar version for webrequest load job
Change 319105 had a related patch set uploaded (by Milimetric):
Include pageviews for all wikis in whitelist
The changes are live on pageview hourly. Waiting for them to appear on the pageview API before closing the ticket. cc @Sadads
Here is an example of a pageview API query returning data for outreach: https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/outreach.wikimedia/all-access/user/Main_Page/daily/2016103100/2016110200
Also, the pageview tool displays this data too: https://tools.wmflabs.org/pageviews-test/?project=outreach.wikimedia.org&platform=all-access&agent=user&range=latest-20&pages=Main_Page
Please keep in mind that data for the 1st of November will be partial.
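For anyone scripting against the newly added projects, the per-article query URL quoted above can be assembled as in the sketch below. The path segments mirror that example URL; the `per_article_url` helper and its parameter names are hypothetical, introduced only for illustration.

```python
from urllib.parse import quote

# Base path taken from the example query quoted in this thread.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(project, article, start, end,
                    access="all-access", agent="user", granularity="daily"):
    """Build a per-article pageviews URL (hypothetical helper).

    start and end are timestamps in the YYYYMMDDHH form used in the example.
    """
    # Titles use underscores instead of spaces and must be URL-encoded.
    title = quote(article.replace(" ", "_"), safe="")
    return "/".join([BASE, project, access, agent, title, granularity, start, end])


url = per_article_url("outreach.wikimedia", "Main Page", "2016103100", "2016110200")
print(url)
```

The resulting URL can then be fetched with any HTTP client; the sketch stops at building the URL so that it runs without network access.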
cc @MusikAnimal so he is aware new projects have been added to the API