
Investigation: Add mobile pageview stats to Mr.Z-bot's popular pages reports [AOI]
Closed, Resolved, Public

Description

Per T108425, we should investigate how hard it would be to incorporate mobile pageview data into Mr.Z-bot's popular pages reports. Right now these reports apparently only show numbers for desktop pageviews.

Pagecount data that includes mobile pageviews can be found at http://dumps.wikimedia.org/other/pagecounts-all-sites/. Would this data be easily digestible by the bot or would it require significant data parsing? Hopefully this question can be answered by asking people like AlexZ (the maintainer of the bot), rather than digging into all the code.

Some useful links:

Timebox: 4 hours

Event Timeline

kaldari created this task. Sep 14 2015, 6:47 PM
kaldari updated the task description.
kaldari raised the priority of this task from to Low.
kaldari added a subscriber: kaldari.
Restricted Application added a subscriber: Aklapper. Sep 14 2015, 6:47 PM

See T108425 for background discussion, including discussion of the various datasets and how they are being used.

Niharika claimed this task.

I'm not an expert in the area, but there seems to be an ongoing effort to provide a public pageview API to retrieve mobile pageviews and more.

Instead of investigating right now, maybe it would be better to postpone this until that API is available, making the task simpler?

@Qgil, agreed. I just did some preliminary investigation on the tool's code and made an effort to reach out to the maintainer.


Source code: https://github.com/alexz-enwp/popularpages
Data: As mentioned on https://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval/Mr.Z-bot_4, the data comes from domas' pagecount dumps: http://dumps.wikimedia.org/other/pagecounts-raw/. The hourly dump file is downloaded, aggregated, and filtered, and the results are inserted into the tool's own table, which is then published.
Investigation output:
The tool runs off data from http://dumps.wikimedia.org/other/pagecounts-raw/, which currently does not include mobile pageviews.
http://dumps.wikimedia.org/other/pagecounts-all-sites/ is an alternative to pagecounts-raw that also includes mobile views while keeping the same format. Andrew West's tool https://en.wikipedia.org/wiki/User:West.andrew.g/Popular_medical_pages is built on pagecounts-all-sites data and hence includes mobile view stats.
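To give a rough idea of what the dump-based approach involves, here is a minimal Python sketch of splitting hourly dump lines into desktop and mobile totals. It assumes the space-separated `project title count bytes` line layout of the pagecounts files, and it assumes the mobile site is identified by a project code such as `en.m`; that code is an assumption and should be checked against the pagecounts-all-sites documentation. The function names are hypothetical and are not taken from the popularpages code.

```python
from collections import defaultdict


def parse_line(line):
    """Split one dump line into (project, title, views).

    Assumes the layout "project title count bytes"; returns None
    for lines that do not match it.
    """
    parts = line.rstrip("\n").split(" ")
    if len(parts) != 4:
        return None
    project, title, count, _size = parts
    try:
        return project, title, int(count)
    except ValueError:
        return None


def aggregate(lines, desktop_project="en", mobile_project="en.m"):
    """Sum views per title, kept separately for desktop and mobile.

    The default project codes are assumptions for illustration only.
    """
    totals = defaultdict(lambda: {"desktop": 0, "mobile": 0})
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip malformed lines
        project, title, views = parsed
        if project == desktop_project:
            totals[title]["desktop"] += views
        elif project == mobile_project:
            totals[title]["mobile"] += views
    return dict(totals)
```

Feeding every hourly file for a month through `aggregate` and merging the results would yield the extra mobile column for the report.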

There are a couple of ways to add mobile view stats to Mr.Z's bot:

  1. Switch the tool from pagecounts-raw to pagecounts-all-sites. After that, it should only take one additional query for mobile stats against the ingested dataset, plus a new column on the results page. Estimated story points for this task would be 5.

OR

  2. Wait for the Analytics team to launch the API to query improved stats for mobile hits and incorporate them into the tool. As far as I can see, the tool does not use the API anywhere so far. The Analytics team expects to launch the API in about a month's time.

I've tried contacting AlexZ, because the first step would be to get access to the tool's repo to make any changes. He doesn't seem to be around very much on wiki lately.

kaldari closed this task as Resolved. Sep 16 2015, 5:41 PM

@NiharikaKohli: Nice work. I think this gives us a good idea of the options. And thanks for reaching out to AlexZ. Hopefully he'll pop up eventually.

@NiharikaKohli: I just got a lot of really useful info from milimetric. He says that the first version of the pagestats API will not have monthly aggregation, but it will have daily aggregation. Also he estimates that the API will be ready by Oct. 9. Have you heard anything from AlexZ?

I think you already got the answer but for the record, no. I will try emailing him next.

@NiharikaKohli: I just tried emailing him myself, so no need.

@kaldari, okay.

Unrelated comment: I think we should not consider our Investigation tasks Resolved until we have reached a definitive conclusion about the further course of action on the task.

kaldari reopened this task as Open. Oct 15 2015, 9:54 PM

Reopening, since I was able to contact Alex and he is amenable to us helping to improve his tools. I'm going to follow up with some more specific questions. (https://en.wikipedia.org/wiki/User_talk:Mr.Z-man#Future_of_Mr.Z-bot_tasks)

The API is launched, spec is final, we're still working on loading in all the data (had to switch schemas because it used way more space than we calculated). All the details are available here: https://phabricator.wikimedia.org/T44259#1747860
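For anyone picking up the implementation card, here is a minimal Python sketch of how the bot could build a per-article request and roll daily figures up into a monthly total (since, per the earlier comment, the first version has daily but not monthly aggregation). The endpoint layout, the `mobile-web` access value, and the `items`/`views` response fields are assumptions about the public REST API and should be checked against the spec linked in T44259; the function names are hypothetical.

```python
from urllib.parse import quote

# Assumed base path of the per-article pageview endpoint; verify against the spec.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(project, access, article, start, end, agent="user"):
    """Build a daily per-article request URL.

    start and end are YYYYMMDD strings; access is e.g. "desktop"
    or "mobile-web" (assumed values).
    """
    title = quote(article.replace(" ", "_"), safe="")
    return f"{BASE}/{project}/{access}/{agent}/{title}/daily/{start}/{end}"


def monthly_total(response):
    """Sum daily view counts from a decoded JSON response.

    Assumes the shape {"items": [{"views": <int>, ...}, ...]}.
    """
    return sum(item["views"] for item in response.get("items", []))
```

The bot would call `per_article_url` once per access type for each article, fetch and decode the JSON, and pass it to `monthly_total` to fill the desktop and mobile columns.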

DannyH renamed this task from [AOI] Investigation: Add mobile pageview stats to Mr.Z-bot's popular pages reports to Investigation: Add mobile pageview stats to Mr.Z-bot's popular pages reports [AOI]. Oct 28 2015, 7:06 PM

According to https://en.wikipedia.org/wiki/User_talk:Mr.Z-man#Future_of_Mr.Z-bot_tasks, Alex is open to the idea of adding us as project maintainers. I'll go ahead and create a card for implementing pageview API support in popularpages.

@kaldari, @DannyH, does this ticket need an estimation? We already have a blocking task now, I see. Shouldn't we just close this one as done? I did some investigation on this (in a prior comment).

kaldari closed this task as Resolved. Oct 29 2015, 9:12 PM
kaldari moved this task from In Development to Q1 2018-19 on the Community-Tech-Sprint board.