
Wikipedia.org Portal Visitors' Session Lengths (Redux)
Closed, Resolved | Public | 6 Estimated Story Points

Description

Background

In T134301, we looked at the distribution of session lengths of Wikipedia.org Portal visitors in May 2016, using the event logging data collected through this schema. The codebase for this analysis is on GitHub. The first draft of the report was never properly finished (i.e., it was not published to Commons like some other reports).

Objective

Your task is to reproduce the analysis, but for June and July data (event logging keeps a 90-day backlog). If you can find additional insights that were not in the original report, great! You could even check whether the language detection deployment (T133432) on June 2nd had an effect on session lengths.

Feel free to make a folder called "Session Lengths v2" in https://github.com/wikimedia-research/Discovery-Research-Portal/tree/master/Analyses to store your code and report in. Once the report has been reviewed, it can be published on Commons and linked to the Analysis section of this page.
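As a starting point for the objective above, here is a minimal sketch of one way to derive session lengths and compare them around the June 2nd deployment. It is not the method used in the original report: the record layout (a session ID plus an ISO timestamp), the sample events, and the function names `session_lengths` and `median_by_period` are all assumptions made for illustration, and the real event logging schema may use different fields.

```python
from collections import defaultdict
from datetime import date, datetime
from statistics import median

# Hypothetical sample records: (session_id, ISO timestamp).
# Real event logging field names and formats may differ.
events = [
    ("a", "2016-06-01T10:00:00"),
    ("a", "2016-06-01T10:00:45"),
    ("b", "2016-06-10T08:00:00"),
    ("b", "2016-06-10T08:02:00"),
    ("c", "2016-06-15T09:00:00"),
]

# Language detection deployment date mentioned in the task (T133432).
DEPLOY_DATE = date(2016, 6, 2)


def session_lengths(records):
    """Map each session ID to (first event time, length in seconds).

    Length is defined here as last event minus first event, so a
    session with a single event has length 0.
    """
    stamps = defaultdict(list)
    for session_id, ts in records:
        stamps[session_id].append(datetime.fromisoformat(ts))
    return {
        sid: (min(times), (max(times) - min(times)).total_seconds())
        for sid, times in stamps.items()
    }


def median_by_period(lengths, cutoff=DEPLOY_DATE):
    """Median session length for sessions starting before vs. on/after cutoff."""
    before = [secs for start, secs in lengths.values() if start.date() < cutoff]
    after = [secs for start, secs in lengths.values() if start.date() >= cutoff]
    return {
        "before": median(before) if before else None,
        "after": median(after) if after else None,
    }


lengths = session_lengths(events)
summary = median_by_period(lengths)
```

A difference in medians alone would not show that the deployment caused a change; it only indicates where a closer look (for example, at the full distributions) might be worthwhile.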

Event Timeline

debt updated the task description.
mpopov renamed this task from Wikipedia.org Portal Visitors' Session Lengths to Wikipedia.org Portal Visitors' Session Lengths (Redux). Aug 16 2016, 6:50 PM
debt triaged this task as Medium priority. Aug 16 2016, 8:15 PM

Reproduced:


There are many interesting related problems to investigate based on this one, and I can probably create 2-3 subtasks. @debt and @mpopov, please let me know if there is something more important or urgent for me to do or learn; I can put this aside.


I put the investigation of operating systems and browsers in a second report.

Would we be able to filter out the "other" page views and do this analysis again?

The treatment of "other" pageviews is described in this ticket: T143605

@debt Let's talk about it this afternoon. I think event logging and page views are different.