In T134301, we looked at the distribution of session lengths of Wikipedia.org Portal visitors in May 2016, using the event logging data collected through [[ https://meta.wikimedia.org/wiki/Schema:WikipediaPortal | this schema ]]. The codebase for this analysis is [[ https://github.com/wikimedia-research/Discovery-Research-Portal/tree/master/Analyses/Session%20Length | on GitHub ]]. The first draft of [[ https://github.com/wikimedia-research/Discovery-Research-Portal/blob/master/Analyses/Session%20Length/report.pdf | the report ]] was never properly finished, that is, it was never published to [[ https://commons.wikimedia.org/wiki/Main_Page | Commons ]] like [[ https://commons.wikimedia.org/wiki/File:Report_on_Cirrus_Search_TextCat_AB_Test_-_Language_Detection_on_English,_French,_Spanish,_Italian,_and_German_Wikipedias.pdf | some ]] [[ https://commons.wikimedia.org/wiki/File:Wikipedia_Portal_Test_of_Language_Detection_and_Primary_Link_Resorting.pdf | other ]] [[ https://commons.wikimedia.org/wiki/File:From_Zero_to_Hero_-_Anticipating_Zero_Results_From_Query_Features,_Ignoring_Content.pdf | reports ]].
Your task is to reproduce the analysis using June and July data (event logging keeps a 90-day backlog). If you find additional insights that were not in the original report, great! You could also check whether the language detection deployment (T133432) on June 2nd had an effect on session lengths.
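As a starting point for the before/after comparison, here is a minimal sketch in Python. It is not taken from the original codebase. It assumes each event has already been reduced to a hypothetical `(session_id, timestamp)` pair, that a session's length is the time between its first and last event, and that the deployment year is 2016 (inferred from the May 2016 analysis). The function names are illustrative only.

```python
from collections import defaultdict
from datetime import date, datetime
from statistics import median

# Assumption: the June 2nd deployment (T133432) was in 2016.
DEPLOY_DATE = date(2016, 6, 2)


def session_lengths(events):
    """Map session_id -> (first_timestamp, length_in_seconds).

    `events` is an iterable of (session_id, datetime) pairs; this shape
    is hypothetical and not the raw schema of the event logging table.
    """
    bounds = {}
    for sid, ts in events:
        first, last = bounds.get(sid, (ts, ts))
        bounds[sid] = (min(first, ts), max(last, ts))
    return {
        sid: (first, (last - first).total_seconds())
        for sid, (first, last) in bounds.items()
    }


def median_by_period(lengths, cutoff=DEPLOY_DATE):
    """Median session length for sessions starting before vs. on/after cutoff."""
    groups = defaultdict(list)
    for first, seconds in lengths.values():
        key = "before" if first.date() < cutoff else "after"
        groups[key].append(seconds)
    return {period: median(values) for period, values in groups.items()}


# Toy data, made up purely for illustration.
events = [
    ("a", datetime(2016, 6, 1, 10, 0, 0)),
    ("a", datetime(2016, 6, 1, 10, 0, 30)),
    ("b", datetime(2016, 6, 1, 12, 0, 0)),
    ("b", datetime(2016, 6, 1, 12, 1, 0)),
    ("c", datetime(2016, 6, 3, 9, 0, 0)),
    ("c", datetime(2016, 6, 3, 9, 0, 10)),
]
lengths = session_lengths(events)
summary = median_by_period(lengths)
```

A difference in medians alone does not show that the deployment caused a change; comparing the full distributions and accounting for seasonality would be a sensible next step.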
Feel free to make a folder called "Session Lengths v2" in https://github.com/wikimedia-research/Discovery-Research-Portal/tree/master/Analyses to store your code and report in.