Thu, Jun 13
How frequently do users use the "group" option either as a filter or as a user preference?
Tue, Jun 11
@kzimmerman I've completed calculating May 2019 metrics and added the updated numbers and analysis notes to the board deck.
Mon, Jun 10
Wed, Jun 5
@phuedx - Do you know which user property controls the "group" preference for this question?
How frequently do users use the "group" option either as a filter or as a user preference? priority high
Mon, Jun 3
I discussed the sampling rate for this with the Product Analytics team. Since all the eventlogging data was moved to Hadoop, there are fewer concerns on the Analytics Engineering side about sending too much data. Unless there are other concerns (happy to discuss), I think it's OK to keep the sampling rate at 50% for navigation links outside the hamburger menu as well.
Fri, May 31
Good news! I figured out access to the user properties table. I'll work on the questions that require info in this table starting with the following one marked as high priority:
How many users have enabled enhanced RC?
How frequently do users use the "group" option either as a filter or as a user preference?
Thu, May 30
Wed, May 29
Tue, May 28
Tue, May 21
Awesome, thanks @mforns! Looks good. Just a few questions/comments:
Sun, May 19
May 14 2019
Which filters are used and how frequently?
@phuedx - Just confirming that I'm currently working on "Which filters are used and how frequently?" I will post results soon.
Apr 30 2019
I've updated the readers metrics for March 2019 (slide 24, plus the readers interaction numbers on slide 25). Note: These are for the calendar month of March 2019 and, unlike the readers data for previous months, are not normalized.
Apr 29 2019
Apr 26 2019
Pageview charts are updated through March 2019 and added to the current slide deck.
Apr 25 2019
Apr 24 2019
@kzimmerman - I added a stacked bar chart showing interactions for the past four years through Feb 2019 to the draft slide deck. Let me know if you have any questions or need any adjustments for the metrics presentation.
Apr 18 2019
Apr 17 2019
And will create another task to add the user_tenure_field and also the edit_tags (Visualeditor, Wikitext, etc.) that will be added to the next snapshot of MediaWiki history.
If you think of any other dimension or metric to include, you could add it to that task.
Would that be OK?
Sounds good to me!
Data exploration results summarized and posted on meta. Marking as done but let me know if you have any questions!
Apr 16 2019
Thanks @mforns! And sorry for the delay. I'm reviewing edits_hourly in Turnilo and it looks good to me so far. The hourly resolution doesn't seem to affect performance much when I test adding various splits and filters, so I'd recommend keeping it unless there are any major concerns.
Apr 1 2019
Mar 27 2019
Mar 26 2019
Here's the updated edit table schema with suggested transforms to ingest directly from mediawiki_history into Druid.
Mar 12 2019
Thanks for the update re the timeline. A meeting would be great - I’ll set up a time for us and @Neil_P._Quinn_WMF to meet this week if possible. I’ve worked with Neil to identify the simplified list of mediawiki_history dimensions and mapped those to Druid expressions. I'll share them with you soon and we can discuss at the meeting.
Feb 24 2019
Feb 21 2019
@mforns Thanks! Yes, happy to discuss and coordinate on this. I reviewed this task with @Neil_P._Quinn_WMF today. I'm going to first work on defining our desired dimensions and transformations based on the type of queries we'd want to run and how the data will be used, which might help inform the best method for loading the dataset. I’ll reach out to discuss once we have a better idea of the needed transforms if that works for you.
Feb 7 2019
Feb 6 2019
Jan 25 2019
Quick summary of work progress since my last post:
Jan 22 2019
Jan 21 2019
I redid the analysis of the desktop search metrics on Commons (clickthrough rate, zero results rate, and proportion of searches with clicks to see other pages of the search results) to determine changes following the bug fix deployed in September 2018. See summary of results posted to T188421#4897244. Please let me know if you have any questions or need any further details.
T189242 was fixed in late September 2018. To see how metrics have changed, I re-computed several desktop search metrics on Wikimedia Commons with available eventlogging data from October 18, 2018 to January 16, 2019. Metrics were compared to English Wikipedia desktop searches.
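For reference, a minimal sketch of how the three metrics named above could be computed. This is not the code used for the analysis: the `Search` record and its field names are hypothetical and do not reflect the actual eventlogging schema, and computing CTR only over searches that returned results is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Search:
    """One hypothetical search event (illustrative fields, not the real schema)."""
    results_returned: int  # number of results shown to the user
    clicked_result: bool   # user clicked through to a result
    paged: bool            # user clicked to see another page of results


def search_metrics(searches):
    """Return zero results rate, clickthrough rate, and paging proportion."""
    total = len(searches)
    if total == 0:
        raise ValueError("no searches to summarize")

    with_results = [s for s in searches if s.results_returned > 0]
    zero_results_rate = (total - len(with_results)) / total

    # Assumption: CTR and paging are measured over searches with >= 1 result.
    denom = len(with_results)
    clickthrough_rate = sum(s.clicked_result for s in with_results) / denom if denom else 0.0
    paging_rate = sum(s.paged for s in with_results) / denom if denom else 0.0

    return {
        "zero_results_rate": zero_results_rate,
        "clickthrough_rate": clickthrough_rate,
        "paging_rate": paging_rate,
    }


sample = [
    Search(10, True, False),
    Search(0, False, False),
    Search(5, False, True),
    Search(3, True, False),
]
metrics = search_metrics(sample)
```

Running the same function separately on Commons and English Wikipedia searches would give the side-by-side comparison described above.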
Jan 15 2019
Dec 12 2018
Oct 24 2018
I further investigated the recent spikes identified in August 2018 in (return) requests from Spain, Japan, France, and India. For example, there was a spike in average user return time seen in Japan on desktop around 2018-08-10 to all Wikipedia projects.
I updated the time series graphs of the average next return time within 31 days, using all the available data (from December 2016 through August 2018). See T184677#4432372 for original time series graphs.
Oct 8 2018
Oct 4 2018
Sep 18 2018
I finished creating time-series graphs looking at users' average next return time (within 31 days and 7 days) for a variety of countries and projects from December 2016 through July 2018.
Aug 28 2018
I plotted histograms showing the frequency of next return data (frequency among unique devices seen on a particular date and returning within 31 days) on all Wikipedia projects for a few days around identified spikes in Indonesia and Bangladesh.
Jul 26 2018
I reviewed pageview trends for previous years to determine any seasonality effects that might account for some of the upward pageview trends on mobile seen in late April and May 2018. Below are the graphs for year over year pageviews on mobile (web + apps) between the months of April and July for Indonesia, Japan, India, and Bangladesh.
Jul 20 2018
Jul 19 2018
Jul 18 2018
Here are the results looking at average user return time within 31 days, based on the last-access data recorded in tbayer.webrequest_extract_bak.
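As a rough illustration of the metric, here is a minimal sketch of an average return time restricted to a window. It is not the query run against the table above; the `(seen_on, returned_on)` pair layout and the function name are assumptions made for the example.

```python
from datetime import date


def average_return_days(visits, window_days=31):
    """Average days until the next return, counting only returns within the window.

    visits: iterable of (seen_on, returned_on) date pairs, one per device;
    returned_on is None when no later visit was observed (hypothetical layout).
    """
    gaps = []
    for seen_on, returned_on in visits:
        if returned_on is None:
            continue  # device never came back in the observed data
        gap = (returned_on - seen_on).days
        if 0 < gap <= window_days:
            gaps.append(gap)
    return sum(gaps) / len(gaps) if gaps else None


visits = [
    (date(2018, 7, 1), date(2018, 7, 3)),   # returns after 2 days
    (date(2018, 7, 1), date(2018, 7, 11)),  # returns after 10 days
    (date(2018, 7, 1), date(2018, 9, 1)),   # outside both windows
    (date(2018, 7, 1), None),               # no return observed
]
avg_31 = average_return_days(visits, window_days=31)
avg_7 = average_return_days(visits, window_days=7)
```

Changing `window_days` switches between a 31-day and a 7-day window; returns outside the window are excluded rather than capped.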
Jun 28 2018
Jun 13 2018
Thanks @Nuria! Yes, that makes sense.
May 30 2018
Here are the preliminary results for the daily pageviews and unique device metrics.
May 1 2018
Apr 27 2018
Here are the updated daily Facebook-referred pageviews based on data through April 25th.
Apr 24 2018
Apr 23 2018
Apr 19 2018
Apr 18 2018
I took a look at the top Facebook-referred pages on English Wikipedia before and after the rollout date.
Apr 11 2018
Initial results of the daily number of enwiki pageviews in the US with a referrer from Facebook.
Apr 9 2018
Apr 5 2018
Here’s an estimate of the CTR on Commons compared to English Wikipedia, found by joining the webrequest and CirrusSearchRequestSet data (thanks @chelsyx for all the help and suggestions!). This CTR includes both clicks to open pages and clicks on thumbnails to open media viewer.
Mar 22 2018
Mar 9 2018
Mar 2 2018
Here is a plot of the search-wise and session-wise CTR on Commons from Nov 2017 to now. It shows a sudden drop on Dec 14th, which looks in line with @EBernhardson's comment re the deploy date of the change to include multimedia files as part of default search on Commons. @EBernhardson Let me know if you have any additional thoughts. I'll work with @chelsyx to investigate further.
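To make the two denominators concrete, here is a minimal sketch of the difference between search-wise and session-wise CTR. It is an illustration under assumptions, not the analysis code: the `(session_id, clicked)` event layout and the function name are hypothetical.

```python
from collections import defaultdict


def ctr_by_search_and_session(events):
    """Return (search-wise CTR, session-wise CTR).

    events: iterable of (session_id, clicked) pairs, one pair per search
    (hypothetical layout).
    """
    events = list(events)
    if not events:
        raise ValueError("no search events")

    # Search-wise: share of individual searches that got a click.
    search_ctr = sum(1 for _, clicked in events if clicked) / len(events)

    # Session-wise: share of sessions with at least one clicked search.
    session_clicked = defaultdict(bool)
    for session_id, clicked in events:
        session_clicked[session_id] = session_clicked[session_id] or clicked
    session_ctr = sum(session_clicked.values()) / len(session_clicked)

    return search_ctr, session_ctr


events = [("a", False), ("a", True), ("a", False), ("b", False)]
search_ctr, session_ctr = ctr_by_search_and_session(events)
```

A session with several reformulated searches and one eventual click lowers the search-wise figure but still counts as a success session-wise, so the two series can diverge.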
Feb 28 2018
Thanks for the review @chelsyx! Revised analysis: https://github.com/MeganNeisler/SDoC-Baseline-Metrics-Redux/tree/master/T187827
Feb 27 2018
Feb 26 2018
Reproduced. See initial codebase and output: https://github.com/MeganNeisler/SDoC-Baseline-Metrics-Redux/tree/master/T187827. Any feedback and suggestions are welcome!
Feb 21 2018
Thanks @mpopov and @chelsyx for the review and feedback! See current updated codebase and output: https://github.com/MeganNeisler/SDoC-Baseline-Metrics-Redux/tree/master/T186575.