Metrics: Get numbers for New Filters usage (and compare to baseline)
Closed, ResolvedPublic

Description

To measure New Filters beta tool usage, we will pull baseline figures from before the beta release and compare them to recent stats among beta users.

What we want to know

  • Tool usage profile: Are people using more tools per session or per capita than before? What filters and other tools (e.g., highlighting) do reviewers use most often? (I suppose we could simply rank the tools selected, most to least popular?) How does this compare to pre-beta?
  • Highlighting usage: What are the top qualities that people choose to highlight? What proportion of beta users (sessions?) employ highlighting?
  • Page popularity (sessions): Can we establish a valid metric, and then a baseline stat, for something we might call "page sessions"? (E.g., a period of RC page and other page use that is considered terminated if the user does not return to the RC page for 30 mins.) [ROAN HAS AN IDEA FOR HOW TO DO THIS CRUDELY]
  • Tool engagement: What proportion of "page sessions" use only default settings vs. those that involve tool selections? (If this goes up with the new system, we can conclude the interface has made the tools more accessible.) [ROAN HAS AN IDEA FOR HOW TO DO THIS CRUDELY]
  • Session length: Another traditional measure of engagement is length of session, the theory being that if users like the tool they will use it longer.
  • How many distinct users use the RC Page?
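One possible reading of the "page sessions" idea above, as a minimal sketch: a user's session ends once they have been away from the RC page for more than 30 minutes. The `(user_id, timestamp)` input shape and the function name are assumptions made for illustration, not the schema of any existing log table.

```python
from datetime import datetime, timedelta

# Assumption: 30 minutes without returning to the RC page ends a "page session".
SESSION_TIMEOUT = timedelta(minutes=30)

def count_page_sessions(visits):
    """Count sessions from (user_id, timestamp) RC page visits.

    A new session starts whenever the gap since the same user's
    previous visit exceeds SESSION_TIMEOUT.
    """
    last_seen = {}
    sessions = 0
    for user_id, ts in sorted(visits, key=lambda v: v[1]):
        previous = last_seen.get(user_id)
        if previous is None or ts - previous > SESSION_TIMEOUT:
            sessions += 1
        last_seen[user_id] = ts
    return sessions

# Hypothetical visits, for illustration only.
visits = [
    ("alice", datetime(2017, 7, 1, 9, 0)),
    ("alice", datetime(2017, 7, 1, 9, 20)),  # 20 min gap: same session
    ("alice", datetime(2017, 7, 1, 10, 0)),  # 40 min gap: new session
    ("bob", datetime(2017, 7, 1, 9, 5)),
]
print(count_page_sessions(visits))  # 3
```

The same pass also yields distinct users (the keys of `last_seen`), so sessions per capita could come out of one scan.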

Research parameters

  • What wikis should we study: En.wiki, Czech, he.wiki, de.wiki, fr.wiki
  • What period: 2 weeks, to avoid what look like day-of-week fluctuations.
  • When: before the beta release and as recently as practical.
  • Who: On the theory that beta users are a more advanced group, it would be ideal if we could study before and after results for the same group of people.

Event Timeline

Questions/Issues

  • How can the tool usage numbers best be expressed? One nice way might be in the % of total tool "settings" a given setting represents. So, if filters were selected 100K times in a week, and if Newcomer was selected 5K times, then its usage would be 5%. Conceptually, we're talking about a pie chart.
  • I'm suggesting "page sessions" instead of page views as a gross measure of page popularity for a number of reasons. E.g., page views might actually go down after beta release if users are finding what they need more easily.
    • What is currently being counted as a "page view" now anyway? E.g., is it every time the page reloads? Or every time a user chooses a new tool? etc.
  • Do we need to produce the baseline figures now for all wikis we will ever want to measure? Or will the data be available indefinitely?
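The "share of all settings" idea from the first bullet can be sketched as below. Only the Newcomer figure (5K of 100K) comes from the example in the text; the `other_filters` bucket and the function name are placeholders.

```python
def setting_shares(selection_counts):
    """Return each setting's share of all selections, as a percentage."""
    total = sum(selection_counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in selection_counts.items()}

# Hypothetical counts: Newcomer selected 5K times out of 100K selections overall.
counts = {"newcomer": 5_000, "other_filters": 95_000}
shares = setting_shares(counts)
print(shares["newcomer"])  # 5.0
```

Because the shares sum to 100, they map directly onto the pie chart the bullet describes.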

If selecting a tool represents a page view we need to consider that some filters may be counted multiple times for the same final user intent. For example, if a user wants to filter by X, Y and Z, and activates the filters in that order this will result in one view for X, another for X+Y and another for X+Y+Z. This may result in a total of 3 views for X, 2 views for Y and 1 view for Z; and we may reach the conclusion that X is a much more popular filter than Z.

I guess it is hard to distinguish which visits are part of selecting a set of filters, and which ones are used to do actual review work. Maybe we can just accept such noise in the data, or try to combine the numbers with tool engagement numbers or other de-duplication strategy depending on how much impact we think that noise could have.
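To make the inflation concrete, here is a small sketch that counts filters once per page view and then once for the final state only. The list-of-sets representation of page views is an assumption for illustration.

```python
from collections import Counter

def count_per_view(views):
    """Count each filter once for every page view in which it is active."""
    counts = Counter()
    for active_filters in views:
        counts.update(active_filters)
    return counts

# Activating X, then Y, then Z produces three page views.
views = [{"X"}, {"X", "Y"}, {"X", "Y", "Z"}]

per_view = count_per_view(views)
print(per_view["X"], per_view["Y"], per_view["Z"])  # 3 2 1

# Counting only the final state matches the user's actual intent.
final_only = count_per_view(views[-1:])
print(final_only["X"], final_only["Y"], final_only["Z"])  # 1 1 1
```

The hard part, as noted above, is knowing which view is the "final" one; this sketch simply assumes it is the last view in the sequence.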

In T170214#3424124, @Pginer-WMF wrote:

If selecting a tool represents a page view we need to consider that some filters may be counted multiple times for the same final user intent. For example, if a user wants to filter by X, Y and Z, and activates the filters in that order this will result in one view for X, another for X+Y and another for X+Y+Z. This may result in a total of 3 views for X, 2 views for Y and 1 view for Z; and we may reach the conclusion that X is a much more popular filter than Z.

@Catrope is what Pau says here correct, in terms of the way the filters will be counted? That would throw off the whole exercise pretty significantly.

@Catrope is what Pau says here correct, in terms of the way the filters will be counted? That would throw off the whole exercise pretty significantly.

Yes, he is correct. We could mitigate that a bit by counting how many sessions a filter was used in.

In T170214#3428362, @Catrope wrote:

Yes, he is correct. We could mitigate that a bit by counting how many sessions a filter was used in.

Hmmm. So if I understand this, what you're saying is that without correcting as you suggest, we'd expect that filters at the top of the menu would always be more "popular," since they will be selected first and then counted again and again. Is that right? That's no good at all.

As to your suggested fix: when you say "sessions," are you talking about the hourly hack we discussed? I.e., where a period of RC Page use by an individual within a given clock hour counts as a "session". If so, I'm imagining that what we'd get from correcting with this method would be something like this:

  • there were 1M "sessions" in the two-week period.
  • Filters A, B, C were used in 90% of sessions.
  • Filters D, E, F were used in 80% of sessions
  • Etc.

Would it be like that? If so, what we're losing I guess is that if the reviewer used filter A in one search during a session and filter B in 100 searches, we wouldn't know. We'd count them the same. Is that right?
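A rough sketch of the session-based correction discussed here: a "session" is one user within one time bucket (one clock hour by default), and a filter counts at most once per session. The event tuple shape is hypothetical and does not reflect the real logging schema.

```python
from collections import defaultdict

def filter_session_shares(events, bucket_seconds=3600):
    """events: iterable of (user_id, unix_timestamp, active_filters).

    Returns {filter: fraction of sessions in which it appeared at least once}.
    """
    sessions = defaultdict(set)
    for user_id, ts, filters in events:
        key = (user_id, ts // bucket_seconds)  # same user, same time bucket
        sessions[key].update(filters)
    total = len(sessions)
    counts = defaultdict(int)
    for filters in sessions.values():
        for name in filters:
            counts[name] += 1
    return {name: n / total for name, n in counts.items()}

# Hypothetical events, for illustration only.
events = [
    ("u1", 0, {"A"}),
    ("u1", 100, {"A", "B"}),  # same hour as above: merged into one session
    ("u1", 4000, {"B"}),      # next clock hour: a second session
    ("u2", 50, set()),        # session with default settings only
]
hourly = filter_session_shares(events)
print(round(hourly["A"], 2), round(hourly["B"], 2))  # 0.33 0.67
```

As the comment above points out, this discards frequency within a session: filter A used once and filter B used 100 times in the same hour count the same. A smaller `bucket_seconds` narrows that blind spot at the cost of splitting real sessions.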

I think those results would still be useful. We can hypothesize that it all equals out. What do others think?

This is a bit of a radical suggestion, but we could try to do this part in the front end, and depend on sending the data on closing the menu.

So a user selecting A+B+C will likely open the filter popup, select A, then B, then C (without closing it) then close the popup -- and at that point, we'll send the data for A, B and C (and we can even send another schema for the combination?)

There are two side effects. First, users may close and reopen the popup multiple times; we can try to mitigate that by resending the data only if there was an actual change. Second, users may close the popup to change the view (not clicking on the view button, etc.), in which case we will report back A+B+C (filters) and then again A+B+C+namespaceA+namespaceB the next time the user closes the menu.
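The "send on close, but only if something changed" logic might look like the sketch below. It is written in Python purely to show the control flow; the real front end is JavaScript, and the class, method, and callback names here are all hypothetical.

```python
class FilterMenuLogger:
    """Hypothetical logger: reports the selected filters once per menu close."""

    def __init__(self, send):
        self._send = send        # callback that ships one event
        self._last_sent = None   # selection reported most recently

    def on_menu_closed(self, selected_filters):
        selection = frozenset(selected_filters)
        if selection == self._last_sent:
            return False         # nothing changed since the last report
        self._send(sorted(selection))
        self._last_sent = selection
        return True

sent = []
logger = FilterMenuLogger(sent.append)
logger.on_menu_closed(["A", "B", "C"])                # logged
logger.on_menu_closed(["C", "B", "A"])                # reopened, unchanged: skipped
logger.on_menu_closed(["A", "B", "C", "namespaceA"])  # logged again
print(len(sent))  # 2
```

The third call shows the second side effect described above: A, B and C are reported again once a namespace is added, so downstream counting would still need to treat consecutive reports from one user with care.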

Does that make sense?

Highlight usage

This one is easy, because there's so little data, and it doesn't suffer from the double selection issue. We started collecting this data on 2017-06-14; between then and the present, only 75 distinct users have used the highlighting feature, and highlights were set a grand total of only 506 times. By far the most popular filters to highlight by are the ORES ones, as well as newcomers and unregistered users.

Because there is so little data, I didn't bother to slice it by wiki, by week, by session (user+hour) or anything like that, so the numbers below are highlight events (i.e. the number of times someone set a highlight for a given property). The wikis where highlighting was used the most were the Portuguese (166 events) and English Wikipedias (105), by far (#3 is Farsi with 29). I also included histograms of which days had more or fewer highlight events and unique highlight users at the bottom of the data below.

-- In the list below, ignore the "hide" prefix: e.g. "hideanons" means highlighting anonymous users,
-- "hidenewpages" means highlighting new pages, etc.
mysql:research@s3-analytics-slave [log]> select SUBSTRING_INDEX(event_filters, '"name":', -1) as filter, count(*) as c from ChangesListHighlights_16484288 where event_action='set' group by filter order by c desc;
+------------------------------------------+----+
| filter                                   | c  |
+------------------------------------------+----+
| "damaging__verylikelybad"}]              | 60 |
| "damaging__likelybad"}]                  | 56 |
| "damaging__maybebad"}]                   | 48 |
| "goodfaith__verylikelybad"}]             | 45 |
| "goodfaith__likelybad"}]                 | 43 |
| "userExpLevel__newcomer"}]               | 38 |
| "goodfaith__maybebad"}]                  | 31 |
| "registration__hideanons"}]              | 25 | -- Anonymous users (Unregistered)
| "reviewStatus__hideunpatrolled"}]        | 19 |
| "changeType__hidenewpages"}]             | 18 |
| "userExpLevel__experienced"}]            | 15 |
| "userExpLevel__learner"}]                | 14 |
| "changeType__hidelog"}]                  | 12 |
| "registration__hideliu"}]                | 11 | -- Logged-in users (Registered)
| "changeType__hideWikibase"}]             | 11 |
| "damaging__likelygood"}]                 |  8 |
| "automated__hidebots"}]                  |  7 |
| "reviewStatus__hidepatrolled"}]          |  6 |
| "lastRevision__hidelastrevision"}]       |  6 |
| "watchlist__watched"}]                   |  6 |
| "goodfaith__likelygood"}]                |  5 |
| "changeType__hidepageedits"}]            |  4 |
| "authorship__hidemyself"}]               |  3 |
| "changeType__hidecategorization"}]       |  3 |
| "significance__hideminor"}]              |  3 |
| "authorship__hidebyothers"}]             |  2 |
| "lastRevision__hidepreviousrevisions"}]  |  2 |
| "watchlist__watchednew"}]                |  2 |
| "automated__hidehumans"}]                |  1 |
| "watchlist__notwatched"}]                |  1 |
| "significance__hidemajor"}]              |  1 |
+------------------------------------------+----+
31 rows in set (0.00 sec)

-- How many distinct users used highlighting
mysql:research@s3-analytics-slave [log]> select count(distinct wiki, event_userId) from ChangesListHighlights_16484288 where event_action='set';
+------------------------------------+
| count(distinct wiki, event_userId) |
+------------------------------------+
|                                 75 |
+------------------------------------+

mysql:research@s3-analytics-slave [log]> select event_action, count(*) from ChangesListHighlights_16484288 group by event_action;
+--------------+----------+
| event_action | count(*) |
+--------------+----------+
| clear        |      102 |
| clearAll     |      124 |
| set          |      506 |
+--------------+----------+
3 rows in set (0.00 sec)

-- Which wikis had the most highlighting events
mysql:research@s3-analytics-slave [log]> select wiki, count(*) as c from ChangesListHighlights_16484288 where event_action='set' group by wiki order by c desc;
+---------------+-----+
| wiki          | c   |
+---------------+-----+
| ptwiki        | 166 |
| enwiki        | 105 |
| fawiki        |  29 |
| wikidatawiki  |  20 |
| mswiki        |  15 |
| cswiki        |  13 |
| itwiki        |  12 |
| elwiki        |  12 |
| testwiki      |  12 |
| fawikisource  |  11 |
| frwiki        |  10 |
| cawiki        |  10 |
| ruwiki        |   9 |
| mediawikiwiki |   7 |
| dewikibooks   |   7 |
| nlwiki        |   7 |
| fawikibooks   |   7 |
| dawiki        |   6 |
| enwiktionary  |   6 |
| metawiki      |   5 |
| enwikisource  |   5 |
| commonswiki   |   4 |
| eswiki        |   4 |
| fiwiki        |   3 |
| ukwiki        |   3 |
| zhwiki        |   3 |
| jawiki        |   3 |
| dewiki        |   3 |
| hewiki        |   2 |
| tawiki        |   2 |
| arwiki        |   1 |
| azbwiki       |   1 |
| hiwiki        |   1 |
| kowiki        |   1 |
| plwiki        |   1 |
+---------------+-----+
35 rows in set (0.01 sec)

-- How many highlighting events occurred on each day
mysql:research@s3-analytics-slave [log]> select SUBSTRING(timestamp,1,8) as day, count(*) as c FROM ChangesListHighlights_16484288 where event_action='set' group by day order by day;
+----------+----+
| day      | c  |
+----------+----+
| 20170615 | 31 |
| 20170616 | 28 |
| 20170617 | 11 |
| 20170618 | 25 |
| 20170619 | 46 |
| 20170620 | 13 |
| 20170621 | 19 |
| 20170622 | 40 |
| 20170623 | 12 |
| 20170624 |  5 |
| 20170625 |  5 |
| 20170626 | 40 |
| 20170627 |  9 |
| 20170628 |  6 |
| 20170629 |  9 |
| 20170630 | 29 |
| 20170701 |  5 |
| 20170702 | 40 |
| 20170703 | 23 |
| 20170704 | 16 |
| 20170705 | 13 |
| 20170706 | 16 |
| 20170707 |  6 |
| 20170708 |  4 |
| 20170709 |  4 |
| 20170710 | 40 |
| 20170711 | 11 |
+----------+----+
27 rows in set (0.00 sec)

-- How many distinct users used highlighting on each day
mysql:research@s3-analytics-slave [log]> select SUBSTRING(timestamp,1,8) as day, count(distinct event_userId) as c FROM ChangesListHighlights_16484288 where event_action='set' group by day order by day;
+----------+----+
| day      | c  |
+----------+----+
| 20170615 |  6 |
| 20170616 |  4 |
| 20170617 |  3 |
| 20170618 |  6 |
| 20170619 |  7 |
| 20170620 |  4 |
| 20170621 |  4 |
| 20170622 |  5 |
| 20170623 |  3 |
| 20170624 |  2 |
| 20170625 |  1 |
| 20170626 |  7 |
| 20170627 |  2 |
| 20170628 |  1 |
| 20170629 |  2 |
| 20170630 |  7 |
| 20170701 |  3 |
| 20170702 |  7 |
| 20170703 |  5 |
| 20170704 |  3 |
| 20170705 |  3 |
| 20170706 |  6 |
| 20170707 |  3 |
| 20170708 |  2 |
| 20170709 |  2 |
| 20170710 | 12 |
| 20170711 |  4 |
+----------+----+
27 rows in set (0.00 sec)

In T170214#3428591, @Mooeypoo wrote:

This is a bit of a radical suggestion, but we could try to do this part in the front end, and depend on sending the data on closing the menu.

Ingenious! Is there a way to also count Saved Filters with that? Those, of course, don't require opening the panel.

In T170214#3428591, @Mooeypoo wrote:

This is a bit of a radical suggestion, but we could try to do this part in the front end, and depend on sending the data on closing the menu.

Ingenious! Is there a way to also count Saved Filters with that? Those, of course, don't require opening the panel.

We could probably get away with that by putting the logging operation where we tell the system to load saved queries.

Also, just to emphasize, this should only be for this specific count; the front end isn't as robust in keeping data as the backend is, because it may in these cases not log things like "from" / "days" / "limit" and any other data that is not directly implemented in the menus. It's a good method to count this specific requirement, though. We should be able to do that.

Also, I'm suggesting this mainly because I think working with sessions is probably harder but I don't exactly know if that's the case, so ping @Catrope for his opinion.

In T170214#3428597, @Catrope wrote:
75 distinct users have used the highlighting feature, and highlights were set a grand total of only 506 times.

So that's 75 users on all wikis over the last month? That sounds mighty low for such a useful feature. But before we start figuring out how to make highlighting more visible, we'll want some context for these numbers, such as:

  • How many RC Page users who have the beta have been active in that time overall? E.g., what % of users used Highlighting. I assume it's a small percentage, but we don't really know, do we?
  • How many "sessions" with the beta did users engage in during that time?

In T170214#3428362, @Catrope wrote:

Yes, he is correct. We could mitigate that a bit by counting how many sessions a filter was used in.

Hmmm. So if I understand this, what you're saying is that without correcting as you suggest, we'd expect that filters at the top of the menu would always be more "popular," since they will be selected first and then counted again and again. Is that right? That's no good at all.

Yes, probably.

As to your suggested fix: when you say "sessions," are you talking about the hourly hack we discussed? I.e., where a period of RC Page use by an individual within a given clock hour counts as a "session".

Yes.

If so, I'm imagining that what we'd get from correcting with this method would be something like this:

  • there were 1M "sessions" in the two-week period.
  • Filters A, B, C were used in 90% of sessions.
  • Filters D, E, F were used in 80% of sessions
  • Etc.

Would it be like that? If so, what we're losing I guess is that if the reviewer used filter A in one search during a session and filter B in 100 searches, we wouldn't know. We'd count them the same. Is that right?

Yes, that's right.

I think those results would still be useful. We can hypothesize that it all equals out. What do others think?

To make it more meaningful we could use 1-minute sessions for this purpose (or some other convenient but relatively small number).

In T170214#3428591, @Mooeypoo wrote:

This is a bit of a radical suggestion, but we could try to do this part in the front end, and depend on sending the data on closing the menu.

Ingenious! Is there a way to also count Saved Filters with that? Those, of course, don't require opening the panel.

We could probably get away with that by putting the logging operation where we tell the system to load saved queries.

Also, just to emphasize, this should only be for this specific count; the front end isn't as robust in keeping data as the backend is, because it may in these cases not log things like "from" / "days" / "limit" and any other data that is not directly implemented in the menus. It's a good method to count this specific requirement, though. We should be able to do that.

Also, I'm suggesting this mainly because I think working with sessions is probably harder but I don't exactly know if that's the case, so ping @Catrope for his opinion.

Yes, this would be much better and easier. But we can only use that strategy to improve our data collection for the future, not for the past. So it's useful if you want numbers next week/month (which you probably do!) but not if you want numbers right now (which you also do).

In T170214#3428597, @Catrope wrote:
75 distinct users have used the highlighting feature, and highlights were set a grand total of only 506 times.

So that's 75 users on all wikis over the last month? That sounds mighty low for such a useful feature. But before we start figuring out how to make highlighting more visible, we'll want some context for these numbers, such as:

Yeah, I was very surprised at how low it was. I looked at the data for traces of loss and didn't find any clear evidence of it. I did just try highlighting something and that hasn't shown up in the data yet, but I'm not sure how much this is supposed to lag by. I'll see if I can find validation error logs somewhere; those would tell us if we've been dropping lots of events somehow.

  • How many RC Page users who have the beta have been active in that time overall? E.g., what % of users used Highlighting. I assume it's a small percentage, but we don't really know, do we?
  • How many "sessions" with the beta did users engage in during that time?

I was going to get those numbers next as part of the "Tool usage profile" question; I just wanted to start with something simple.

Another data point that might be interesting, and that we could explore more (on other wikis) if we like: on enwiki, 217 users have enabled the highlight button at least once. A further 300 never enabled it but did open the dropdown at least once. So that's a total of 517 users who we can consider to have used the beta in the sense that they opened the filter dropdown (its main UI component). Compare that to 8602 users who have the beta feature enabled.
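For scale, the enwiki figures quoted above work out as follows. This is plain arithmetic on the three numbers in the paragraph; the variable names are only labels.

```python
beta_enabled = 8602    # users with the beta feature enabled on enwiki
tried_highlight = 217  # enabled the highlight button at least once
opened_only = 300      # opened the dropdown but never enabled highlighting

opened_dropdown = tried_highlight + opened_only

def pct(part, whole):
    """Percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

print(opened_dropdown)                        # 517
print(pct(opened_dropdown, beta_enabled))     # 6.0
print(pct(tried_highlight, beta_enabled))     # 2.5
print(pct(tried_highlight, opened_dropdown))  # 42.0
```

So roughly 6% of users with the beta enabled have opened the dropdown, and about 42% of those went on to try highlighting.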

I completely forgot that this was tracked. It's not part of our metrics tracking really, it's stored so that we remember to stop showing the pulsating dot after they've tried it once.

For the people who ended up clicking the highlight button (the group of 217), the number of times they opened the popup before they clicked the highlight button breaks down as follows:

mysql:research@s3-analytics-slave [enwiki]> select count(*), cast(v as int) as num from (select up1.up_value as v from user_properties as up1 where up1.up_property='rcenhancedfilters-seen-highlight-button-counter' and up1.up_value >= 1 and up1.up_user in (select up2.up_user from user_properties as up2 where up2.up_property='rcenhancedfilters-tried-highlight' and up2.up_value=1) ) as x group by num order by num;
+----------+------+
| count(*) | num  |
+----------+------+
|       96 |    1 | -- 96 people clicked it the first time they opened it
|       44 |    2 | -- 44 people clicked it the second time, etc.
|       23 |    3 |
|       13 |    4 |
|        8 |    5 |
|       10 |    6 |
|        1 |    7 |
|        2 |    8 |
|        5 |    9 |
|        1 |   10 |
|        4 |   12 |
|        2 |   13 |
|        1 |   14 |
|        1 |   18 |
|        1 |   36 |
|        1 |  321 |
+----------+------+
16 rows in set (0.03 sec)

(Strangely, this adds up to 213 rather than 217: for 4 users, we did not record any attempts to open the dropdown but did record that they enabled highlighting. Perhaps they followed a URL with ?highlight=1 in it.)
We show a pulsating dot on the button the 6th, 9th, 12th, etc. time (the spec asked for this to happen the 5th, 8th, 11th, etc. time, but I think we have an off-by-one error in this code).
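The two schedules can be written side by side as below. This only restates the behaviour described in the comment; it is not the actual extension code, and both function names are made up.

```python
def dot_per_spec(open_count):
    """Spec: pulse on the 5th, 8th, 11th, ... time the popup is opened."""
    return open_count >= 5 and (open_count - 5) % 3 == 0

def dot_as_observed(open_count):
    """Observed: pulse on the 6th, 9th, 12th, ... open, one later than spec."""
    return open_count >= 6 and (open_count - 6) % 3 == 0

print([n for n in range(1, 13) if dot_per_spec(n)])     # [5, 8, 11]
print([n for n in range(1, 13) if dot_as_observed(n)])  # [6, 9, 12]
```

A shift of exactly one is what you would expect if the counter is compared before it is incremented rather than after, though that is a guess about the cause.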

For people who never clicked the highlight button (the group of 300), the number of times they opened the popup breaks down as:

mysql:research@s3-analytics-slave [enwiki]> select count(*), cast(v as int) as num from (select up1.up_value as v from user_properties as up1 where up1.up_property='rcenhancedfilters-seen-highlight-button-counter' and up1.up_user not in (select up2.up_user from user_properties as up2 where up2.up_property='rcenhancedfilters-tried-highlight' and up2.up_value=1) ) as x group by num order by num;
+----------+------+
| count(*) | num  |
+----------+------+
|      146 |    1 |
|       63 |    2 |
|       24 |    3 |
|       18 |    4 |
|        6 |    5 |
|        8 |    6 |
|        6 |    7 |
|        5 |    8 |
|        3 |    9 |
|        2 |   10 |
|        2 |   11 |
|        2 |   12 |
|        2 |   13 |
|        1 |   14 |
|        2 |   15 |
|        3 |   17 |
|        2 |   18 |
|        1 |   22 |
|        1 |   46 |
|        1 |   70 |
|        1 |   92 |
|        1 |  113 |
+----------+------+
22 rows in set (0.01 sec)

jmatazzoni renamed this task from Get numbers for New Filters usage (and compare to baseline) to Metrics: Get numbers for New Filters usage (and compare to baseline). Jul 17 2017, 4:52 PM

@jmatazzoni What do you think of the data below? It's only enwiki for now and for the last full two weeks (July 17-30; a slightly different time frame might be better because of changes that happened on the 28th), but if you like it I'll rerun it for all the wikis you asked for.

Caveats: this is the simplest way of data gathering, so the things I'm counting are "events"/"hits", which has a lot of problems as we've previously discussed. Live update (if users find the magic flag, a few have) makes this worse. I'll work on counting "sessions" (i.e. groups of events by the same user in the same ten minutes) next, but the kinds of queries I'd do would be similar, so please look at that below (or at P5832 to view it in a larger area).

1-- Total number events on enwiki between July 17 and 30 inclusive, broken down by beta vs non-beta
2mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' group by event_enhancedFiltersEnabled;
3+------------------------------+----------+
4| event_enhancedFiltersEnabled | count(*) |
5+------------------------------+----------+
6| 0 | 86162 | -- Number of events in non-beta
7| 1 | 33314 | -- Number of events in beta
8+------------------------------+----------+
92 rows in set (6.15 sec)
10
11-- Type of change group:
12
13-- hidepageedits=1 ("Page edits" removed)
14mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_hidepageedits=1 group by event_enhancedFiltersEnabled ;
15+------------------------------+----------+
16| event_enhancedFiltersEnabled | count(*) |
17+------------------------------+----------+
18| 1 | 273 |
19+------------------------------+----------+
201 row in set (13.94 sec)
21
22-- hidenewpages=1 ("Page creations" removed)
23mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_hidenewpages=1 group by event_enhancedFiltersEnabled ;
24+------------------------------+----------+
25| event_enhancedFiltersEnabled | count(*) |
26+------------------------------+----------+
27| 1 | 2712 |
28+------------------------------+----------+
291 row in set (4.69 sec)
30
31-- hidelog=1 ("Logged actions" removed)
32mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_hidelog=1 group by event_enhancedFiltersEnabled ;
33+------------------------------+----------+
34| event_enhancedFiltersEnabled | count(*) |
35+------------------------------+----------+
36| 0 | 6 |
37| 1 | 12630 |
38+------------------------------+----------+
392 rows in set (4.19 sec)
40
41
42-- hidecategorization=0 ("Categorization" shown) even though preference is set to hide categorization
43mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 left join enwiki.user_properties on up_user=event_userId and up_property='hidecategorization' where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_hidecategorization=0 and (up_value !='0' or up_value is null) group by event_enhancedFiltersEnabled;
44+------------------------------+----------+
45| event_enhancedFiltersEnabled | count(*) |
46+------------------------------+----------+
47| 0 | 261 |
48| 1 | 1 |
49+------------------------------+----------+
502 rows in set (4.16 sec)
51
52-- hidecategorization=1 ("Categorization" not shown) even though preference is to show categorization
53mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 left join enwiki.user_properties on up_user=event_userId and up_property='hidecategorization' where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_hidecategorization=1 and up_value='0' group by event_enhancedFiltersEnabled;
54+------------------------------+----------+
55| event_enhancedFiltersEnabled | count(*) |
56+------------------------------+----------+
57| 1 | 246 |
58+------------------------------+----------+
591 row in set (4.09 sec)
60
61-- hideWikibase=0 ("Wikidata" shown) even though preference is to not show Wikidata
62mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 left join enwiki.user_properties on up_user=event_userId and up_property='rcshowwikidata' where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_hideWikibase=0 and (up_value !='1' or up_value is null) group by event_enhancedFiltersEnabled;
63+------------------------------+----------+
64| event_enhancedFiltersEnabled | count(*) |
65+------------------------------+----------+
66| 0 | 6 |
67+------------------------------+----------+
681 row in set (7.66 sec)
69
70-- hideWikibase=1 ("Wikidata" not shown) even though preference is to show Wikidata
71mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 left join enwiki.user_properties on up_user=event_userId and up_property='rcshowwikidata' where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_hideWikibase=1 and up_value=1 group by event_enhancedFiltersEnabled;+------------------------------+----------+
72| event_enhancedFiltersEnabled | count(*) |
73+------------------------------+----------+
74| 1 | 69 |
75+------------------------------+----------+
761 row in set (4.38 sec)
77
78
79
80-- Edits by others/self:
81
82-- hidemyself=1 ("Edits by others" only)
83mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_hidemyself=1 group by event_enhancedFiltersEnabled ;
84+------------------------------+----------+
85| event_enhancedFiltersEnabled | count(*) |
86+------------------------------+----------+
87| 0 | 61 |
88| 1 | 116 |
89+------------------------------+----------+
902 rows in set (5.22 sec)
91
92-- hidebyothers=1 ("Edits by myself" only)
93mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_hidebyothers=1 group by event_enhancedFiltersEnabled ;
94+------------------------------+----------+
95| event_enhancedFiltersEnabled | count(*) |
96+------------------------------+----------+
97| 1 | 2 |
98+------------------------------+----------+
991 row in set (3.72 sec)
100
101-- Major/minor:
102
103-- hideminor=1 ("Non-minor edits" only)
104mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_hideminor=1 group by event_enhancedFiltersEnabled ;
105+------------------------------+----------+
106| event_enhancedFiltersEnabled | count(*) |
107+------------------------------+----------+
108| 0 | 67 |
109| 1 | 229 |
110+------------------------------+----------+
1112 rows in set (3.80 sec)
112
113-- hidemajor=1 ("Minor edits" only)
114mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_hidemajor=1 group by event_enhancedFiltersEnabled ;
115+------------------------------+----------+
116| event_enhancedFiltersEnabled | count(*) |
117+------------------------------+----------+
118| 1 | 1 |
119+------------------------------+----------+
1201 row in set (4.01 sec)
121
122-- Bots/humans:
123
124-- hidehumans=1 ("Bots" only)
125mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_hidehumans=1 group by event_enhancedFiltersEnabled ;
126+------------------------------+----------+
127| event_enhancedFiltersEnabled | count(*) |
128+------------------------------+----------+
129| 1 | 1 |
130+------------------------------+----------+
1311 row in set (3.81 sec)
132
133-- hidebots=0 (both bots and humans; default state is "Humans only")
134mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_hidebots=0 group by event_enhancedFiltersEnabled ;
135+------------------------------+----------+
136| event_enhancedFiltersEnabled | count(*) |
137+------------------------------+----------+
138| 0 | 38 |
139| 1 | 2 |
140+------------------------------+----------+
1412 rows in set (7.50 sec)
142
143-- hidepatrolled/hideunpatrolled and hideReviewed are N/A on enwiki
144
145-- User registration:
146
147-- hideliu=1 ("Unregistered users" only)
mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_hideliu=1 group by event_enhancedFiltersEnabled ;
+------------------------------+----------+
| event_enhancedFiltersEnabled | count(*) |
+------------------------------+----------+
| 0 | 3675 |
| 1 | 1909 |
+------------------------------+----------+
2 rows in set (6.16 sec)

-- hideanons=1 ("Registered users" only)
mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_hideanons=1 group by event_enhancedFiltersEnabled ;
+------------------------------+----------+
| event_enhancedFiltersEnabled | count(*) |
+------------------------------+----------+
| 0 | 57 |
| 1 | 30 |
+------------------------------+----------+
2 rows in set (7.47 sec)
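-- Sketch, not run here: the five per-flag queries above (hidemajor, hidehumans, hidebots,
-- hideliu, hideanons) could be collapsed into a single pass with conditional sums. In MySQL
-- a comparison evaluates to 1 or 0, and SUM skips NULLs. The column aliases are made up for
-- readability. Unlike the separate queries, a beta/non-beta group with no matching events
-- shows up here with 0 (or NULL) instead of being absent from the output.
select event_enhancedFiltersEnabled,
       sum(event_hidemajor = 1)  as minor_only,
       sum(event_hidehumans = 1) as bots_only,
       sum(event_hidebots = 0)   as bots_and_humans,
       sum(event_hideliu = 1)    as unregistered_only,
       sum(event_hideanons = 1)  as registered_only,
       count(*)                  as all_events
from ChangesListFilters_16837986
where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959'
  and event_pagename='Recentchanges'
group by event_enhancedFiltersEnabled;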

-- User experience level: (note that registered/unregistered moved into this group on 7/28)

mysql:research@s3-analytics-slave [log]> select event_userExpLevel, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename ='Recentchanges' group by event_userExpLevel ;
+-----------------------------------+----------+
| event_userExpLevel | count(*) |
+-----------------------------------+----------+
| NULL | 115037 | -- No selection made
| experienced | 45 |
| learner | 12 |
| learner;experienced | 2 |
| newcomer | 158 |
| newcomer;learner | 3817 |
| registered | 6 |
| registered;experienced | 5 |
| registered;newcomer | 1 |
| unregistered | 289 |
| unregistered;newcomer | 6 |
| unregistered;newcomer;experienced | 1 |
| unregistered;newcomer;learner | 96 |
| unregistered;registered | 1 |
+-----------------------------------+----------+
14 rows in set (8.80 sec)


-- ORES filters:

-- hidenondamaging=1 (old-style ORES filter)
mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_hidenondamaging=1 group by event_enhancedFiltersEnabled ;
+------------------------------+----------+
| event_enhancedFiltersEnabled | count(*) |
+------------------------------+----------+
| 0 | 1291 |
| 1 | 1 | -- Probably a fluke
+------------------------------+----------+
2 rows in set (4.43 sec)

-- Damaging filters:
mysql:research@s3-analytics-slave [log]> select event_damaging, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename ='Recentchanges' group by event_damaging ;
+------------------------------------+----------+
| event_damaging | count(*) |
+------------------------------------+----------+
| NULL | 110095 | -- No selection made
| all | 20 |
| likelybad | 732 |
| likelybad;verylikelybad | 7138 |
| likelygood | 35 |
| likelygood;likelybad | 1 |
| likelygood;likelybad;verylikelybad | 2 |
| likelygood;maybebad | 7 |
| likelygood;maybebad;likelybad | 4 |
| likelygood;maybebad;verylikelybad | 2 |
| likelygood;verylikelybad | 2 |
| maybebad | 93 |
| maybebad;likelybad | 20 |
| maybebad;likelybad;verylikelybad | 189 |
| maybebad;verylikelybad | 21 |
| verylikelybad | 1114 |
| verylikelybad@liveupdate=1 | 1 | -- User error :)
+------------------------------------+----------+
17 rows in set (7.38 sec)
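-- Sketch, not run here: to express each Damaging selection as a percentage of all events
-- that had any Damaging selection (the "pie chart" view raised under Questions/Issues),
-- the count can be divided by a subquery total. The aliases c and pct are illustrative.
-- Each combination (e.g. likelybad;verylikelybad) is treated as one setting, as in the
-- table above.
select event_damaging,
       count(*) as c,
       round(100 * count(*) / (select count(*)
                               from ChangesListFilters_16837986
                               where wiki='enwiki'
                                 and timestamp between '20170717000000' and '20170730235959'
                                 and event_pagename='Recentchanges'
                                 and event_damaging is not null), 1) as pct
from ChangesListFilters_16837986
where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959'
  and event_pagename='Recentchanges' and event_damaging is not null
group by event_damaging
order by c desc;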

-- Good faith filters:
mysql:research@s3-analytics-slave [log]> select event_goodfaith, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename ='Recentchanges' group by event_goodfaith ;
+----------------------------------+----------+
| event_goodfaith | count(*) |
+----------------------------------+----------+
| NULL | 111642 | -- No selection made
| all | 5 |
| likelybad | 346 |
| likelybad;verylikelybad | 455 |
| likelygood | 33 |
| likelygood;maybebad | 5 |
| likelygood;maybebad;likelybad | 3 |
| maybebad | 119 |
| maybebad;likelybad | 11 |
| maybebad;likelybad;verylikelybad | 6702 |
| maybebad;verylikelybad | 23 |
| verylikelybad | 132 |
+----------------------------------+----------+
12 rows in set (7.32 sec)

-- Namespace filter:

-- Number of events where the namespace filter is set to anything at all:
mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_namespace is not null and event_namespace != '' group by event_enhancedFiltersEnabled ;
+------------------------------+----------+
| event_enhancedFiltersEnabled | count(*) |
+------------------------------+----------+
| 0 | 1910 |
| 1 | 4960 |
+------------------------------+----------+
2 rows in set (3.70 sec)

-- Number of events for each specific namespace:
mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, event_namespace, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename ='Recentchanges' group by event_enhancedFiltersEnabled, event_namespace;
+------------------------------+-----------------+----------+
| event_enhancedFiltersEnabled | event_namespace | count(*) |
+------------------------------+-----------------+----------+ -- In non-beta:
| 0 | NULL | 83238 | -- no selection
| 0 | 0 | 1014 | -- (Article)
| 0 | 1 | 514 | -- Talk
| 0 | 2 | 22 | -- User
| 0 | 3 | 208 | -- User talk
| 0 | 4 | 272 | -- Wikipedia
| 0 | 5 | 2 | -- Wikipedia talk
| 0 | 6 | 41 | -- File
| 0 | 7 | 1 | -- File talk
| 0 | 8 | 48 | -- MediaWiki
| 0 | 9 | 21 | -- MediaWiki talk
| 0 | 10 | 8 | -- Template
| 0 | 11 | 1 | -- Template talk
| 0 | 12 | 331 | -- Help
| 0 | 14 | 14 | -- Category
| 0 | 100 | 330 | -- Portal
| 0 | 118 | 24 | -- Draft
| 0 | 711 | 1 | -- TimedText talk
| 0 | 828 | 60 | -- Module
| 0 | 829 | 12 | -- Module talk
| 1 | NULL | 27860 | -- In beta: no selection
| 1 | 0 | 494 | -- (Article)
| 1 | 1 | 5 | -- Talk
| 1 | 2 | 4 | -- User
| 1 | 3 | 4866 | -- User talk [note: one user is responsible for 4861 of these within 6 hours, probably discovered live update and left it on]
| 1 | 4 | 7 | -- Wikipedia
| 1 | 5 | 12 | -- Wikipedia talk
| 1 | 6 | 9 | -- File
| 1 | 7 | 1 | -- File talk
| 1 | 8 | 23 | -- MediaWiki
| 1 | 10 | 12 | -- Template
| 1 | 12 | 9 | -- Help
| 1 | 100 | 9 | -- Portal
| 1 | 828 | 3 | -- Module
+------------------------------+-----------------+----------+
34 rows in set (4.18 sec)
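-- Sketch, not run here: a burst like the beta User talk one can be spotted by bucketing
-- those events by hour. This assumes timestamp is stored as a YYYYMMDDHHMMSS string (as the
-- range literals above suggest), so its first 10 characters identify the hour. The aliases
-- hour_bucket and c are illustrative.
select left(timestamp, 10) as hour_bucket, count(*) as c
from ChangesListFilters_16837986
where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959'
  and event_pagename='Recentchanges'
  and event_enhancedFiltersEnabled=1 and event_namespace=3
group by hour_bucket
order by c desc
limit 10;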

-- Note that no uses of multiple namespace filters appear in the data

-- "Associated namespace" checkbox (disappeared from the beta halfway through):
mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_associated is not null and event_associated != '' group by event_enhancedFiltersEnabled ;
+------------------------------+----------+
| event_enhancedFiltersEnabled | count(*) |
+------------------------------+----------+
| 0 | 660 |
| 1 | 19 |
+------------------------------+----------+
2 rows in set (6.36 sec)

-- Invert namespace selection:
mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_invert is not null and event_invert != '' group by event_enhancedFiltersEnabled ;
+------------------------------+----------+
| event_enhancedFiltersEnabled | count(*) |
+------------------------------+----------+
| 0 | 3 |
| 1 | 6 |
+------------------------------+----------+
2 rows in set (7.72 sec)

-- Tag filter used:

-- Number of events where tag filter is used at all:
mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, count(*) from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename='Recentchanges' and event_tagfilter is not null and event_tagfilter != '' group by event_enhancedFiltersEnabled ;
+------------------------------+----------+
| event_enhancedFiltersEnabled | count(*) |
+------------------------------+----------+
| 0 | 898 |
| 1 | 1427 |
+------------------------------+----------+
2 rows in set (4.12 sec)

-- Top 10 most popular tag filters overall:
mysql:research@s3-analytics-slave [log]> select event_tagfilter, count(*) as c from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename ='Recentchanges' and event_tagfilter is not null group by event_tagfilter order by c desc limit 10;
+--------------------------------------+------+
| event_tagfilter | c |
+--------------------------------------+------+
| Possible self promotion in userspace | 1084 |
| mobile edit | 152 |
| autobiography | 126 |
| bad external|large unwikified new article|new blank article|Rapid reverts|non-English content|nonsense characters|nowiki added|autobiography|possible libel or vandalism|coi-spam|Possible self promotion in userspace|userspace spam|Possible vandalism|references removed|removal of Category:Living People|possible link spam|repeating characters|reverting anti-vandal bot|Section blanking|shouting|removal of speedy deletion templates|very short new article|wikilinks removed | 110 | -- Probably live update too, 108 of these are in the same hour
| removal of speedy deletion templates | 84 |
| possible libel or vandalism | 82 |
| huggle | 41 |
| possible link spam | 40 |
| very short new article | 35 |
| blanking | 31 |
+--------------------------------------+------+
10 rows in set (3.75 sec)


-- Top 20 broken out by beta/non-beta and tag filter:
mysql:research@s3-analytics-slave [log]> select event_enhancedFiltersEnabled, event_tagfilter, count(*) as c from ChangesListFilters_16837986 where wiki='enwiki' and timestamp between '20170717000000' and '20170730235959' and event_pagename ='Recentchanges' and event_tagfilter is not null group by event_enhancedFiltersEnabled, event_tagfilter order by c desc limit 20;
+------------------------------+-------------------------------------------+-----+
| event_enhancedFiltersEnabled | event_tagfilter | c |
+------------------------------+-------------------------------------------+-----+
| 1 | Possible self promotion in userspace | 913 |
| 0 | Possible self promotion in userspace | 171 |
| 0 | autobiography | 122 |
| 1 | bad external|large unwikified new article|new blank article|Rapid reverts|non-English content|nonsense characters|nowiki added|autobiography|possible libel or vandalism|coi-spam|Possible self promotion in userspace|userspace spam|Possible vandalism|references removed|removal of Category:Living People|possible link spam|repeating characters|reverting anti-vandal bot|Section blanking|shouting|removal of speedy deletion templates|very short new article|wikilinks removed | 110 | -- Probably live update
| 0 | mobile edit | 80 |
| 0 | possible libel or vandalism | 76 |
| 1 | mobile edit | 72 |
| 0 | removal of speedy deletion templates | 62 |
| 0 | possible link spam | 39 |
| 0 | very short new article | 33 |
| 0 | possible vandalism | 29 |
| 1 | removal of articles for deletion template | 26 |
| 0 | coi-spam | 25 |
| 0 | de-userfying | 25 |
| 0 | large unwikified new article | 24 |
| 0 | huggle | 24 |
| 1 | blanking | 23 |
| 1 | removal of speedy deletion templates | 22 |
| 1 | huggle | 17 |
| 1 | articles for deletion template removed | 17 |
+------------------------------+-------------------------------------------+-----+
20 rows in set (3.76 sec)