We count all the unique users that have done any "write" action. For example:
- posted a new topic
- replied to a topic or a post
- edited a topic or a post
- edited a board description
- edited a topic summary
- done any moderation action
We count all the unique users that have done any "write" action. For example:
Change 226524 had a related patch set uploaded (by Sbisson):
Include all contributing users in 'unique-users' query
Change 226524 merged by Milimetric:
Include all contributing users in 'unique-users' query
Two questions:
MariaDB [testwiki]> select distinct rev_change_type from flow_revision; +----------------------+ | rev_change_type | +----------------------+ | reply | | new-post | | create-header | | edit-header | | edit-post | | edit-title | | lock-topic | | create-topic-summary | | restore-topic | | edit-topic-summary | | hide-topic | +----------------------+ 11 rows in set (0.00 sec)
MariaDB [enwiki]> explain SELECT weekstart, SUM(user) as unique_users FROM ( SELECT 1 as user, DATE(DATE_SUB( FROM_UNIXTIME((conv(substring(hex(rev_id),1,12),16,10)>>2)/1000), interval DAYOFWEEK( FROM_UNIXTIME((conv(substring(hex(rev_id),1,12),16,10)>>2)/1000)) day)) as weekstart FROM flow_revision WHERE rev_user_wiki NOT IN ( 'testwiki', 'test2wiki' ) GROUP BY rev_user_wiki, rev_user_id, rev_user_ip ) x WHERE weekstart < DATE_SUB(CURDATE(), interval DAYOFWEEK(NOW()) day) GROUP BY weekstart\G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: <derived2> type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 206376 Extra: Using where; Using temporary; Using filesort *************************** 2. row *************************** id: 2 select_type: DERIVED table: flow_revision type: index possible_keys: NULL key: flow_revision_user key_len: 116 ref: NULL rows: 206376 Extra: Using where; Using index; Using temporary; Using filesort
All below actions are captured? Or the actions that taken into count are different?
No filtering on rev_change_type so all those actions are included plus a few others like delete and suppress. Essentially, everybody whose actions result in a db write is included.
Since there is a derived table, no index is used - is it ok?
I think this is ok. Maybe @EBernhardson or @Milimetric have a different opinion?
No index should be alright, many of the other limn queries (both flow and other projects) tend to be unoptimized as the existing indexes are the ones used in prod, not anything tailored to the actual queries run here.
Checked for writing-to-db events - they are counted correctly.
No results for the week of 2015-04-11?
MariaDB [testwiki]> SELECT weekstart, -> SUM(user) as unique_users -> FROM ( -> SELECT 1 as user, -> DATE(DATE_SUB(FROM_UNIXTIME((conv(substring(hex(rev_id),1,12),16,10)>>2)/1000), interval DAYOFWEEK( FROM_UNIXTIME((conv(substring(hex(rev_id),1,12),16,10)>>2)/1000)) day)) as weekstart -> FROM flow_revision -> WHERE rev_user_wiki IN ( 'testwiki' ) -> GROUP BY rev_user_wiki, rev_user_id, rev_user_ip -> ) x -> WHERE weekstart < DATE_SUB(CURDATE(), interval DAYOFWEEK(NOW()) day) -> GROUP BY weekstart; +------------+--------------+ | weekstart | unique_users | +------------+--------------+ | 2014-05-17 | 1 | | 2015-04-25 | 2 | | 2015-05-02 | 4 | | 2015-05-30 | 4 | | 2015-06-13 | 1 | | 2015-06-20 | 1 | | 2015-06-27 | 1 | | 2015-07-04 | 2 | +------------+--------------+ 8 rows in set (0.01 sec)
Looking at this in the office -- we found that this version was still undercounting unique users. It was showing 2 users on Catalan from July 18-24, when it was actually at least 18 (verified by going to Catalan Flow boards and counting).
Roan figured out some issues -- putting this back to In Development so Roan can continue.
Ran the fixed query on testwiki
MariaDB [testwiki]> SELECT DATE_SUB(d, interval DAYOFWEEK(d)-1 day) AS Week, COUNT(DISTINCT rev_user_id, COALESCE(rev_user_ip, ''), rev_user_wiki) AS "Unique users" FROM (SELECT DATE(FROM_UNIXTIME((conv(substring(hex(rev_id),1,12),16,10)>>2)/1000)) AS d, rev_user_id, rev_user_ip, rev_user_wiki FROM flow_revision WHERE rev_user_wiki IN ('testwiki') ) AS x GROUP BY Week HAVING Week < DATE_SUB(CURDATE(), interval DAYOFWEEK(NOW())-1 day) ORDER BY Week; +------------+--------------+ | Week | Unique users | +------------+--------------+ | 2014-05-18 | 1 | | 2015-04-26 | 2 | | 2015-05-03 | 6 | | 2015-05-10 | 3 | | 2015-05-31 | 5 | | 2015-06-14 | 4 | | 2015-06-21 | 6 | | 2015-06-28 | 1 | | 2015-07-05 | 6 | | 2015-07-12 | 1 | | 2015-07-19 | 4 | +------------+--------------+ 11 rows in set (0.01 sec)
which coincides to the previous test results(shifted by one day):
MariaDB [testwiki]> select DATE(DATE_SUB(FROM_UNIXTIME((conv(substring(hex(rev_id),1,12),16,10)>>2)/1000), interval DAYOFWEEK( FROM_UNIXTIME((conv(substring(hex(rev_id),1,12),16,10)>>2)/1000)) day)) as Day, count(distinct(rev_user_id)) as Users from flow_revision group by Day; +------------+-------+ | Day | Users | +------------+-------+ | 2014-05-17 | 1 | | 2015-04-25 | 2 | | 2015-05-02 | 5 | | 2015-05-09 | 3 | | 2015-05-30 | 5 | | 2015-06-13 | 4 | | 2015-06-20 | 6 | | 2015-06-27 | 1 | | 2015-07-04 | 6 | | 2015-07-11 | 1 | | 2015-07-18 | 4 | | 2015-07-25 | 2 | +------------+-------+ 12 rows in set (0.02 sec)