Currently, to count edits within a category we're doing something like:
SELECT COUNT(DISTINCT revs.rev_page) AS edited,
       IFNULL(SUM(CASE WHEN revs.rev_parent_id = 0 THEN 1 ELSE 0 END), 0) AS created
FROM `enwiki_p`.`revision_userindex` revs
INNER JOIN `enwiki_p`.`page` ON page_id = rev_page
INNER JOIN `enwiki_p`.`categorylinks` ON cl_from = rev_page
WHERE revs.rev_user_text IN ('MusikAnimal', 'Samwilson')
  AND cl_to IN ('Living_people', 'Art_in_New_York_City')
  AND revs.rev_timestamp BETWEEN 20180101000000 AND 20180131235959
  AND page_namespace = 0
This is pretty fast (~0.20 sec in this case) but not so much when you don't filter by user:
SELECT COUNT(DISTINCT revs.rev_page) AS edited,
       IFNULL(SUM(CASE WHEN revs.rev_parent_id = 0 THEN 1 ELSE 0 END), 0) AS created,
       COUNT(DISTINCT revs.rev_user_text) AS editors
FROM `enwiki_p`.`revision_userindex` revs
INNER JOIN `enwiki_p`.`page` ON page_id = rev_page
INNER JOIN `enwiki_p`.`categorylinks` ON cl_from = rev_page
WHERE cl_to IN ('Living_people', 'Art_in_New_York_City')
  AND revs.rev_timestamp BETWEEN 20180101000000 AND 20180131235959
  AND page_namespace = 0
This times out entirely (our max is 900 secs), mainly because "Living people" is so big. If you run it for only "Art in New York City" it's fairly fast (~1.20 secs). I imagine most organizers won't use a category this large, but it's good to prepare for the possibility. Note that here we're also counting the number of unique editors. To make matters worse, we'd also like, if possible, to exclude edits made by bots. That would mean another JOIN on user_groups, which I imagine can only make the query slower.
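For concreteness, the bot exclusion we have in mind would look something like the clause below, using NOT EXISTS against user_groups rather than an extra JOIN (untested for performance; this assumes the pre-actor schema where the revision views still carry rev_user):

```sql
-- Appended to the WHERE clause of either query above.
-- Assumes rev_user is the numeric user ID and bots carry the 'bot' group.
AND NOT EXISTS (
    SELECT 1
    FROM `enwiki_p`.`user_groups`
    WHERE ug_user = revs.rev_user
      AND ug_group = 'bot'
)
```

This skips unregistered "bots" and bots that were later de-flagged, but it avoids fanning out the row count the way a LEFT JOIN would.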
Is there a better way to construct this query?
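One restructuring we've wondered about (sketched below, not benchmarked) is to resolve the category membership into a derived table first, so the revision scan only touches pages already known to be in scope:

```sql
-- Sketch: pre-filter pages by category, then join revisions.
-- Whether the optimizer actually orders it this way is the open question.
SELECT COUNT(DISTINCT revs.rev_page) AS edited,
       IFNULL(SUM(CASE WHEN revs.rev_parent_id = 0 THEN 1 ELSE 0 END), 0) AS created,
       COUNT(DISTINCT revs.rev_user_text) AS editors
FROM (
    SELECT page_id
    FROM `enwiki_p`.`page`
    INNER JOIN `enwiki_p`.`categorylinks` ON cl_from = page_id
    WHERE cl_to IN ('Living_people', 'Art_in_New_York_City')
      AND page_namespace = 0
) pages
INNER JOIN `enwiki_p`.`revision_userindex` revs ON revs.rev_page = pages.page_id
WHERE revs.rev_timestamp BETWEEN 20180101000000 AND 20180131235959
```

With "Living_people" the derived table is still enormous, so this may only help for smaller categories.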
Finally, for our so-called "revision browser", we want the same queries above but SELECT each individual revision with basic data like username, page title, and diff size.
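As a rough sketch of what the revision browser query might look like (column choices are our assumption; "diff size" here is approximated as rev_len minus the parent revision's rev_len via a self-join):

```sql
-- Sketch: list individual revisions with username, page title, and size delta.
SELECT revs.rev_id,
       revs.rev_timestamp,
       revs.rev_user_text AS username,
       page_title,
       revs.rev_len - IFNULL(parent.rev_len, 0) AS diff_size
FROM `enwiki_p`.`revision_userindex` revs
INNER JOIN `enwiki_p`.`page` ON page_id = revs.rev_page
INNER JOIN `enwiki_p`.`categorylinks` ON cl_from = revs.rev_page
-- LEFT JOIN so page creations (rev_parent_id = 0) still come through.
LEFT JOIN `enwiki_p`.`revision` parent ON parent.rev_id = revs.rev_parent_id
WHERE revs.rev_user_text IN ('MusikAnimal', 'Samwilson')
  AND cl_to IN ('Living_people', 'Art_in_New_York_City')
  AND revs.rev_timestamp BETWEEN 20180101000000 AND 20180131235959
  AND page_namespace = 0
ORDER BY revs.rev_timestamp DESC
```

The extra self-join on revision presumably makes the performance problem above even more pressing for the unfiltered case.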