Consider this query on logging:
MariaDB [enwiki_p]> SELECT log_user_text AS `reviewer`, COUNT(DISTINCT(log_page)) AS `reviews` FROM logging WHERE log_timestamp BETWEEN 20170519000000 AND 20170619000000 AND log_namespace = 0 AND ( ( log_type = 'patrol' AND log_action = 'patrol' ) OR ( log_type = 'pagetriage-curation' AND log_action = 'reviewed' ) ) GROUP BY reviewer ORDER BY reviews DESC LIMIT 100; +-------------------------+---------+ | reviewer | reviews | +-------------------------+---------+ | MusikAnimal | 54 | | Boleyn | 53 | | BigHaz | 47 | | JTtheOG | 41 | | BigHaz | 41 | | Onel5969 | 40 | | DragonflySixtyseven | 39 | | BigHaz | 37 | | Boleyn | 36 | | BigHaz | 28 | | Fitindia | 28 | | MusikAnimal | 27 | | Boleyn | 27 | | Dschslava | 27 | | MusikAnimal | 27 | | DragonflySixtyseven | 27 | | Boleyn | 27 | | Winged Blades of Godric | 26 | | DragonflySixtyseven | 25 | | BigHaz | 24 | | Boleyn | 23 | | DragonflySixtyseven | 21 | | JTtheOG | 21 | | Onel5969 | 20 | | BigHaz | 20 | | Domdeparis | 19 | | JTtheOG | 19 | | BigHaz | 19 | | JTtheOG | 19 | | Onel5969 | 18 | | Boleyn | 18 | | BigHaz | 17 | | Onel5969 | 17 | | BigHaz | 17 | | Onel5969 | 16 | | Boleyn | 16 | | SL93 | 15 | | Boleyn | 15 | | Markdask | 15 | | Mduvekot | 15 | | JTtheOG | 15 | | Sulaimandaud | 14 | | JTtheOG | 14 | | BigHaz | 14 | | Onel5969 | 14 | | JTtheOG | 14 | | Boleyn | 14 | | Boleyn | 13 | | Fitindia | 13 | | JTtheOG | 13 | | Bensci54 | 13 | | BigHaz | 13 | | Fitindia | 13 | | Stephreef | 13 | | Catmando999 | 13 | | PRehse | 13 | | JTtheOG | 13 | | PRehse | 13 | | Boleyn | 13 | | JTtheOG | 13 | | PRehse | 13 | | Boleyn | 13 | | PRehse | 13 | | Bensci54 | 13 | | PRehse | 13 | | Boleyn | 13 | | MusikAnimal | 13 | | Sadads | 13 | | Dschslava | 13 | | BigHaz | 13 | | Dschslava | 13 | | TheCatalyst31 | 13 | | PRehse | 13 | | JTtheOG | 13 | | Sadads | 13 | | TheCatalyst31 | 13 | | PRehse | 13 | | PRehse | 13 | | PRehse | 13 | | PRehse | 13 | | PRehse | 13 | | BigHaz | 13 | | BigHaz | 13 | | Legacypac | 13 | | Fitindia | 13 | | Markdask | 13 | | PRehse | 13 | | Dschslava | 13 | | JTtheOG | 12 | | Fitindia | 12 | | Stephreef | 12 | | Blythwood | 12 | | Mduvekot | 12 | | BigHaz | 12 | | PRehse | 12 | | Nicnote | 12 | | JTtheOG | 12 | | PRehse | 12 | | Boleyn | 12 | | MrX | 12 | +-------------------------+---------+ 100 rows in set (4 min 11.29 sec)
Note this took only 4 minutes. However the results are painfully wrong. In fact if you remove the LIMIT you'll get some 10,000+ rows when there should be only ~350 rows.
The same query but with logging_userindex:
MariaDB [enwiki_p]> SELECT log_user_text AS `reviewer`, COUNT(DISTINCT(log_page)) AS `reviews` FROM logging_userindex WHERE log_timestamp BETWEEN 20170519000000 AND 20170619000000 AND log_namespace = 0 AND ( ( log_type = 'patrol' AND log_action = 'patrol' ) OR ( log_type = 'pagetriage-curation' AND log_action = 'reviewed' ) ) GROUP BY reviewer ORDER BY reviews DESC LIMIT 100; +-------------------------+---------+ | reviewer | reviews | +-------------------------+---------+ | BigHaz | 1197 | | JTtheOG | 1068 | | Onel5969 | 1062 | | Boleyn | 1045 | | PRehse | 924 | | Sulaimandaud | 523 | | Narutolovehinata5 | 461 | | Usernamekiran | 441 | | Kudpung | 373 | | Domdeparis | 370 | | Robert McClenon | 362 | | Timothyjosephwood | 352 | | MrX | 334 | | Mduvekot | 305 | | Chrissymad | 299 | | Sadads | 295 | | Winged Blades of Godric | 293 | | Atlantic306 | 277 | | Pichpich | 267 | | Fitindia | 261 | | Reddogsix | 251 | | DragonflySixtyseven | 250 | | TonyBallioni | 238 | | Meatsgains | 235 | | MainlyTwelve | 218 | | Reb1981 | 201 | | RileyBugz | 198 | | Stephreef | 185 | | MusikAnimal | 184 | | KGirlTrucker81 | 181 | | 331dot | 175 | | Diannaa | 172 | | Legacypac | 154 | | Ymblanter | 149 | | Nick Moyes | 147 | | C.Fred | 132 | | Graeme Bartlett | 129 | | Jake Brockman | 123 | | Melcous | 121 | | Scope creep | 120 | | Insertcleverphrasehere | 119 | | Gbawden | 116 | | TheCatalyst31 | 115 | | DGG | 115 | | Gtstricky | 113 | | JamesG5 | 108 | | RA0808 | 107 | | Markdask | 103 | | Slashme | 99 | | Simplexity22 | 96 | | Malunrenta | 96 | | WWGB | 90 | | Gilo1969 | 87 | | Diako1971 | 82 | | Dschslava | 81 | | Hydronium Hydroxide | 81 | | Innisfree987 | 80 | | SL93 | 79 | | Jupitus Smart | 79 | | Nicnote | 77 | | Fbdave | 77 | | I dream of horses | 76 | | Oluwa2Chainz | 72 | | Cahk | 70 | | Winner 42 | 70 | | Velella | 66 | | Xevus11 | 64 | | TheMagikCow | 61 | | JHCaufield | 58 | | Ajpolino | 57 | | Blackguard SF | 57 | | Cabayi | 56 | | Yintan | 54 | | WikiDan61 | 54 | | JMHamo | 54 | | Yashovardhan Dhanania | 53 | | L3X1 | 51 | | Killer Moff | 48 | | Ipigott | 48 | | LuK3 | 47 | | Kuyabribri | 45 | | Mattlore | 43 | | Devopam | 42 | | Sjö | 42 | | Numbermaniac | 42 | | Fixer88 | 39 | | TheSandDoctor | 39 | | Vanjagenije | 39 | | Bensci54 | 39 | | TheMagnificentist | 38 | | Oiyarbepsy | 37 | | Fabrictramp | 36 | | Kaldari | 35 | | Mean as custard | 35 | | Catmando999 | 35 | | DESiegel | 35 | | Coderzombie | 35 | | PamD | 34 | | Arthistorian1977 | 33 | | Stikkyy | 33 | +-------------------------+---------+ 100 rows in set (21 min 47.53 sec)
These results are correct and identical to production, but it took nearly 22 minutes to get them.
I tried this same query on logging_logindex but it took too long (45+ minutes), so I can't confirm is the same issue is there.
Why are the results from logging so different than logging_userindex? Is MySQL completely bugging out? I understand the indexed tables on Tool Labs may be missing data where the user or other column is NULL due to being suppressed, but that doesn't explain the discrepancy in data that isn't being suppressed. And 10,000+ rows for this query on logging versus ~350 on logging_userindex is pretty dramatic.