Page MenuHomePhabricator

COUNT on revision is different than revision_userindex
Closed, InvalidPublic

Description

MariaDB [enwiki_p]> SELECT COUNT(rev_id) FROM revision WHERE rev_page = 5149102;
+---------------+
| COUNT(rev_id) |
+---------------+
|        308817 |
+---------------+
1 row in set (2.17 sec)

MariaDB [enwiki_p]> SELECT COUNT(rev_id) FROM revision_userindex WHERE rev_page = 5149102;
+---------------+
| COUNT(rev_id) |
+---------------+
|        308707 |
+---------------+
1 row in set (47.57 sec)

First off, I'm not sure why revision is so much faster than revision_userindex, but I'm OK with that :)

The issue obviously is the COUNT of rev_id's is significantly different. 5149102 by the way is en:Wikipedia:Administrators' noticeboard, which was created in 2004. It is a very busy page, but I ran these queries right after each other, and the COUNT is different by over a hundred.

I ran the same for en:Wikipedia:Administrators' noticeboard/Incidents, also very busy, and again for revision I got 1037300 and for revision_userindex it returned 1036958.

Meanwhile, pages with fewer revisions appear to have the same count for both views. Here's en:Thomas Jefferson, which is considerably older (created in 2001):

MariaDB [enwiki_p]> SELECT COUNT(rev_id) FROM revision_userindex WHERE rev_page = 29922;
+---------------+
| COUNT(rev_id) |
+---------------+
|         16741 |
+---------------+
1 row in set (1.83 sec)

MariaDB [enwiki_p]> SELECT COUNT(rev_id) FROM revision WHERE rev_page = 29922;
+---------------+
| COUNT(rev_id) |
+---------------+
|         16741 |
+---------------+
1 row in set (0.18 sec)

Maybe this is a known issue (I couldn't find an existing task), but either way, which table should I rely on?

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJul 14 2018, 12:47 AM

I just ran the same query on the analytics replica, and it seems revision is returning the correct value:

mysql:research@analytics-store.eqiad.wmnet [enwiki]> SELECT COUNT(rev_id) FROM revision WHERE rev_page = 5137507;
+---------------+
| COUNT(rev_id) |
+---------------+
|       1037300 |
+---------------+
1 row in set (0.78 sec)

So I guess revision is the one to rely on.

MusikAnimal closed this task as Invalid.Jul 14 2018, 1:15 AM

Found it T68786#706897. I should know this by now :)

Pretty interesting that only a few hundred revisions were suppressed at enwiki's most prolific dramaboards, though...

Sorry for the noise!