Page MenuHomePhabricator

Gather lists API lstminitem gives wrong results because gl_item_count drifts away from actual item count
Closed, ResolvedPublic

Description

Something is wrong with T97061. These are wrong:

3,4... give correct results though. lstminitems should be translated a straightforward WHERE condition on the counts, no idea what's going on there.

Event Timeline

Tgr raised the priority of this task from to Needs Triage.
Tgr updated the task description. (Show Details)
Tgr added a project: Gather.
Tgr subscribed.

Looks like the itemcount record is incorrect for some lists:

mysql> select gl_id, tracked, actual from (select gl_id, gl_item_count tracked, (select count(*) from gather_list_item where gli_gl_id = gl_id) actual from gather_list)t where actual != tracked;
+-------+---------+--------+
| gl_id | tracked | actual |
+-------+---------+--------+
|     9 |      64 |      1 |
|   150 |       1 |      0 |
|   179 |       1 |      0 |
|   266 |       5 |      4 |
|   340 |       2 |      1 |
|   368 |       2 |      1 |
|   369 |       5 |      3 |
|   371 |       2 |      1 |
|   373 |       6 |      4 |
|   374 |       2 |      1 |
|   375 |       1 |      0 |
+-------+---------+--------+

The count field of the API response is calculated on the fly so that's correct.

The one with the huge discrepancy is owned by Selenium User which means at least the bug is easy to reproduce.

I'll just re-run updateCounts for now and see if https://gerrit.wikimedia.org/r/#/c/217965/ fixes this. (I don't see why it would, but then I don't see why this would happen in the first place, and it certainly makes counting more robust.)

Tgr changed the task status from Open to Stalled.Jun 16 2015, 1:38 AM

Re-ran the script. Let's revisit this in a week or so and see if the counts are still drifting away.

@Tgr so should we promote to production the patch of the list of collections by count or should we wait for the confirmation?

IMO this bug is not important enough to hold back other features. We don't actually claim anywhere on the userinterface to only show lists with 2+ items, we just do it; so even if a few lists with less items get mixed in, it won't look like an error from the user's point of view. But @JKatzWMF is probably a better person to ask about this.

@Tgr @Jhernandez Lets just set the filter higher--maybe 6 (and lower number if necessary) Ensuring that no 0-2 lists going through 99% of the time is important. Also, the more I look at lists, the more I appreciate that ones with more than 4 tend to be much better.

@JKatzWMF the bug is with the API, which we can work around by upping the limit we request (so if we want a minimum of 4 items we request 3). If you want to change the limit we use please raise another bug.

@Tgr - https://en.m.wikipedia.org/wiki/Special:Gather/all/active is showing lots of min items as 0. Do we need to run a script there or is this related to the bug?

Tgr renamed this task from Gather lists API lstminitem off by one to Gather lists API lstminitem gives wrong results because gl_item_count drifts away from actual item count.Jun 19 2015, 6:28 PM

Fixed the counts on production manually. https://gerrit.wikimedia.org/r/#/c/217965/ has been deployed there and on beta it seems to work (gl_item_count used to be wildly wrong for list which was modified a lot by the selenium test, and now it's correct) so I am assuming this fixed.