RecentChanges very slow on Wikidata when RCFilters beta is enabled
Closed, Resolved · Public

Description

I have reports from users who see very slow loading times (30 seconds or more) for particular filter combinations.

  • On Wikidata
    • add Unpatrolled to Default filters
    • remove Unpatrolled from Default filters
    • add Newcomers to Default filters
    • remove Newcomers from Default filters

I'll probably add more later.

Restricted Application added a project: Collaboration-Team-Triage. · Jul 7 2017, 4:18 PM
Restricted Application added a subscriber: Aklapper.
Catrope renamed this task from Some filters or combinaisons of filters are really slow to load to Some filters or combinations of filters are really slow to load. · Jul 11 2017, 5:49 PM

It looks like RC is slow on Wikidata no matter what you do, but only if the RCFilters beta feature is enabled.

Catrope renamed this task from Some filters or combinations of filters are really slow to load to RecentChanges very slow on Wikidata when RCFilters beta is enabled. · Jul 11 2017, 5:56 PM
Etonkovidova added a comment. · Edited · Jul 11 2017, 7:09 PM

Checked Wikidata (wmf.7): every filter selection on Wikidata seems to take twice as long to fetch results as on enwiki (checked with the default of 50 results for 7 days).

This is because the ChangeTags::tagUsageStatistics() query takes 33 seconds. The caching for it does not appear to be working; otherwise we wouldn't experience this slowness on every single request. But even with working caching, having one request every 5 minutes be slow is also bad.
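To illustrate the point about caching, here is a minimal sketch of a time-based cache in Python. It is not MediaWiki's implementation; `TTLCache`, `slow_tag_usage_statistics`, the 300-second TTL and the sample counts are all hypothetical stand-ins. It only shows that a working cache with a 5-minute expiry would still hand the full cost of the slow query to one request per expiry window.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds (illustrative only)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[float, object]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], object]) -> object:
        now = self.clock()
        entry = self._entries.get(key)
        # Serve the stored value while it is younger than the TTL.
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]
        # Otherwise this caller pays for the expensive computation.
        value = compute()
        self._entries[key] = (now, value)
        return value


calls = []


def slow_tag_usage_statistics() -> Dict[str, int]:
    """Hypothetical stand-in for the 33-second query; records each invocation."""
    calls.append(1)
    return {"mobile": 120, "ve-edit": 45}


# A fake clock lets us simulate requests arriving at given seconds.
fake_now = [0.0]
cache = TTLCache(ttl_seconds=300, clock=lambda: fake_now[0])
for t in (0, 10, 299, 300, 301):
    fake_now[0] = float(t)
    cache.get_or_compute("tag-usage", slow_tag_usage_statistics)

print(len(calls))  # prints 2: the requests at t=0 and t=300 ran the slow query
```

Under these assumptions, the requests at 10, 299 and 301 seconds are served from the cache, while the ones at 0 and 300 seconds each wait for the slow query.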

@Krinkle presciently made me fix this last week in https://gerrit.wikimedia.org/r/#/c/363969/ , which will roll out to Wikidata in tomorrow's train.

I think this is a pretty compelling case to move https://gerrit.wikimedia.org/r/#/c/334337/ along.

This also means that we probably need to drop our call to tagUsageStatistics() in RCFilters for the time being, which would mean we wouldn't be able to sort tags by popularity. I was afraid we would also not be able to list all tags, but since we limit ourselves to active tags anyway that shouldn't be an issue.

In T169997#3428489, @Catrope wrote:

This also means ...we wouldn't be able to sort tags by popularity.

So, a few questions:

  • How, then, should we order tags? Alphabetically, by Tag Name, I suppose?
  • If we do that, is it possible/acceptable to put tags whose name begins with a number last instead of first? (I just think the order of the list will be more instantly clear if the first entries aren't a bunch of numbers. E.g., on en.wiki, the tag "2017", then "2017 Source Edit.")
  • I assume this won't screw up our ability to show only Active tags? (i.e., you don't need the count to know if it's Active, right?)

Change 364632 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/core@master] RCFilters: Don't call ChangeTags::tagUsageStatistics() for now

https://gerrit.wikimedia.org/r/364632

In T169997#3428489, @Catrope wrote:

This also means ...we wouldn't be able to sort tags by popularity.

So, a few questions:

  • How, then, should we order tags? Alphabetically, by Tag Name, I suppose?

In the quick and dirty patch I just submitted, I'm sorting tags alphabetically by internal name, but we could (and probably should) sort by display name instead.

  • If we do that, is it possible/acceptable to put tags whose name begins with a number last instead of first? (I just think the order of the list will be more instantly clear if the first entries aren't a bunch of numbers. E.g., on en.wiki, the tag "2017", then "2017 Source Edit.")

Maybe, but not easily. I also don't recommend it. There are only two such tags as far as I know, and we control them (VisualEditor puts them there), so we could ask Dan et al. to rename them if it bothers you.

  • I assume this won't screw up our ability to show only Active tags? (i.e., you don't need the count to know if it's Active, right?)

That's right, active tags only. That's actually the reason that this isn't causing us to lose any tags (some tags are only in the hitcount data and nowhere else, but none of those tags are active).
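A rough sketch of that reasoning in Python, using made-up data. The tag names, counts and the `defined_tags` / `hit_counts` structures are hypothetical, and it assumes that whether a tag is active can be read from the tag definitions alone, without the hit counts.

```python
from typing import Dict, Set

# Hypothetical data: internal tag name -> whether the tag is active.
defined_tags: Dict[str, bool] = {
    "mobile": True,
    "ve-edit": True,
    "old-filter": False,
}

# Hypothetical hit counts, including one tag that exists only in this data.
hit_counts: Dict[str, int] = {
    "mobile": 120,
    "ve-edit": 45,
    "old-filter": 3,
    "legacy-import": 7,
}


def active_tags(defined: Dict[str, bool]) -> Set[str]:
    """Tags we would list: decided from the definitions, no counts needed."""
    return {name for name, is_active in defined.items() if is_active}


def hitcount_only_tags(defined: Dict[str, bool], hits: Dict[str, int]) -> Set[str]:
    """Tags that appear in the hit count data and nowhere else."""
    return set(hits) - set(defined)


listed = active_tags(defined_tags)
lost = hitcount_only_tags(defined_tags, hit_counts)
print(sorted(listed))
print(sorted(lost))
```

In this toy data, the only tag that disappears without the hit counts is one that was never in the active list to begin with.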

  • How, then, should we order tags? Alphabetically, by Tag Name, I suppose?

In the quick and dirty patch I just submitted, I'm sorting tags alphabetically by internal name, but we could (and probably should) sort by display name instead.

I've amended my patch to sort alphabetically by display name.
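For anyone following along, the difference between the two orderings can be sketched in Python as below. This is not the code from the patch; the tag names and the `display_names` mapping are hypothetical, and case-insensitive comparison is an assumption of this sketch.

```python
from typing import Dict, List

# Hypothetical mapping of internal tag name -> display name.
display_names: Dict[str, str] = {
    "ve-edit": "Visual edit",
    "zz-blank": "Blanking",
    "mobile": "Mobile edit",
}


def sort_by_internal_name(names: Dict[str, str]) -> List[str]:
    """Order tags by their internal names."""
    return sorted(names)


def sort_by_display_name(names: Dict[str, str]) -> List[str]:
    """Order tags by display name, ignoring case; internal name breaks ties."""
    return sorted(names, key=lambda name: (names[name].casefold(), name))


by_internal = sort_by_internal_name(display_names)
by_display = sort_by_display_name(display_names)
print(by_internal)  # ['mobile', 've-edit', 'zz-blank']
print(by_display)   # ['zz-blank', 'mobile', 've-edit']
```

Sorting by display name gives the order users actually see in the list (Blanking, Mobile edit, Visual edit), which is why it reads as less arbitrary than the internal-name order.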

Change 364632 merged by jenkins-bot:
[mediawiki/core@master] RCFilters: Don't call ChangeTags::tagUsageStatistics() for now

https://gerrit.wikimedia.org/r/364632

Jdforrester-WMF added a subscriber: Jdforrester-WMF.

Roan says this is now complete, FWIW.

Checked in wmf.30: the cases for adding Newcomers to and removing Newcomers from Default filters are noticeably faster (~3s).

QA Recommendation: Resolve

jmatazzoni closed this task as Resolved. · Jul 20 2017, 9:24 PM
Restricted Application added a subscriber: PokestarFan. · Aug 4 2017, 8:52 AM