
Fix slow Watchlist load and interaction times
Closed, Resolved, Public

Description

A user reports very slow load times for Watchlist. The user has a large Watchlist of over 5500 pages, but his load time with filters of 27 seconds seems too long. Trizek, for example, has 4000 pages on his Watchlist, and it loads in just over 3 seconds. The user has provided lots of detail about his filters and configuration. (I've asked him to try the load in safemode and report back.)

One reason that I'd like to look into this is that I've noticed that times for the new filters MAY have gotten slower over the last 6 months. See the stats below, which compare times averaged over a recent week with times averaged over a week six months prior (you can click on the dates to check for yourself).

When         | RC Page avg ready | Watchlist avg ready
May 20-26    | 2.9 secs          | 3.88 secs
Nov 26-Dec 2 | 2.5 secs          | 3.4 secs
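
To put the table in relative terms, here is a minimal Python sketch that computes the percentage change between the two weeks. The figures are the averages from the table; the variable and function names are illustrative only.

```python
# Average "ready" times in seconds, copied from the table above.
# Nov 26-Dec 2 is the earlier week, May 20-26 the more recent one.
earlier = {"RC Page": 2.5, "Watchlist": 3.4}
recent = {"RC Page": 2.9, "Watchlist": 3.88}


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, as a percentage."""
    return (new - old) / old * 100


changes = {page: percent_change(earlier[page], recent[page]) for page in earlier}
for page, pct in changes.items():
    print(f"{page}: {pct:+.1f}%")
```

On these numbers both pages are roughly 14-16% slower than six months earlier, which is suggestive but, with only two weekly samples, not conclusive.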

Summary of issues

These are sourced from Plans to graduate the New Filters on Watchlist out of beta on Talk:Edit Review Improvements/New filters for edit review and Slow and unresponsive on Talk:Edit Review Improvements/New filters for edit review. Thanks also to Amorymeltzer who generously took the time to generate a JS profile and shared it with us.

The highlights:

  1. Some users, but not all, are unable to click on any links on the page while the RC Filters JS loads. This is probably a CSS issue but we need to investigate further.
  2. All the bug reports are dealing with larger watchlists and also larger limit properties when viewing their watchlists. All the users who submitted feedback are using limit=500 or limit=1000.
    • Bringing the limit down to 100 items seems to bring users back into the range of acceptable performance, but these users are not satisfied with limit=100; they need to be able to work with 500-1000 items at a time.
  3. The bugs have been reported for the watchlist, but at least one user sees similar performance problems on Special:RecentChanges.
  4. The initial request to get data from the server for the watchlist is slow: it takes at least 4-5 seconds per page request. That means each individual filter toggle will take at least 4-5 seconds.
    • Testing locally, on a watchlist with 25 items, I have a DOMContentLoaded time of 4.66 seconds, and page load at 30.71 seconds. Toggling filters takes between 0.6 and 1.5 seconds. Testing with another account locally with a watchlist of 2500 items, and setting limit to 1000, DOMContentLoaded is 42.77 seconds, and page load is 55.53 seconds. Toggling filters on a watchlist of this size takes about 12-13 seconds.
    • On the watchlist with 2500 items, if I set the limit to 25, the DOMContentLoaded time is 4.73 seconds and the page load is 10.02 seconds; toggling filters takes 2-3 seconds.
  5. The initial click on the filters box requires an additional call to load the JS/CSS needed to show the expanded filters interface. In addition to taking a few seconds, this seems to also block user interaction with the page.
    • We could consider loading this data asynchronously once the initial watchlist has loaded. It’s not a lot of data to add to the page by default.
  6. Once the filter UI is open, the JS profiler for toggling the filter checkboxes shows setVisibleItems() (in mw.rcfilters.dm.FilterGroup.js) and setupHighlightContainers() (in mw.rcfilters.ui.ChangesListWrapperWidget.js) as slower functions. We need to investigate the JS flamegraph further to see what other functions are slow and what, if anything, we can optimize.
  7. Our bug report sample size is pretty small. We have ~70k users with this feature enabled but only have a handful of bug reports. It's possible that this issue is more widespread and we haven't seen the feedback, or it's possible that it's limited to users with both larger watchlists and larger limit properties set in their query.

Send us your performance traces

If you'd like to help us troubleshoot performance issues on your Watchlist, please copy the text below and paste it into a new comment on this task, and fill in each section.

  • URL: e.g. https://en.wikipedia.org/wiki/Special:Watchlist?hidemyself=1&hidebots=1&hidecategorization=1&hideWikibase=1&limit=500&days=7&enhanced=1&damaging__likelybad_color=c4&damaging__verylikelybad_color=c5&goodfaith__maybebad_color=c3&urlversion=2&safemode=1
  • Browser and version: (e.g. Firefox 60)
  • OS: e.g. Ubuntu 18.04
  • CPU: e.g. i5 dual-core
  • RAM: e.g. 8 GB
  • Number of items in watchlist: e.g. 5,000
  • Observations: Please include any comments or observations you have.

Performance trace:

Please ensure that &safemode=1 is added to your URL before doing the performance trace.
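
If you prefer not to edit the URL by hand, the following Python sketch shows one way to make sure safemode=1 is present exactly once. The helper name with_safemode is hypothetical; it only uses the standard urllib.parse module.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_safemode(url: str) -> str:
    """Return url with safemode=1 set exactly once in the query string."""
    parts = urlsplit(url)
    # Drop any existing safemode value, keep every other parameter in order.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "safemode"
    ]
    params.append(("safemode", "1"))
    return urlunsplit(parts._replace(query=urlencode(params)))


print(with_safemode(
    "https://en.wikipedia.org/wiki/Special:Watchlist"
    "?hidebots=1&limit=500&days=7&urlversion=2"
))
```

Running the helper on a URL that already has safemode=1 leaves it unchanged, so it is safe to apply to any watchlist URL before recording.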

Chrome(/ium) performance trace instructions

  • Go to the watchlist
  • Right click anywhere on the page and choose "Inspect". A panel will open on the right-hand side or bottom of the screen.
  • In the inspector menu bar, click the three dots icon at right, select Settings, and make sure “Disable cache (while DevTools is open)” is selected.
  • Along the top line of this panel, you'll see Elements, Console, Sources, etc. Choose Performance (if the panel is on the side, you may need to click the >> button to get there)
  • In the bar below "Performance", you'll see a record button and a reload button. Click the reload button ("Start profiling and reload page").
  • A recording will start and the page will reload. When it's completely finished loading, click the red "record" button or the blue "stop" button to stop the recording.
  • In the bar where you found the record and reload buttons, click the down arrow button ("save profile")
  • This opens a dialog that will let you save your recording as a .json file.
  • Do not upload this file to Phabricator. Please email it to kharlan [at] wikimedia.org and put T197168 in the subject
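
For anyone who wants a quick look at their own Chrome recording before sending it, here is a rough Python sketch that totals time per event name. It assumes the saved file follows the Trace Event Format, where complete events carry "ph": "X", a "name", and a "dur" in microseconds; the function names, the sample event names, and the file name recording.json are illustrative, not part of any tool mentioned in this task.

```python
import json
from collections import defaultdict


def total_ms_by_event(trace) -> dict:
    """Sum the duration of complete ("X") events per event name, in ms."""
    # Assumption: the trace is either a bare list of events or an object
    # holding the list under "traceEvents".
    events = trace["traceEvents"] if isinstance(trace, dict) else trace
    totals = defaultdict(float)
    for event in events:
        if event.get("ph") == "X" and "dur" in event:
            totals[event.get("name", "?")] += event["dur"] / 1000  # us -> ms
    return dict(totals)


def slowest(trace, n: int = 10):
    """Return the n event names with the largest total duration."""
    totals = total_ms_by_event(trace)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


# Tiny synthetic trace so the sketch runs without a real recording.
sample = [
    {"ph": "X", "name": "FunctionCall", "dur": 12000},
    {"ph": "X", "name": "Layout", "dur": 3000},
    {"ph": "X", "name": "FunctionCall", "dur": 8000},
    {"ph": "M", "name": "thread_name"},
]
for name, ms in slowest(sample):
    print(f"{name}: {ms:.1f} ms")

# For a real file, something like:
#   with open("recording.json") as handle:
#       print(slowest(json.load(handle)))
```

Nested events are counted in both the parent and the child, so the totals overstate wall-clock time; they are only a first hint about where to look in the flamegraph.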

Firefox instructions

  • Go to the watchlist
  • Right click anywhere on the page and choose "Inspect element". A panel will open on the bottom of the screen.
  • Along the top line of this panel, you'll see Inspector, Console, Debugger, etc. Choose Performance.
  • Click "Start recording performance" in the middle, then reload the page. Once the page finishes loading, click "Stop recording performance".
  • On the left, you will see a bar with "Recording #1", and a small "save" link next to it. Click the "save" link.
  • This opens a dialog that will let you save your recording as a .json file.
  • Do not upload this file to Phabricator. Please email it to kharlan [at] wikimedia.org and put T197168 in the subject

Event Timeline

Restricted Application added a project: Collaboration-Team-Triage. Jun 13 2018, 6:33 PM
Restricted Application added a subscriber: Aklapper.
Trizek removed a subscriber: Trizek. Jun 13 2018, 6:35 PM
Iniquity added a subscriber: Iniquity.
kostajh claimed this task. Jun 20 2018, 5:49 PM
kostajh updated the task description. Jun 25 2018, 1:51 PM
kostajh updated the task description.
kostajh added a subscriber: Trizek.
Trizek-WMF removed a subscriber: Trizek. Jun 25 2018, 1:53 PM
kostajh updated the task description. Jun 25 2018, 5:01 PM
kostajh updated the task description. Jun 25 2018, 5:06 PM

Another observation: some components of this slowness are specific to highlighting being enabled, or enhanced mode (group by page) being enabled.

kostajh renamed this task from Investigate report of slow Watchlist load times to Fix slow Watchlist load and interaction times. Jun 25 2018, 7:54 PM

Change 441994 had a related patch set uploaded (by Mooeypoo; owner: Mooeypoo):
[mediawiki/core@master] [wip] Move building highlight divs to backend

https://gerrit.wikimedia.org/r/441994

@kostajh, have you defined a test protocol? I can share it with users who have a big watchlist when done.

@Trizek-WMF I haven't yet but will let you know when that's ready.

@jmatazzoni would you like to weigh in on which set of filters and actions we should use for profiling? The filters that have been reported to us so far are:

  • ?hidemyself=1&hidebots=1&hidecategorization=1&hideWikibase=1&limit=500&days=7&enhanced=1&damaging__verylikelybad_color=c5&urlversion=2
  • ?hidebots=1&hidecategorization=1&hideWikibase=1&limit=500&days=7&urlversion=2&safemode=1
  • ?hidebots=1&hidecategorization=1&hideWikibase=1&limit=1000&days=7&enhanced=1&urlversion=2
  • ?damaging=verylikelybad&hidebots=1&hidecategorization=1&hideWikibase=1&limit=1000&days=7&enhanced=1&urlversion=2
  • ?hidebots=1&hideWikibase=1&limit=1000&days=7&damaging__likelybad_color=c4&damaging__verylikelybad_color=c5&highlight=1&urlversion=2
  • ?hidebots=1&hideWikibase=1&limit=1000&days=7&highlight=1&urlversion=2

I could pick from the above when defining a test protocol but we might also want to have a more complex case, like:

  • ?damaging=likelybad%3Bverylikelybad&hidebots=1&hideWikibase=1&limit=1000&days=7&enhanced=1&damaging__likelybad_color=c4&damaging__verylikelybad_color=c5&goodfaith__likelygood_color=c2&userExpLevel__registered_color=c3&highlight=1&urlversion=2
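
To compare the reported filter sets at a glance, a small Python sketch like the one below can pull out the parameters this task associates with slowness (limit, enhanced mode, highlighting). The query strings are the six reported above; the describe helper and its output keys are hypothetical, and counting "_color" parameters is only a rough stand-in for how many highlights are configured.

```python
from urllib.parse import parse_qs

# The six filter sets reported so far, copied from the list above.
reported = [
    "?hidemyself=1&hidebots=1&hidecategorization=1&hideWikibase=1&limit=500&days=7&enhanced=1&damaging__verylikelybad_color=c5&urlversion=2",
    "?hidebots=1&hidecategorization=1&hideWikibase=1&limit=500&days=7&urlversion=2&safemode=1",
    "?hidebots=1&hidecategorization=1&hideWikibase=1&limit=1000&days=7&enhanced=1&urlversion=2",
    "?damaging=verylikelybad&hidebots=1&hidecategorization=1&hideWikibase=1&limit=1000&days=7&enhanced=1&urlversion=2",
    "?hidebots=1&hideWikibase=1&limit=1000&days=7&damaging__likelybad_color=c4&damaging__verylikelybad_color=c5&highlight=1&urlversion=2",
    "?hidebots=1&hideWikibase=1&limit=1000&days=7&highlight=1&urlversion=2",
]


def describe(query: str) -> dict:
    """Extract limit, enhanced, highlight and the number of *_color params."""
    params = parse_qs(query.lstrip("?"))
    return {
        "limit": int(params["limit"][0]) if "limit" in params else None,
        "enhanced": params.get("enhanced") == ["1"],
        "highlight": params.get("highlight") == ["1"],
        "colors": sum(1 for key in params if key.endswith("_color")),
    }


for query in reported:
    print(describe(query))
```

A summary like this makes it easier to pick test cases that cover each combination (enhanced with and without highlights, limit 500 versus 1000) rather than several near-duplicates.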

@kostajh, have you defined a test protocol? I can share it with users who have a big watchlist when done.

Now that we have the ability to create our own large test accounts easily (by copying from https://en.wikipedia.org/wiki/User:Ahecht/watchlist), is it a good use of time to work through users like this?

Thanks for asking kosta. I think I'd go with one ORES version and one not. For the ORES version, let's include a User Intent filter, since Elena has a theory that that in particular causes a slowdown.

For the test suite, I'm thinking 500 results should be the standard, since it will show up the differences better. But 250 is the default, so I'm not positive which is more relevant.

How do these look?

ORES version

https://en.wikipedia.org/wiki/Special:Watchlist?hidemyself=1&hidebots=1&hidecategorization=1&hideWikibase=1&limit=500&days=7&enhanced=1&damaging__likelybad_color=c4&damaging__verylikelybad_color=c5&goodfaith__maybebad_color=c3&urlversion=2

Non-ORES version

https://en.wikipedia.org/wiki/Special:Watchlist?hidemyself=1&hidebots=1&hidepreviousrevisions=1&hidecategorization=1&hideWikibase=1&limit=500&days=7&enhanced=1&urlversion=2

Change 442227 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/core@master] RCFilters: Move aggregation of highlight classes to the backend

https://gerrit.wikimedia.org/r/442227

Change 441994 merged by jenkins-bot:
[mediawiki/core@master] Move construction of highlight divs to backend

https://gerrit.wikimedia.org/r/441994

Izno added a subscriber: Izno. Edited Jun 27 2018, 5:51 PM

I have a watchlist of a mere 480 pages (many of which are high volume e.g. en:WP:AN, en:WP:VP) and I see the issues described in the ticket. Firefox seems to have a slightly larger issue than Chrome. (Both Quantum [57+ though I'm on 60.x] and pre-Quantum Firefox have this issue, though Firefox Quantum brought load times down from 10+ seconds into the 10s realm.) I have my changes set to 1k and days set to 3 (1k changes is always reached first).

Probably the biggest problem is that loading the filters seems to block interaction with the links on the page. That is one of the major blockers IMO of release to the wider population.

I happen to use enhanced WL as well (and I would guess so do most power users), if that's of interest. (Incidentally, the uncollapse arrows also wait for the filter JavaScript to load. I think it would probably be better for the arrows and associated JavaScript to finish loading first.)

@Izno thanks very much for your feedback. I'll be posting an update here soon with profiling steps we'd like people to take, so we can have a common baseline for measuring before/after as we work on the code.

Change 442227 merged by jenkins-bot:
[mediawiki/core@master] RCFilters: Move aggregation of highlight classes to the backend

https://gerrit.wikimedia.org/r/442227

Change 442746 had a related patch set uploaded (by Catrope; owner: Mooeypoo):
[mediawiki/core@wmf/1.32.0-wmf.10] Move construction of highlight divs to backend

https://gerrit.wikimedia.org/r/442746

Change 442747 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/core@wmf/1.32.0-wmf.10] RCFilters: Move aggregation of highlight classes to the backend

https://gerrit.wikimedia.org/r/442747

Mentioned in SAL (#wikimedia-operations) [2018-06-28T00:32:49Z] <catrope@deploy1001> Synchronized php-1.32.0-wmf.10/includes: Watchlist perf patches for SWAT, part 1 (T197168, T198140, T198142) (duration: 01m 13s)

Mentioned in SAL (#wikimedia-operations) [2018-06-28T00:33:57Z] <catrope@deploy1001> Synchronized php-1.32.0-wmf.10/resources: Watchlist perf patches for SWAT, part 2 (T197168, T198140, T198142) (duration: 00m 57s)

kostajh updated the task description. Jun 28 2018, 7:47 PM
kostajh updated the task description.
Vvjjkkii renamed this task from Fix slow Watchlist load and interaction times to f2aaaaaaaa. Jul 1 2018, 1:04 AM
Vvjjkkii removed kostajh as the assignee of this task.
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii edited subscribers, added: kostajh; removed: gerritbot, Aklapper.
JJMC89 renamed this task from f2aaaaaaaa to Fix slow Watchlist load and interaction times. Jul 1 2018, 4:08 AM
JJMC89 assigned this task to kostajh.
JJMC89 raised the priority of this task from High to Needs Triage.
JJMC89 updated the task description.

@Izno we've deployed a set of fixes to address performance issues. If you have time, we'd appreciate your feedback. Please look at "Send us your performance traces" in this task description for instructions on sending us your feedback. We are still investigating a few more optimizations but hope you'll see an improvement with the code we've deployed already.

Ahecht added a subscriber: Ahecht. Jul 2 2018, 3:46 PM

I can't do a performance trace on the machine where I was seeing performance issues due to the way the group policy is set up, but I can say that the performance is much improved.

  • URL: https://en.wikipedia.org/wiki/Special:Watchlist?hidemyself=1&hidebots=1&hidecategorization=1&hideWikibase=1&limit=1000&days=7&enhanced=1&damaging__verylikelybad_color=c5&urlversion=2&safemode=1
  • Browser and version: Internet Explorer 11.4
  • OS: Windows 10
  • CPU: i7 dual-core
  • RAM: 16 GB
  • Number of items in watchlist: 8,693
  • Observations: Page loads in ~9 seconds. Watchlist appears after 6 seconds, and the three dots take an additional 3 seconds to go away

Just to add, I likewise see significant improvements, thanks everyone! For the below URL and just over 6,200 pages, I'm seeing only maybe a second or two longer time than the old system. That's something like a 10-15 second improvement — Awesome!

URL: https://en.wikipedia.org/wiki/Special:Watchlist?hidebots=1&hidecategorization=1&hideWikibase=1&limit=1000&days=7&enhanced=1&urlversion=2&safemode=1

If you're still interested in more traces, etc., I'd be happy to provide.

kostajh removed kostajh as the assignee of this task. Jul 3 2018, 3:54 PM

Thank you for the feedback @Amorymeltzer and @Ahecht!

@Catrope I'm unassigning myself from this, but let me know if you want me to look into anything else here.

Catrope closed this task as Resolved. Jul 5 2018, 4:35 PM

I think we're done here, at least for the time being

Restricted Application added a project: Growth-Team. Jul 17 2018, 1:47 PM