Page MenuHomePhabricator

Echo flyout takes too long to open
Closed, ResolvedPublic

Description

When I click in the link "[1]" which appears when I have a new notification, it takes a long time until anything happens. The flyout (or a spinner, or anything) should appears immediately when the user clicks on this badge. Sometimes it takes more than a second.


Version: master
Severity: major

Event Timeline

bzimport raised the priority of this task from to High.Nov 22 2014, 2:12 AM
bzimport added a project: Notifications.
bzimport set Reference to bz54486.
bzimport added a subscriber: Unknown Object (MLST).

I don't know if any intentional delay has been placed on clicking the link in personal tools. I doubt it, but it's certainly possible.

Unless we absolutely must, we should avoid implementing a spinner. Ideally notifications should just be instant and should just work.

Do you have any means of measuring how long opening the notifications window takes currently?

I looked at my browser console, and it seems that every time you click the icon, it makes an API request to fetch your notifications, which is probably causing the delay.

Also, if you click the icon, have the flyout show up, then hide it again, and click the icon a second time, it makes another API request, causing the same delay.

I wonder if this is one of those situations where it's a choice between a small delay and fully up-to-date data or possibly out-of-date data but immediate (locally cached) data.

Created attachment 13356
Network tab on Google Chrome: the api query took 6.78 seconds

Screenshot taken from
https://pt.wikipedia.org/wiki/Special:BlankPage

Attached:

flyout,_6.78s.png (768×1 px, 125 KB)

Created attachment 13357
pt.wikipedia.org.har

The har file with the data from previous test.

Attached:

Prioritization and scheduling of this bug is tracked on Mingle card https://mingle.corp.wikimedia.org/projects/flow/cards/280

Change 93115 had a related patch set uploaded by Bartosz Dziewoński:
ext.echo.overlay: Show a spinner while waiting for data [WIP]

https://gerrit.wikimedia.org/r/93115

Change 93115 abandoned by Bartosz Dziewoński:
ext.echo.overlay: Show a spinner while waiting for data [WIP]

https://gerrit.wikimedia.org/r/93115

Actually, I don't really have the time nor the incentive to finish working on this, sorry. Anyone can feel free to pick it up.

A new record: 5 seconds until it showed me anything!? A little too much, I think...

bsitu wrote:

Under my profiling with xdebug, most of the time is spent in wikitext parsing, generating pretty timestamp etc. With the tab separation, we are displaying 25 items max per tab which is almost 6 times the number we used to display in the overlay. More records means more time to process, this is expected. However, I haven't never seen a record of 5 seconds, this is way higher than what I see in mediawiki ( with 25 flow notifications + 15 other notifications ). Does this happen at consistent matter?

If this is really a problem, we could find a way to cache the entire notification to skip the parsing.

Wouldn't be possible to show the box immediately with a static content such as "Loading..." and let that text be replaced by the actual content when it is ready? This way the interface would be a little more responsive, and the user would know that there is something being processed instead of being tempted to click in the link again (unnecessarily). Even if it usually takes one second until the whole content is available, when it takes more than that, it is a little confusing.

It just took me 15 seconds for the API request for my flyout to open on meta.

(This bug seems to conflate two issues:

  • The API requests take too long to complete
  • The UI doesn't provide any clue that an API request is in progress

Do about that what you will, just noting.)

bsitu wrote:

Besides the time in wikitext, timestamp parsing, it could be related to this: the flyout is now separated into two tabs: message + alert, data is retrieved via two lookup queries via notification types, if you have massive amount of alert notification and only one flow notification, the lookup has to scan through all notification to get, say like 10 flow notifications. I didn't really expect such cases and user would have tons of notifications. If I have known this, I would have vetoed the tab separation from the very beginning, :)

There are a couple of things we can do, one attempt is to clear old notifications to keep the volume low.

The other possible solution is to denormalize notification table by adding event_type along with indexes.

(In reply to bsitu from comment #15)

Besides the time in wikitext, timestamp parsing, it could be related to
this: the flyout is now separated into two tabs: message + alert, data is
retrieved via two lookup queries via notification types, if you have massive
amount of alert notification and only one flow notification, the lookup has
to scan through all notification to get, say like 10 flow notifications. I
didn't really expect such cases and user would have tons of notifications.
If I have known this, I would have vetoed the tab separation from the very
beginning, :)

On places like meta and enwp where Flow is only on one or two test pages, most active? users will have lots of normal notifications, but very few Flow ones.

As a first step, it seems the appropriate bits of code need to be instrumented and the production profile logs need to be reviewed to see where the time is really spent. My only worry is we might need to let echo go to bigger wiki's for enough profiling data to be collected?

bsitu wrote:

(In reply to Erik Bernhardson from comment #17)

As a first step, it seems the appropriate bits of code need to be
instrumented and the production profile logs need to be reviewed to see
where the time is really spent. My only worry is we might need to let echo
go to bigger wiki's for enough profiling data to be collected?

Yeah, adding the profiling code to the API is the first step. If meta takes 15 seconds to run, I think that should be enough for us to see where the time is spent

(In reply to Kunal Mehta (Legoktm) from comment #13)

It just took me 15 seconds for the API request for my flyout to open on meta.

I just saw it take 15+ seconds on enwikisource too.

hmm, flow isn't even on enwikisource so the problem isn't directly related to the alert/message split. Something else is going on.

Also I just merged https://gerrit.wikimedia.org/r/#/c/156737/ from bsitu which will add a few profiling metrics to the api call which will show up at graphite.wikimedia.org

Any news on this?
It is already annonying now, and T93109: T1. Convert unread LQT notifications into unread Flow notifications will probably make it worse.

Catrope renamed this task from Badge clicks takes too long to give some feedback to the user to Echo flyout takes too long to open.Aug 3 2015, 11:58 PM
Catrope set Security to None.

This seems to have become worse in the past day or so (maybe since async RL?).

Change 230947 had a related patch set uploaded (by Legoktm):
Measure how long it takes for the flyout to open

https://gerrit.wikimedia.org/r/230947

Change 230950 had a related patch set uploaded (by Ori.livneh):
Measure how long it takes for the flyout to open

https://gerrit.wikimedia.org/r/230950

Change 230951 had a related patch set uploaded (by Ori.livneh):
Measure how long it takes for the flyout to open

https://gerrit.wikimedia.org/r/230951

Change 230950 merged by jenkins-bot:
Measure how long it takes for the flyout to open

https://gerrit.wikimedia.org/r/230950

Change 230951 merged by jenkins-bot:
Measure how long it takes for the flyout to open

https://gerrit.wikimedia.org/r/230951

Change 230947 merged by jenkins-bot:
Measure how long it takes for the flyout to open

https://gerrit.wikimedia.org/r/230947

We now have data going into graphite on how long it is taking the flyout to open (under MediaWiki.echo.overlay), I'll take a look again once we have a days worth of data to see how bad this is...

Change 231200 had a related patch set uploaded (by Legoktm):
[wip] Split alerts and messages in Echo

https://gerrit.wikimedia.org/r/231200

Legoktm claimed this task.

With rECHO1ac72cc01ab5: Split alerts and messages in Echo there is now immediate feedback to the user when they click on the badge.