
red notification number not clearing
Closed, Resolved, Public

Description

Author: dlerner42

Description:
The red notification counter is not clearing after I change pages. If I click on the counter, it resets to zero, but as soon as I go to a different page in Wikipedia, the number goes back to what it was before. I've tried logging out and purging, but that has not solved the problem.


Version: unspecified
Severity: normal
OS: Windows Vista
Platform: PC

Details

Reference
bz48568

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 1:42 AM
bzimport set Reference to bz48568.
bzimport created this task. May 17 2013, 3:36 AM

dlerner42 wrote:

By clicking on "All notifications", I was able to reset the counter. But if this is the only way to reset the counter, that is going to be a problem...

Works for me. Can anyone post steps to reliably reproduce this bug? It would also be helpful to know what browser is being used in case it's some sort of JS or client-side caching bug.

TheDJ added a comment. Jul 11 2013, 4:28 PM

TCO sees this and he is using Internet Explorer 10 on Windows 7.

Does seem like a caching issue. We should set an explicit 'fast' (or always?) expiry on the "are there new messages" API request. That request should return a timestamp or something similar. The timestamp can then be used in the "get my messages" request that populates the flyout, to avoid that one being cached out of sync, which apparently happens for SandyGeorgia: https://en.wikipedia.org/wiki/Wikipedia_talk:Notifications#Another_example_five_days_later

Apparently IE (unlike Chrome/Firefox) caches all GET requests, even AJAX ones.

There's no API request involved. The data is served as part of the page (same as the old Orange Bar of Death functionality). You can turn off JavaScript entirely and it will still work the same. If you're logged in, the cache headers on the page should be "s-maxage=0, max-age=0, must-revalidate". I suppose it's possible IE is caching it anyway, or there is some proxy server between the user and WMF that is caching it, but this wouldn't have been changed by Echo. They would have had the same problem with the Orange Bar of Death. The only way I can reproduce this bug myself is by using the back button, but there's not a lot we can do about that. We could in theory fire an API request on every page load (and even trigger it on using the back button), but that would be a huge increase in API requests. I would only want to do that as a last resort.
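For anyone checking the headers on their own page responses, here is a minimal sketch of how the quoted Cache-Control value could be inspected. The function names are hypothetical and not part of MediaWiki or Echo; the check only covers the directives mentioned in this thread plus no-cache and no-store.

```typescript
// Parse a Cache-Control header value into a lowercase directive -> value map.
// Directives without a value (e.g. must-revalidate) map to an empty string.
function parseCacheControl(header: string): Record<string, string> {
  const directives: Record<string, string> = {};
  for (const part of header.split(",")) {
    const trimmed = part.trim();
    if (trimmed === "") {
      continue;
    }
    const eq = trimmed.indexOf("=");
    const name = (eq === -1 ? trimmed : trimmed.slice(0, eq)).trim().toLowerCase();
    const value = eq === -1 ? "" : trimmed.slice(eq + 1).trim();
    directives[name] = value;
  }
  return directives;
}

// Hypothetical helper: true when the header tells a browser cache that the
// response is stale immediately AND may not be reused without revalidation.
function browserMustRevalidate(header: string): boolean {
  const d = parseCacheControl(header);
  if ("no-store" in d || "no-cache" in d) {
    return true;
  }
  return d["max-age"] === "0" && "must-revalidate" in d;
}
```

Under these assumptions, the header value quoted above passes the check, while something like "max-age=3600" does not. A client or proxy that ignores the header would of course still misbehave.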

I tried reproducing in IE9, but was not able to. If anyone can reliably reproduce, please post the steps to do so.

(In reply to comment #6)

There's no API request involved.

That's not entirely true. The number itself is served with the page, yes, but the "flyout" is done with an API request to fetch the actual messages. And then a second API request is performed to mark the messages as read, based on what was returned by the first request. And I note that both of these are sent as GET requests and don't seem to set any additional caching-related headers on the request, and the responses don't seem to do anything special cache-wise either, meaning it is very likely that some browsers will cache them in at least some configurations. I'm not familiar with IE configuration, but I imagine that adjusting the caching settings to their most aggressive values would allow you to reproduce the bug.

If the browser or proxy responds to the first request from cache, that would certainly cause what SandyGeorgia reports. And the cached data would also cause the "mark read" query to not include the ids that are actually unread, so they wouldn't get marked as read and the number would remain for later pageviews as is reported here. Off the top of my head, one way around that would be to include some data attribute on the notification badge with the id or timestamp of the most recent notification, and use that value as a cache buster on the API request.
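The cache-buster idea could be sketched roughly as follows. This is only an illustration: the helper name and the `cachebust` parameter name are hypothetical, and the token is assumed to be the id or timestamp of the most recent notification read from the badge. The sketch also assumes the URL has no fragment.

```typescript
// Hypothetical helper: append a cache-busting parameter to a request URL.
// A URL that differs in its query string is a different cache key, so a
// cached response for an older token cannot be reused for a newer one.
function withCacheBuster(url: string, token: string): string {
  const separator = url.indexOf("?") === -1 ? "?" : "&";
  return url + separator + "cachebust=" + encodeURIComponent(token);
}
```

Because the token only changes when a new notification arrives, repeated requests with nothing new can still be answered from cache, which keeps the extra load low compared with a random or per-request value.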

Further, once the first request is served from cache, the second will probably be served from cache too since it will have the same value for 'notmarkread' as was sent the first time. While fixing the caching issue in the first would prevent this anyway, I do note that this second query should be a POST rather than a GET since it is intended to change state on the server.
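As a rough sketch of the POST suggestion, the "mark read" call could be described like this. Only the 'notmarkread' parameter name comes from this thread; the interface, the function name, and the omission of all other API parameters are assumptions made for illustration.

```typescript
// Hypothetical description of a request, independent of any HTTP library.
interface RequestSpec {
  method: "GET" | "POST";
  url: string;
  body: string | null;
}

// Build the state-changing "mark read" request as a POST with the ids in the
// form-encoded body instead of the query string. Other parameters the real
// API needs are intentionally left out of this sketch.
function buildMarkReadRequest(apiUrl: string, ids: number[]): RequestSpec {
  return {
    method: "POST",
    url: apiUrl,
    body: "notmarkread=" + encodeURIComponent(ids.join("|")),
  };
}
```

Browsers generally do not answer a POST from their cache, so sending the ids this way should sidestep the stale "mark read" response in addition to matching the usual convention that state changes are not made with GET.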

bsitu wrote:

Yes, this does seem like the famous IE Ajax cache problem.

Change 73531 had a related patch set uploaded by Bsitu:
(bug 48568) Bust IE browser ajax cache + some API clenaup

https://gerrit.wikimedia.org/r/73531

Change 73531 merged by jenkins-bot:
(bug 48568) Bust IE browser ajax cache + some API clenaup

https://gerrit.wikimedia.org/r/73531

Update: as of today, SandyGeorgia is experiencing the problem again. Sandy is using Internet Explorer 10.
Sandy describes the problem as: "Old notifications showing, red bar won't go away, can't get links to new posts."
http://en.wikipedia.org/wiki/Wikipedia_talk:Notifications#Almost_two_weeks_later.2C_again

bsitu wrote:

The fix is already in 1.22wmf11 which will be deployed to enwiki on Thursday

(In reply to comment #13)

The fix is already in 1.22wmf11 which will be deployed to enwiki on Thursday

Actually, it looks like it just missed making it into 1.22wmf11. But it'll be in 1.22wmf12.

bsitu wrote:

(In reply to comment #14)

(In reply to comment #13)

The fix is already in 1.22wmf11 which will be deployed to enwiki on Thursday

Actually, it looks like it just missed making it into 1.22wmf11. But it'll be
in 1.22wmf12.

Thanks for the note! In that case, we will try to cherry-pick and deploy this change to enwiki in the lightning window on Thursday.

Change 76038 had a related patch set uploaded by Bsitu:
(bug 48568) Bust IE browser ajax cache + some API clenaup

https://gerrit.wikimedia.org/r/76038

Change 76038 merged by jenkins-bot:
(bug 48568) Bust IE browser ajax cache + some API clenaup

https://gerrit.wikimedia.org/r/76038

bsitu wrote:

I will mark this bug as resolved for now. Feel free to re-open it if the bug pops up again and I will investigate further.