Page MenuHomePhabricator

[betalabs] Stuck cross-wiki notification
Closed, ResolvedPublic

Description

The issue looks similar to T217522 and T218792.

For user ET13 - Special:Notifications page shows impossible state with zero unread cross-notifications:

The cross-wiki notification (from eswikibooks.beta.wmflabs event_type: login-success) cannot be discarded.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptApr 11 2019, 8:31 PM

Well, this is interesting:

As you say, this is a login-success notification. Web delivery is disabled for these notifications (not in LoginNotify itself, but in WMF config: rOMWC4f3bfc93b785: Config changes for LoginNotify). So it makes sense that a login-success notification that's in the database will not be rendered, and will cause problems.

What doesn't make sense is that this config has been this way since 2017, and your notification is from late 2018. You'd think that if web delivery was disabled, a notification wouldn't be inserted. Since login-success is the only type that is not web-deliverable, maybe Echo's handling of non-web-deliverable notification types is buggy.

BTW @Niharika is it intentional that login-success notifications are not deliverable through the web (only by email), and users cannot choose for them to be web-delivered even if they want to? If it is intentional, I'd recommend this config be moved from CommonSettings.php into the extension.

BTW @Niharika is it intentional that login-success notifications are not deliverable through the web (only by email), and users cannot choose for them to be web-delivered even if they want to? If it is intentional, I'd recommend this config be moved from CommonSettings.php into the extension.

Yes, that's intentional. We don't want an attacker to learn that we detected an unusual login and cause them to change the account password or such. Good point about moving that config to the extension. Filed as T220780: Move config for disabling web login-success notifications to LoginNotify extension.

This bug reminds me of T174220: [Spike 8 hours] Successful login notification gets sent out even when disabled in preferences which we haven't heard about in a while. It was notoriously hard to reproduce. We never really figured out what caused it, I think. It might be true that Echo has problems handling non-web-deliverable notifications.

Restricted Application added a project: Community-Tech. · View Herald TranscriptApr 12 2019, 12:12 AM
Catrope added a comment.EditedApr 12 2019, 11:12 PM

I haven't been able to reproduce this problem, or otherwise figure out how it was possible for a web notification for login-success to be inserted into the database, counted in the notification count, and then become disabled without the notification count being recomputed. In particular I found the following things:

  • Web notifications for login-success are disallowed ($wgNotifyTypeAvailabilityByCategory['login-success']['web'] is false), and have always been as far as I can tell
  • If the user's echo-subscription-web-login-success preference is enabled, a web notification will be inserted despite this category+notifyType combination being disallowed (I wrote a patch that fixes this). This sounds promising, but...
  • Subscription preferences for disallowed category-notifyType pairs (like web-login-success) are forced to false, and have always been as far as I can tell, meaning the user can't enable them
  • If the user somehow did enable this preference, that wouldn't cause this bug, because then the login-success notification would just be visible. So the preference must have become disabled again after the notification was inserted.
  • But any change in a user's settings triggers a notification count recomputation, which would have made the bug go away.
  • Maybe the default value for this preference changed? That could change the effective value of the preference without running the saveSettings hook.
  • But as far as I can tell, the default value of this preference has always been false.

So what we're seeing appears to be impossible: the echo-subscription-web-login-success preference must have been enabled at the time that the notification was stored and at the last time the count was computed; it must have been disabled after that to create the phantom effect; but it also cannot have changed, because then the count would have been recomputed. And it also appears to be impossible for this preference to ever have been enabled anyway.

Despite this solid-sounding argument that inserting login-success notifications is impossible seven different ways to Sunday, we somehow have over 100,000 of them in the database:

mysql:research@dbstore1005.eqiad.wmnet [enwiki]> select count(*), min(notification_timestamp), max(notification_timestamp) from echo_event join echo_notification on notification_event=event_id where event_type='login-success';
+----------+-----------------------------+-----------------------------+
| count(*) | min(notification_timestamp) | max(notification_timestamp) |
+----------+-----------------------------+-----------------------------+
|   102048 | 20170817212705              | 20181004192550              |
+----------+-----------------------------+-----------------------------+
1 row in set (0.76 sec)

There are 102k of these supposedly-impossible notifications. They were being inserted for a little over a year, from August 2017 until October 2018. The phantom notification reported in this task on beta eswikibooks was from September 2018, which also falls within this time window.

It appears that something happened in 1.32.0-wmf.24 that caused us to stop inserting these notifications:

2018-10-03 21:16 dduvall@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.32.0-wmf.24
2018-10-04 19:26 dduvall@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.32.0-wmf.24

I looked at a few large wikis, and on all of them the timestamp of the last login-success notification is right before wmf.24 was enabled on that wiki: 20181004192550 on enwiki, 20181004192150 on dewiki, 20181003210113 on commonswiki and 20181003211300 on wikidatawiki.

The earliest one on enwiki is from 8 minutes after LoginNotify was enabled;

2017-08-17 21:19 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Enable LoginNotify finally! Yippeeeeee https://gerrit.wikimedia.org/r/#/c/372450/ (duration: 00m 45s)

So it looks like we were always inserting login-success notifications into the database, until something changed in 1.32.0-wmf.24 that caused us to stop.

Found it! The change in 1.32.0-wmf.24 that changed the behavior was rECHOb82b54f0a6ef: Don't override checkmatrix defaults set elsewhere, which was a fix for T174220: [Spike 8 hours] Successful login notification gets sent out even when disabled in preferences. Echo was overriding the default values for all echo-subscription-web-* preferences and forcing them to true even if they were configured to be false. So the default value for this preference did change, just not in a way that was apparent from looking at the git history.

This means the users affected by this bug would be users who:

  • Were affected by T174220, i.e. received a login-success web notification even though that was supposed to be disabled. This probably means users who did not change their preferences on the wiki in question before logging in a second time from a different IP.
  • Did not mark this login-success notification as read (possibly because they received it shortly before T174220 was fixed in early October)
  • After early October, did not receive any additional notifications, mark any other notifications as read, or change their preferences on the wiki in question

This suggests that the number of affected users is probably fairly low. Any user experiencing this bug should go each wiki that they're seeing a phantom notification on and either mark a notification as read or unread, or go to the preferences page and save a change to their preferences (doesn't matter which preference).

Change 503541 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/extensions/Echo@master] Add maintenance script to recompute notification counts

https://gerrit.wikimedia.org/r/503541

Change 503541 merged by jenkins-bot:
[mediawiki/extensions/Echo@master] Add maintenance script to recompute notification counts

https://gerrit.wikimedia.org/r/503541

Mentioned in SAL (#wikimedia-operations) [2019-05-06T19:47:52Z] <RoanKattouw> Running recomputeNotifCounts.php --notif-types=login-success on all Echo wikis for T220762

I ran the script above to recompute the notification counts for every user who could potentially have been affected by this (users who had at least one unread login-success notification). It took a while to run, but it finished some time yesterday. This should completely resolve the issue: new cases shouldn't happen (unless someone changes a notification type from being available on web to unavailable on web again), and existing cases should now be fixed. Nobody should be seeing wrong counts or phantom notifications anymore.

Etonkovidova closed this task as Resolved.Fri, May 10, 8:54 PM
Etonkovidova claimed this task.