Page MenuHomePhabricator

Design the behavior of a notifications Provider in core
Closed, ResolvedPublic

Description

As we design and implement a base notification system in core, we are working with a basic underlying system (email only) for Core itself that can be overridden and expanded with external (advanced) notification behavior through Providers.

In this case, Echo notifications will be the Provider for notification behavior, the interface of creating and sending notifications is simplified through Core, and allows core notifications (Enotif) to be unified in behavior and "engine" behind the management of the notifications.

Since core notification system works with the Provider, and Echo will be the de-facto production provider (keeping behavior as-is for wikimedia production and any installation that utilizes Echo notifications extension) there's a question that was raised regarding what to do if a notification is sent into the Provider that the Provider does not recognize.

Two main options were raised:

  1. Designing a prioritization (fallback) system for the Providers, where notifications go to the primary provider and "fall back" to secondary provider if they are unrecognized.
  2. Designing a default Core provider that is completely overtaken by any other Provider if it is available, which would mean all notifications must go through the primary provider without fallback.

There are pros and cons for either of those options, and conversations so far have leaned heavily towards #2 (no Fallback) for several reasons, mostly architectural simplicity and general expected behavior (having a single provider that provides the entire expected behavior of notification management into the system is a unified expected behavior.)

That said, there are potential implications of behavior, specifically how core notifications (that will utilize the same system) may work with either of those choices. This ticket will try to summarize the conversations around which option to choose, the best reasons behind it, and what would be potential implications for the behaviors that will need to be communicated to development teams and potentially users.

Event Timeline

There is also nit-pick when it comes to naming. A class that extension provides to handle/deliver notifications here is called "Provider". We had similar conversation in the gerrit PR: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1108511/comment/99b9bcb6_b81574d3/

@matmarex implemented Handlers as Echo would provide a mechanism to handle notification. Where Providers sounds more like a extension that would provide notifications, but it may not handle it.

Personally, I'm see those more as "Delivery" mechanism, that you have a notifications and you ask something to deliver those.

There is also nit-pick when it comes to naming. A class that extension provides to handle/deliver notifications here is called "Provider". We had similar conversation in the gerrit PR: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1108511/comment/99b9bcb6_b81574d3/

@matmarex implemented Handlers as Echo would provide a mechanism to handle notification. Where Providers sounds more like a extension that would provide notifications, but it may not handle it.

That's fair, I was thinking more about the Provider pattern, but I can change the naming throughout to Handlers (and maybe mentioning it follows the Provider pattern for context) if that helps the conversation either way.

Personally, I'm see those more as "Delivery" mechanism, that you have a notifications and you ask something to deliver those.

Echo is doing a lot more than "just" the delivery; it manages a queue, prioritization, bundling and batching ,etc, so I think of it more like the replacement of the notification management.

On top of that, the aspirational design allows us to (at some point in the future; this is out of scope for our current work) to add Handlers (or "Providers") for delivery mechanisms so that the delivery itself (email vs popup vs telegram/irc/etc) can be managed better and be more flexible for developers to utilize -- so I try to avoid mixing "Delivery" concept, mentally, to avoid that confusion.

There are pros and cons for either of those options, and conversations so far have leaned heavily towards #2 (no Fallback)

I got the opposite impression, and I implemented option #1 in https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1108511. I'd be okay with doing #2 too, but for me that increases the scope of the project, since now we have to make sure that every provider/handler supports every kind of notification, or the whole system will drop some notifications on the floor.

(I don't care if it's called "provider" or "handler", they are both meaningless words, so I chose the shorter one.)

There are pros and cons for either of those options, and conversations so far have leaned heavily towards #2 (no Fallback)

I got the opposite impression, and I implemented option #1 in https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1108511. I'd be okay with doing #2 too, but for me that increases the scope of the project, since now we have to make sure that every provider/handler supports every kind of notification, or the whole system will drop some notifications on the floor.

(I don't care if it's called "provider" or "handler", they are both meaningless words, so I chose the shorter one.)

There's an argument to be made about the complexity and what to do with unrecognized notifications.

In option #1, the providers themselves don't HAVE to know about all other providers -- there could be a registry that maps known events/notifications to provider by priority. We already have that information being submitted into Echo in the form of registering event types, so we can utilize that when providers are set up -- register or pass the recognizable notifications to a mapping that is then used to know where a notifications goes.

In option #2, the idea is that there is always only ONE provider that "takes over" the behavior. It's either core itself (with the lean provider that we set up for email only) or whatever-else (in our case Echo).

Both options need to figure out what to do with a completely unrecognizable notification. In case #1, we could just say that notificatiosn that are in no mapping will go all the way down to the core default provider and just be sent as-is with the email-only provider. But that's true for option #2, where if you send Echo an unrecognized notification, it sends it down to Core's provider.

The alternative is that if there's no provider for the notification, the notification is ignored. I'm not sure if that's a good option, but it is an option.

So, I think that there are two topics here, potentially -- what we do with new/undefined/unrecognizable notification types, and whether we want to create a series of prioritized fallbacks.

We could also do #2 (Echo takes over as a Provider without any others) but with a bit of a mix; the system can check first if Echo recognizes the event type, and if it doesn't, it sends it to the core provider as fallback (without having a whole system of prioritized providers, which seems a bit elaborate and probably unnecessary).

I want to outline my concerns around providing (or not providing) a mechanism for "fallback" to use MW Core's straightforward 'email only' provider even if Echo is the available provider. I think that having a Provider take over the entire behavior is a valid course of action, and works with our intended architectural concerns, but there is one aspect that is user-facing that we should be aware of, and make a decision on.

While having watchlist notifications be a feature that pass through Echo (so Echo can handle the behavior) is a long requested feature (See T203941: Allow watchlist notifications to be delivered as web notification (through Echo) and the related/child tasks) this change does mean a bit of a user-facing change from current behavior. It seems that there's agreement that it's a *better* behavior than current one, but it *is* a change.

If we have a single provider without the ability to have fall-back, we commit to not only unifying the notifications behavior within core with notifications outside of core in terms of the calls and ease of development and unifying behavior (current scope, and current work), but we are also committing to having enotif be handled by Echo, which opens the door to user-facing change.

This isn't necessarily a bad thing, but it's certainly a risk, especially given the fact that if and when we make that change, we may encounter other things to work on to adjust it based on user feedback. We need to assess whether that work is feasible (implementing *and* leaving ourselves enough time for launch and testing this feature with potential fixes) for the allotted time for the work (until the end of the fiscal year).

If we *do* have a fallback mechanism for the provider, the work becomes less risky. We can then move enotif to the new system but allow it to still be handled by Core's Provider -- which is straight-to-email behavior, no user-facing change from the current behavior. Then we could look at whether we (or the Growth team) want to more safely and flexibly move watchlist notifications to Echo (with the popup, etc). This would allow us to "bail" in case there are issues we didn't anticipate, *and/or* commit a user-facing change with more flexibility.

Given that the request to have watchlist notifications in Echo was agreed on by community (based on previous tickets) this might not be a reason to complicate the underlying architecture, but I would like to raise this potential issue, and we should talk to and discuss this with our product folks (potentially from Growth team as well, as they are technically the owners of Echo's behavior) and see what are the requirements on how user-facing features change.

Another small point here: We could implement watchlist notifications into Echo with the new system and have the "show in popup" be disabled initially (effectively having Echo manage the notifications but only send an email, mimicking current behavior). That would be less risky, since it mimics current behavior, but it might still have implications for things like notification count (limit 2000 in the db per user, etc) that we should just *verify* that we are not getting ourselves into a bind that will move us out of scope of time or work for this.

We've had a conversation about this in the tech meeting, and here's our thoughts and summary:

  1. We need to have multiple handlers at the same time. To make sure that we avoid introducing user-facing interference with the migration of enotif, *and* to enable the theoretical possible need for an extension to require a specific and distinct mechanism for its own notifiaciton, we will enable the system to have multiple Handlers. Echo will serve as one of those Handlers.
  2. There will be a registry of handled types in each handler, with wildcards (wildcard handlers are only invoked if there is no specific type match). We discussed multiple ways that we can manage multiple Handlers, but decided that a registry of event types (where each provider can specify the event types it controls, with an option for a catch-all wildcard, will solve multiple issues. It will allow us to enable the general architecture where Echo is the primary handler of notifications in the system (wildcard) but also allow us to be granular if we need it; in our current case, we can take advantage of this to temporarily assign enotif-notifications to an alternative email-only provider, until they are moved to be handled by Echo.
  3. Individual handlers do not have fallbacks, they must handle types they register.
  4. Echo must handle unknown notification types, since it is a wildcard handler. In effect, there is no individual fallback process for Handlers internally, but having a wildcard handler makes it the "fallback" for any notification that is not otherwise specified specifically for any other handler. This also helps us answer T385839: Allow Echo to handle generic notifications
  5. Core itself will not have a base Handler in general. All notifications will go through the given provided handler, which, in our case, is Echo. Echo is bundled with mediawiki's tarball and is already carrying most of the behavior, so 3rd party installations may not have notification behavior without Echo. That said, we will provide a temporary Legacy Handler that will handle watchlist notifications until those are migrated properly to Echo (enotif migration).

All of the above will allow us to deliver the structure and architecture we want to see while making sure we don't increase risks with user-facing behavior change, and we can enable the team that owns those notifications (watchlist/etc, or Echo) to quickly and more seamlessly migrate watchlist notifications into Echo on their own terms in the (hopefully near) future.

Thank you @Mooeypoo for the update.

I'm wondering - did you consider an option where we can let Echo Handler fallback to MediaWiki - instead of implementing fallback handling inside MediaWiki? What I mean, for simplicity - if we want to make Echo handle everything anyway, plus the MediaWiki Legacy Handler will most likely be a service anyway. Then we would have only one "active" handler (in this case Echo), but then code in Echo Handler would have hardcoded fallback logic like if I don't know this Event, let me call the Legacy Handler.

Something across the lines

class EchoHandler implements NotificationHandler {

        private NotificationHandler $legacyHandler; // injected via ServiceWiring

	public function notify( Notification $notification, RecipientSet $recipients ): void {
             if ( !$this->isKnownNotification( $notification ) {
                 $this->legacyHandler->notify( $notification, $recipients );
                 return;
             }
             [...]
         }

        /**
          * @TODO: ugly, copy&paste from Event::create() just to demonstrate the idea
          */
        private function isKnownNotification( Notification $notification ) {
                global $wgEchoNotifications;
		return isset( $wgEchoNotifications[$notification->getType()] );
        }
}

It may not be most the beautiful code, but if the LegacyHandler is a temporary measure and the final goal is to make Echo handle all notifications then I'm not sure if we want to pollute the MediaWiki with handling multiple handlers.

Do you have an example, where - after the migration of all notifications to Echo - we would still need multiple Handlers (when Echo is handling .*) ?

I'm also still concerned about the default behaviour for Core. On Wikis Echo extension is present, therefore Notifications will be handled. But if we have a clean MediaWiki installation, those notifications will land in a "black hole".

This approach means, that when you have a MediaWiki installation, with for example just a Thanks extension. Then if Thanks triggers the Notification, the notification will be ignored and execution will follow us if nothing happened.

The current implementation will log a warning when it cannot handle notification (no handler defined for that type) - see https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1120623. I wonder if this is enough, or is there anything else Core should do?

Yes, notifications will not work on a clean installation of core without a provider -- either Echo, or another provider. This is current behavior, and is acceptable. Echo is bundled with MediaWiki, so 3rd parties will retain behavior, and anyone that installs "core" alone will need to add Echo for functionality, just like they need to add a skin (and just like current behavior).

All extensions check for Echo's existence today before sending notifications (or are dependent on it). We could later theoretically remove that check, actually, since it won't be a breaking one when the functionality goes through Core -- it would just result in no notification, which is expected (and is current behavior).

Thank you @Mooeypoo for the update.

I'm wondering - did you consider an option where we can let Echo Handler fallback to MediaWiki - instead of implementing fallback handling inside MediaWiki? What I mean, for simplicity - if we want to make Echo handle everything anyway, plus the MediaWiki Legacy Handler will most likely be a service anyway. Then we would have only one "active" handler (in this case Echo), but then code in Echo Handler would have hardcoded fallback logic like if I don't know this Event, let me call the Legacy Handler.

If Echo is defined as the "catch all" provider (see #4 in the list above) it will need to handle them as some sort of "base" notification, which is a sort of 'fallback' behavior. We just won't have a fallback as a form of the system. Echo, as the "catch all" provider, will be the thing that all notifications that are not registered specifically to another handler, will be handled, meaning we will need to have some sort of base notification there. But also we need to be careful not to go too far here; current behavior is that unrecognized notification is unhandled completely. The above solution will mean extensions and developers can send "basic" notifications that will be caught by Echo (a sort of fallback) but we need to not go overboard with creating all manners of fallback.

Let's start with this decision that retains current behavior while maintaining product spec and some basic future aspiration, make it work properly, and then we can always iterate :)

This ticket is resolved as part of the yearly work, and the details were worked on within implementation tickets around the Handlers (alternative word to "Provider") and the migration of Enotif to the new system. We now have a system in core that allows for a sinlge pathway through the Middleware to different Handlers, where Echo is a "catchall" handler; we also made sure Echo can handle an unknown notification.

See more information in the Notifications manual on MW and the project updates.