Page MenuHomePhabricator

Back-end infrastructure for timed notifications in Echo
Open, Needs TriagePublic

Description

This is a proposal for a feature I'd like to add to the Echo extension. I thought I had better pass it by the Collaboration team before working on it, to see if it fits in with their vision for the extension's architecture.

What?

Add infrastructure to the Echo extension to enable proactive scheduled notifications.

Why?

At the moment Echo is a purely reactive extension; notifications are created from various hook functions in response to actions being taken by the user. This stops us from being able to offer nice features which require notifications to be issued at a set time:

How?

The proposed implementation is to add a column event_scheduled_timestamp to the echo_event table. This column would be NULL except for events scheduled to take place in the future, in which case it would contain the timestamp of when to show the notification.

A maintenance script would be run via cron, as frequently as possible (ideally every few minutes, but hourly would be workable too), to issue notifications for events that have passed the scheduled time. It would delete the row from the echo_event table and then fire it as a non-delayed event.

There needs to be a standard way to cancel these notifications by ID (e.g. if the user groups are assigned permanently/revoked entirely and thus won't expire, or the user cancels the article reminder)

There should be a EchoEvent::createScheduled that returns either a $scheduled_event_id (event_id from echo_event) or an EchoEvent (can an EchoEvent exist if it hasn't been fired yet?) in which case getId would similarly be used.

To cancel, you could call EventMapper::cancelScheduledById, which would delete it from echo_event.

FAQ

What would use this infrastructure?

T2582 could potentially be solved by storing reminders as scheduled Echo events, or inserting scheduled Echo events in parallel with metadata in another table.

To solve T153817, the cronjob-executed maintenance script could be extended to also look at core's user_groups table and identify user groups whose expiry date is impending, or are going to expire in two days or two weeks.

Why couldn't the cronjob directly query user_groups and issue notifications immediately? Then we wouldn't need the extra column on echo_events.

We want to be able to issue a notification like "Your user rights have expired." Since the row is liable to disappear from user_groups any time after the group membership expires, we have to detect user_groups that are about to expire, place them in a holding location (echo_events with echo_scheduled_timestamp non-NULL), then issue the notification after the expiry time has passed.

Related Objects

Event Timeline

TTO created this task.Jan 31 2017, 3:23 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJan 31 2017, 3:23 PM

Before you filed this, we happened to list T2582: Remind me of this article in X days as a Rails Girls Summer of Code task. So if we accept someone for that (applications close Mar 1, acceptances May 1), this (or something similar) will be part of that project.

If we don't accept anyone, I'll unclaim this.

Technical note below:

To solve T153817, the cronjob-executed maintenance script could be extended to also look at core's user_groups table and identify user groups whose expiry date is impending, or are going to expire in two days or two weeks.

These seem like two different scripts that should not be combined into one:

  1. One to convert "scheduled Echo notification" to "Echo notification"
  2. One to create "scheduled Echo notifications" based on groups that are about to expire. This is a specific notification type, that doesn't even necessarily have to be in the Echo extension (though maybe we would put it there).

How does this fit in with T88781: Create a Timer based reminder for workflows?

No idea. I can't tell what the difference is between that task and T2582.

T88781: Create a Timer based reminder for workflows has a spec too unclear to work on. It can be re-evaluated (either better-specified or closed) after T2582: Remind me of this article in X days is done.

Mattflaschen-WMF renamed this task from Proposal: Timed notifications in Echo to Back-end infrastructure for timed notifications in Echo.Feb 2 2017, 12:12 AM
TTO added a comment.Apr 14 2017, 4:28 AM

I'm guessing no-one claimed this for Rails Girls?

I'm guessing no-one claimed this for Rails Girls?

The parent task is being considered for Google Summer of Code and Outreachy.

Quiddity removed a subscriber: Quiddity.

This is needed for T2582: Remind me of this article in X days, an active Outreachy project.

@aaron What kind of performance guarantees does getReleaseTimestamp have?

  1. Are there any concerns if we put a job in to be run a long time from now (e.g. a month or two, maybe more)? (The exact available delays aren't ironed out, so if e.g. a couple weeks is fine, we can see if that's enough product-wise.)
  2. Is there any possibility of such long-delayed jobs being lost?
  3. Is the job guaranteed to run at/after the release timestamp (not a little before)?
  4. How long might it be delayed after the release timestamp?
  5. Is it a problem if we create a lot of these delayed jobs?

It doesn't look like there's a clean way to delete existing jobs. We might be able to edit them with deduplicateRootJob, but I don't see a way to delete them or list them (which has already been mentioned). So we might want to go back to a DB-based solution.

Added a note to description, about cancelling a scheduled event.

TTO added a subscriber: MaxSem.Jun 21 2017, 1:21 AM

Adding @MaxSem to this so we're all on the same page.

aaron added a comment.Jun 21 2017, 4:58 PM

@aaron What kind of performance guarantees does getReleaseTimestamp have?

  1. Are there any concerns if we put a job in to be run a long time from now (e.g. a month or two, maybe more)? (The exact available delays aren't ironed out, so if e.g. a couple weeks is fine, we can see if that's enough product-wise.)
  2. Is there any possibility of such long-delayed jobs being lost?
  3. Is the job guaranteed to run at/after the release timestamp (not a little before)?
  4. How long might it be delayed after the release timestamp?
  5. Is it a problem if we create a lot of these delayed jobs?

1/2/5. There is no time limit, though at some point space could be a concern (e.g. dozens of GB). In terms of data loss, replication is flakey at the moment due to redis bugs (T163337). No one has cleared redis lately, but that happened in the past before (unable to load rdb file).

  1. It won't run before the job release timestamp.
  2. How long after depends on the backlog in that queue and on DB replag (e.g. https://grafana.wikimedia.org/dashboard/db/job-queue-rate?panelId=8&fullscreen&orgId=1).
Mattflaschen-WMF updated the task description. (Show Details)
Eee888 added subscribers: jcrespo, Marostegui.
MaxSem added a comment.Aug 8 2017, 1:25 AM

I probably don't know Echo internals enough, but some things in this plan sound iffy to me: if you want to destroy events by ID, you would need some infra to store these IDs. Thus, we get both changes to the main echo schema as well as each application of timed events needing in practice its own table. This does not simplify anything compared to my partial approach in https://gerrit.wikimedia.org/r/#/c/350500/ , and might as well be the opposite.

TTO added a comment.Aug 8 2017, 5:03 AM

if you want to destroy events by ID, you would need some infra to store these IDs.

Couldn't we just do SELECT event_id, event_extra FROM echo_event WHERE event_type = 'user-rights-expiry' AND event_agent_id = ..., do client-side filtering on event_extra, and then call EventMapper::cancelByID with the event_id we selected? No extra schema changes required.

I probably don't know Echo internals enough, but some things in this plan sound iffy to me: if you want to destroy events by ID, you would need some infra to store these IDs. Thus, we get both changes to the main echo schema as well as each application of timed events needing in practice its own table. This does not simplify anything compared to my partial approach in https://gerrit.wikimedia.org/r/#/c/350500/ , and might as well be the opposite.

Not every user of timed events needs its own table. For example, article reminder notifications do not. They just need the ability to look up event IDs by event_page_id and event_type, both of which there are indices on.

You can already lookup by title with API:Notifications, and we'll add the ability to filter by event_type.

@Mattflaschen-WMF is this task still locked for Google Summer of Code or Outreachy? Asking cause its preventing others from working on T189391

It's not locked; The code is available, we need to review, fix it, and merge. Is anyone actively working on T189391: New block option: Notify me when this block is about to expire or has expired ?

@Mooeypoo Task T189391 claims that its being blocked by this one and I do have interest in working on the task.

Aklapper removed Eee888 as the assignee of this task.Sep 26 2018, 3:13 PM

@Eee888: I am resetting the assignee of this task because there has not been progress lately (please correct me if I am wrong!). Resetting the assignee avoids the impression that somebody is already working on this task. It also allows others to potentially work towards fixing this task. Please claim this task again when you plan to work on it (via Add Action...Assign / Claim in the dropdown menu) - it would be welcome! Thanks for your understanding!

Restricted Application added a project: Growth-Team. · View Herald TranscriptSep 26 2018, 3:13 PM
SBisson moved this task from Inbox to Triaged but Future on the Growth-Team board.Sep 28 2018, 2:26 AM