Page MenuHomePhabricator

Back-end infrastructure for timed notifications in Echo
Open, Needs TriagePublicFeature

Description

This is a proposal for a feature I'd like to add to the Echo extension. I thought I had better pass it by the Collaboration team before working on it, to see if it fits in with their vision for the extension's architecture.

What?

Add infrastructure to the Echo extension to enable proactive scheduled notifications.

Why?

At the moment Echo is a purely reactive extension; notifications are created from various hook functions in response to actions being taken by the user. This stops us from being able to offer nice features which require notifications to be issued at a set time:

How?

The proposed implementation is to add a column event_scheduled_timestamp to the echo_event table. This column would be NULL except for events scheduled to take place in the future, in which case it would contain the timestamp of when to show the notification.

A maintenance script would be run via cron, as frequently as possible (ideally every few minutes, but hourly would be workable too), to issue notifications for events that have passed the scheduled time. It would delete the row from the echo_event table and then fire it as a non-delayed event.

There needs to be a standard way to cancel these notifications by ID (e.g. if the user groups are assigned permanently/revoked entirely and thus won't expire, or the user cancels the article reminder)

There should be a EchoEvent::createScheduled that returns either a $scheduled_event_id (event_id from echo_event) or an EchoEvent (can an EchoEvent exist if it hasn't been fired yet?) in which case getId would similarly be used.

To cancel, you could call EventMapper::cancelScheduledById, which would delete it from echo_event.

FAQ

What would use this infrastructure?

T2582 could potentially be solved by storing reminders as scheduled Echo events, or inserting scheduled Echo events in parallel with metadata in another table.

To solve T153817, the cronjob-executed maintenance script could be extended to also look at core's user_groups table and identify user groups whose expiry date is impending, or are going to expire in two days or two weeks.

Why couldn't the cronjob directly query user_groups and issue notifications immediately? Then we wouldn't need the extra column on echo_events.

We want to be able to issue a notification like "Your user rights have expired." Since the row is liable to disappear from user_groups any time after the group membership expires, we have to detect user_groups that are about to expire, place them in a holding location (echo_events with echo_scheduled_timestamp non-NULL), then issue the notification after the expiry time has passed.

Related Objects

Event Timeline

Before you filed this, we happened to list T2582: Remind me of this article in X days as a Rails Girls Summer of Code task. So if we accept someone for that (applications close Mar 1, acceptances May 1), this (or something similar) will be part of that project.

If we don't accept anyone, I'll unclaim this.

Technical note below:

To solve T153817, the cronjob-executed maintenance script could be extended to also look at core's user_groups table and identify user groups whose expiry date is impending, or are going to expire in two days or two weeks.

These seem like two different scripts that should not be combined into one:

  1. One to convert "scheduled Echo notification" to "Echo notification"
  2. One to create "scheduled Echo notifications" based on groups that are about to expire. This is a specific notification type, that doesn't even necessarily have to be in the Echo extension (though maybe we would put it there).

How does this fit in with T88781: Create a Timer based reminder for workflows?

No idea. I can't tell what the difference is between that task and T2582.

T88781: Create a Timer based reminder for workflows has a spec too unclear to work on. It can be re-evaluated (either better-specified or closed) after T2582: Remind me of this article in X days is done.

Mattflaschen-WMF renamed this task from Proposal: Timed notifications in Echo to Back-end infrastructure for timed notifications in Echo.Feb 2 2017, 12:12 AM

I'm guessing no-one claimed this for Rails Girls?

I'm guessing no-one claimed this for Rails Girls?

The parent task is being considered for Google Summer of Code and Outreachy.

@aaron What kind of performance guarantees does getReleaseTimestamp have?

  1. Are there any concerns if we put a job in to be run a long time from now (e.g. a month or two, maybe more)? (The exact available delays aren't ironed out, so if e.g. a couple weeks is fine, we can see if that's enough product-wise.)
  2. Is there any possibility of such long-delayed jobs being lost?
  3. Is the job guaranteed to run at/after the release timestamp (not a little before)?
  4. How long might it be delayed after the release timestamp?
  5. Is it a problem if we create a lot of these delayed jobs?

It doesn't look like there's a clean way to delete existing jobs. We might be able to edit them with deduplicateRootJob, but I don't see a way to delete them or list them (which has already been mentioned). So we might want to go back to a DB-based solution.

Added a note to description, about cancelling a scheduled event.

Adding @MaxSem to this so we're all on the same page.

@aaron What kind of performance guarantees does getReleaseTimestamp have?

  1. Are there any concerns if we put a job in to be run a long time from now (e.g. a month or two, maybe more)? (The exact available delays aren't ironed out, so if e.g. a couple weeks is fine, we can see if that's enough product-wise.)
  2. Is there any possibility of such long-delayed jobs being lost?
  3. Is the job guaranteed to run at/after the release timestamp (not a little before)?
  4. How long might it be delayed after the release timestamp?
  5. Is it a problem if we create a lot of these delayed jobs?

1/2/5. There is no time limit, though at some point space could be a concern (e.g. dozens of GB). In terms of data loss, replication is flakey at the moment due to redis bugs (T163337). No one has cleared redis lately, but that happened in the past before (unable to load rdb file).

  1. It won't run before the job release timestamp.
  2. How long after depends on the backlog in that queue and on DB replag (e.g. https://grafana.wikimedia.org/dashboard/db/job-queue-rate?panelId=8&fullscreen&orgId=1).

I probably don't know Echo internals enough, but some things in this plan sound iffy to me: if you want to destroy events by ID, you would need some infra to store these IDs. Thus, we get both changes to the main echo schema as well as each application of timed events needing in practice its own table. This does not simplify anything compared to my partial approach in https://gerrit.wikimedia.org/r/#/c/350500/ , and might as well be the opposite.

if you want to destroy events by ID, you would need some infra to store these IDs.

Couldn't we just do SELECT event_id, event_extra FROM echo_event WHERE event_type = 'user-rights-expiry' AND event_agent_id = ..., do client-side filtering on event_extra, and then call EventMapper::cancelByID with the event_id we selected? No extra schema changes required.

I probably don't know Echo internals enough, but some things in this plan sound iffy to me: if you want to destroy events by ID, you would need some infra to store these IDs. Thus, we get both changes to the main echo schema as well as each application of timed events needing in practice its own table. This does not simplify anything compared to my partial approach in https://gerrit.wikimedia.org/r/#/c/350500/ , and might as well be the opposite.

Not every user of timed events needs its own table. For example, article reminder notifications do not. They just need the ability to look up event IDs by event_page_id and event_type, both of which there are indices on.

You can already lookup by title with API:Notifications, and we'll add the ability to filter by event_type.

@Mattflaschen-WMF is this task still locked for Google Summer of Code or Outreachy? Asking cause its preventing others from working on T189391

It's not locked; The code is available, we need to review, fix it, and merge. Is anyone actively working on T189391: New block option: Notify me when this block is about to expire or has expired ?

@Mooeypoo Task T189391 claims that its being blocked by this one and I do have interest in working on the task.

@Eee888: I am resetting the assignee of this task because there has not been progress lately (please correct me if I am wrong!). Resetting the assignee avoids the impression that somebody is already working on this task. It also allows others to potentially work towards fixing this task. Please claim this task again when you plan to work on it (via Add Action...Assign / Claim in the dropdown menu) - it would be welcome! Thanks for your understanding!

Maybe we should break it down into even smaller bits.

Setting the goal: to have some functionality that allows the user to plan echo notifications with custom content. Precision: I'd say being accurate within 1-2 hours would generally be sufficient, but the higher the precision the better.

  • Storage: a table of some sort is required.
    • How big?
    • Format?
    • I do believe it should be private, and it should be stored server-side. Could it be stored along with the user preferences? (I have no idea how user preferences are stored other than that they're not stored in your browser)
    • I believe it would be best to have functionality that allows creating a private table without specifying what it'll be used for. This would allow it to be used for reminders, personal notes, configuration settings for gadgets, etc.
    • Perhaps allow a table with (mostly) numbered columns? Some fixed fields could exist like "app" (name of the gadget, extension or whatever that stores data) and "timestamp". Some numbered fields could allow the gadget/etc to store whatever they need to store. It would be up to the gadget to know what number corresponds to what kind of data.
  • How could the table be accessed? I'm thinking something like:
    • Write: /w/personaldata.php?session=0DDF00D&insert=1&app=reminderapp&timestamp=1589958503&column1=Check the sources of this article.&column2=List of tadpoles in Germany
    • Read (VARNOW would be changed to a timestamp server-side): /w/personaldata.php?session=0DDF00D&select=timestamp,column1,column2&where1=app&is1=reminderapp&where2=timestamp&is2=lt_VARNOW
    • Delete: /w/personaldata.php?session=0DDF00D&delete=1&where1=app&is1=reminderapp&where2=timestamp&is2=1589958503&column1=Check the sources of this article.&column2=List of tadpoles in Germany
  • User interface?
    • HTML/JS boo boring stuff might as well leave it to the community
  • Scheduler?
    • Instead of a cron job, consider having the user make a request on every page load. Cache it in the browser cache for 10 minutes or however long you wish. User doesn't actually log in? No notifications for you.
  • How to deliver?
    • Allow the user to deliver an instant echo notification to themselves. May not even strictly be needed for this case.. could be useful anyways. For example: /w/api.php?action=echo&message=Article: List of tadpoles in Germany. Check the sources of this article.

How it comes together:

  1. User loads a page while logged in.
  2. Page loads /w/timednotifications.php (cached in browser cache for 10 minutes)
  3. Whenever retrieved from the server, timednotifications.php requests /w/personaldata.php?session=0DDF00D&select=timestamp,column1,column2&where1=app&is1=reminderapp&where2=timestamp&is2=lt_VARNOW
  4. If anything is returned, generate echo notifications and delete the entries or use column3 to mark them as delivered.

This is just a (very) rough idea. Start at the beginning: allow the user to store and retrieve private data. The user already has private data, for example their watchlist and preferences, perhaps this information could be stored the same way.

Aklapper changed the subtype of this task from "Task" to "Feature Request".Apr 24 2023, 2:53 PM

A cronjob polling a mariadb table, as described in the task description and partially implemented for Article-Reminder-Notifications, IMHO would not be a sustainable solution.

The JobQueue has had support for scheduled jobs since 2013 - although it's only for the Redis backend. Since 2020, Echo has also had the support for scheduled notifications iff used with a compatible jobqueue backend. Since WMF prod uses Kafka - which doesn't really have a notion of scheduled messages, we should consider using a second job queue backend specifically for scheduled jobs. (Thankfully, MW supports having multiple jobqueue backends at the same time for different job categories).

Other than Redis itself, it would be worth considering RabbitMQ or ActiveMQ as both support scheduled message delivery. The RFC task T218812 appears to have been abandoned once a different solution was pursued for its original usecase.

To support cancellation of scheduled jobs, I think we would need another table which persists a cancel_key (a string that can be generated when cancellation is sought, eg. groupexpiry_<userid>_<groupname>) for every cancellable scheduled job, along with a cancel_id (some identifier returned by the queue that can be used to cancel).