The CommTech team received two wishes for 2019 that may require jobs in a job queue to be executed at a specific time. This task exists as a place to start the conversation about how we might implement this feature and who should implement it. We'll use it to identify specific feature needs and work toward consensus on implementation details.
The two wishes that would benefit from this capability are [[ https://meta.wikimedia.org/wiki/Community_Wishlist_Survey_2019/Watchlists/Watchlist_item_expiration | Watchlist Expiry ]] and [[ https://meta.wikimedia.org/wiki/Community_Wishlist_Survey_2019/Notifications/Article_reminders | Article Reminders ]].
Specifically, here's how it might work for each:
* If a user adds an expiring item to their watchlist, the item is removed from their watchlist when the expiration date/time is reached.
* If a user wants to be reminded to edit/read/etc. a specific page, they can set a reminder that alerts them (via Echo and/or email) to go to that page for whatever reason.
A naive approach would be to allow jobs that fire at a specific date/time. That could be achieved by letting the developer provide an "execution timestamp" when the job is added to the queue. The queue would then have code to check these timestamps and determine what to do. Alternatively, the developer could provide an offset (in seconds?) from the job creation timestamp. Again, the job queue would have the code to determine whether the offset has been reached and the job should be fired.
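To make the two options above concrete for discussion, here is a minimal in-memory sketch in Python. It is not MediaWiki's job queue API; the class `DelayedJobQueue`, its methods, and the job type names are all hypothetical. It only shows that an absolute "execution timestamp" and a relative offset can both reduce to a single stored `run_at` value that the queue checks.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class _Entry:
    # Entries are ordered by (run_at, seq); the payload is not compared.
    run_at: float
    seq: int
    job_type: str = field(compare=False)
    params: dict = field(compare=False)


class DelayedJobQueue:
    """Hypothetical queue: jobs become runnable once `run_at` is reached."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def push(self, job_type, params, *, now, run_at=None, delay=None):
        """Add a job with an absolute timestamp OR an offset in seconds."""
        if run_at is not None and delay is not None:
            raise ValueError("give either run_at or delay, not both")
        if delay is not None:
            if delay < 0:
                raise ValueError("delay must be non-negative")
            run_at = now + delay  # the offset becomes an absolute timestamp
        if run_at is None:
            run_at = now  # no delay requested: runnable immediately
        entry = _Entry(run_at, next(self._counter), job_type, params)
        heapq.heappush(self._heap, entry)
        return run_at

    def pop_due(self, now):
        """Remove and return every job whose run_at is <= now."""
        due = []
        while self._heap and self._heap[0].run_at <= now:
            entry = heapq.heappop(self._heap)
            due.append((entry.job_type, entry.params))
        return due
```

A runner would call `pop_due()` periodically with the current time. Passing `now` explicitly is a choice made for this sketch so the behaviour is easy to test; a real implementation would presumably read the clock itself and persist jobs rather than hold them in memory.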
Based on my reading of the [[ https://www.mediawiki.org/wiki/Manual:Job_queue | docs for the existing job queue ]], this use case might be slightly different. The current job queue seems to have grown out of the need to defer processing triggered by something happening in the page or the web request. A generic delayed job queue could potentially be used completely outside the context of page requests. I don't think it **must** be different but it **could** be different.
There are issues with a system like this. One issue is lots of delayed jobs clogging up the queue and holding up things that need to happen more immediately. That could be solved by having two queues with different listeners. I'm thinking of something more like a job broadcast or pubsub kind of model. It might be as straightforward as having different job runners looking at the DB for jobs with different flags. Another issue is dealing with failed jobs. For many short-lived jobs, it's clear what to do should the job fail. For something that has been in a queue for a year, the user's expectations become less clear.
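As a rough illustration of the "different job runners looking at the DB for jobs with different flags" idea, here is a small Python sketch using an in-memory SQLite table. The table layout, the `delayed` flag, and the function names are assumptions made up for this example; none of it reflects the actual MediaWiki job table or runner code.

```python
import sqlite3


def make_store():
    """Create a hypothetical job table with a flag separating the two kinds."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE job (
               id INTEGER PRIMARY KEY,
               type TEXT NOT NULL,
               delayed INTEGER NOT NULL,  -- 0 = immediate, 1 = delayed
               run_at INTEGER NOT NULL    -- Unix timestamp
           )"""
    )
    return db


def enqueue(db, job_type, now, run_at=None):
    """Jobs given an explicit run_at are flagged as delayed."""
    delayed = run_at is not None
    db.execute(
        "INSERT INTO job (type, delayed, run_at) VALUES (?, ?, ?)",
        (job_type, int(delayed), run_at if delayed else now),
    )


def claim(db, delayed, now, limit=10):
    """Each runner only looks at rows carrying its own flag value."""
    rows = db.execute(
        "SELECT id, type FROM job WHERE delayed = ? AND run_at <= ? "
        "ORDER BY run_at, id LIMIT ?",
        (int(delayed), now, limit),
    ).fetchall()
    # Not safe for concurrent runners; a real version needs locking/claiming.
    db.executemany("DELETE FROM job WHERE id = ?", [(row[0],) for row in rows])
    return [row[1] for row in rows]
```

With this split, a backlog of year-long reminders never sits in front of the immediate runner, because that runner only queries `delayed = 0`. The sketch leaves the failed-job question open: it deletes rows on claim and has no retry or expiry policy.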
I've previously used [[ http://docs.celeryproject.org/en/master/index.html | Celery ]] to accomplish this kind of delayed job queue. I'm not recommending that specific technology but I think the [[ http://docs.celeryproject.org/en/master/userguide/calling.html#eta-and-countdown | approach documented here ]] might be helpful for discussion.
I think we have the opportunity to do one of two things here:
* Find a work-around. That is, design a minimal way to do this without an actual queue-based implementation. The Watchlist Expiry is something we could certainly work around but Article Reminders might be more challenging.
* Add a new capability to our job queue. Add this delayed job capability for all users of the job queue.