Page MenuHomePhabricator

[RfC] Add CURRENT_TIMESTAMP support for `wl_notificationtimestamps` in watchlist table
Closed, DeclinedPublic

Description

Looking at core/maintenance/archives/patch-watchlist-null.sql, the watchlist table wl_notificationtimestamp attribute was introduced support for NULL by default and so doing, adding an article to watchlist table doesn't add it's current timestamp (tested locally) but adds NULL instead.

Sometimes, users watchlist entries can grow as large as >700 pages/articles (use-case) and knowing the latest article added to watchlist is a bit difficult (taking into consideration the user can't remember the name of this particular article) so I was experimenting on building an extension (say WatchListFilter extension maybe?) to filter watchlist based on various parameters (like desc order of ts, asc order of timestamp, date added, etc) but by default, based on their timestamps added.

So, filtering by timestamp (default) will make the latest added article appear first (DESC order by timestamp) on the Special:EditWatchlist page which is useful but this means the watchlist table needs to be altered a little bit I guess?

I believe MW has some way of dealing with TS, maybe this: https://www.mediawiki.org/wiki/Manual:Timestamp and if so, should the table be altered to avoid NULL by default or should it be as-is and just some logic added to include saving these TS when article is added to watchlist by user instead of NULL?

Would like to have input on this please. Thanks!

Other topics as per how the filter will look on the Special:EditWatchlist special page based on the OOUI-fication that has happened and others can be addressed on other tickets, maybe sub-tasks of this one I guess :) Also, not sure of what tags to add to this ticket so added a few.

Event Timeline

xSavitar renamed this task from [RfC] Add CURRENT_TIMESTAMP support in `wl_notificationtimestamps` in watchlist table to [RfC] Add CURRENT_TIMESTAMP support for `wl_notificationtimestamps` in watchlist table.Nov 1 2018, 11:27 AM
xSavitar updated the task description. (Show Details)
xSavitar updated the task description. (Show Details)
aaron removed aaron as the assignee of this task.Nov 8 2018, 7:19 AM
aaron subscribed.
daniel subscribed.

Moving this to the backlog of the RFC board, indicating that this ticket needs to be fleshed out more to make it a viable RFC.

If I understand correctly, the idea is to initialize wl_notificationtimestamp to the current time, instead of null, when adding a watchlist entry. Looking at WatchedItemStore, this should be trivial to do. An RFC may be useful to ensure that this change does not have unintended side effects. Other than that, I see no need for an RFC here, since there are not cross-cutting, strategic, or hard-to-undo concerns.

The motivation (listing watchlist entries by the time the page was last visited) could be read as a feature request for core. A technical RFC would not be the right process to decide on that. On the other hand, supporting behavior implemented in an extension may not be sufficient grounds for changing behavior in core, if the change is not trivial).

Please update the description to clearly state the proposed technical change, the intended effect, and the motivation. Please also state any alternative solutions and possible side-effects you may have investigated. When you think the task is ready, move it into the "inbox" for review, or directly to "request IRC meeting". If the proposed change turns out to be trivial (or unnecessary or undesirable), an RFC discussion may not be needed, and the proposal may go on Last Call immediately (to be approved or rejected, respectively).

wl_notificationtimestamp is not meant to store the time the article was watched but the last revision the user saw on the page (NULL if they saw the latest revision). This would require a new column. Ideally, if watchlist sizes were limited, this woudn't need an index, but they are not.

@aaron if I understand correctly, the intention is not to permanently record when the page was watched. The intention is instead to initialize the "last viewed" timestamp to the time the page was watched. This makes sense semantically: Assuming a page that is added to the watchlist was never viewed makes little sense. Asssuming that the user was looking at that page when they added it to the watchlist makes a lot more sense. With respecct to the functionality controlled by wl_notificationtimestamp this should make no difference: either way, no notifications will be shown for the time prior to the point at which the page was added to the watchlist.

If I understand you correctly, wl_notificationtimestamp uses the revision's timestamp rather than system time. So the proposal would be changed to initialize the field to the current revision's timestamp, not current wall clock time.

In essence, I support the idea that wl_notificationtimestamp should never be null, and should be initialized to the current revision's timestamp, simply because it makes more sense semantically, and it removes a special case.

I think I understand from the perspective of @aaron about that table not intended for this purpose (that we'll need to create another attribute to handle this idea). Having another look at the attribute name wl_notificationtimestamp, it sounds a little different from what should be achieved with this proposal. So this proposal in summary is to have a table attribute to keep track of timestamps of when an article is added (by a user) to his/her watchlist (maybe wl_timestamp?), so that, watchlist entries can later in the future be filtered to easily find them because (for example, I have over 500 articles on my mw.org watchlist) and trying to find which one I added for example yesterday is not easy if I don't know the article name but filtering my watchlist entries based on timestamp can make me easily fine what I'm looking for.

But yes I support @daniel about the wl_notificationtimestamp not being NULL. I mean why should it be the case saving NULLs all around? "Extract from @aaron: but the last revision the user saw on the page (NULL if they saw the latest revision).", I think rather than saving a NULL, we should do what @daniel proposed. But if there is a way to still get such info on making the filter without adding an extra attribute in the watchlist table, I'm good to use it :). Introducing this feature in core sounds more than a good idea to me, thanks @daniel.

NULLs can be good or bad depending on their use-cases. Implementation wise (programming), this will introduce a three-value logic (true, false, null) making implementation sometimes harder etc but based on DBMS, NULLs can be good as they will take no space if the field size is not fixed (reducing database size) but otherwise, it will use up the size of the db field if fixed. So depending on how NULLs are being used in wl_notificationtimestamp, I will follow Daniel here :)

Also, as @daniel rightly said, I'll still flesh up this proposal a little bit to add more context based on the template proposed so everything is a little clearer.

For wl_notificationtimestamp, we set it to NULL when the user views the latest revision of the page, and when the page is edited we set all non-NULL entries to the time of the edit. This allows us to trivially find all watchlist entries where the page has/hasn't been edited since the last view: "has been edited" has wl_notificationtimestamp IS NOT NULL, while "hasn't been edited" has wl_notificationtimestamp IS NULL.

If it were never NULL, we'd have to join to page and revision to compare the timestamp of the most recent edit everywhere where that determination is needed.

IMO as written this task should be declined. If @D3r1ck01 wants to rewrite it completely to propose adding a wl_addedtimestamp column, that could be done but would be a completely different request than what has been discussed here so far.

In essence, I support the idea that wl_notificationtimestamp should never be null, and should be initialized to the current revision's timestamp, simply because it makes more sense semantically, and it removes a special case.

Setting it to the current revision's timestamp would mean that the current revision hasn't been viewed, so the page would be highlighted in the watchlist and the latest revision would be highlighted on the history page.

Thanks @Anomie, this is all sounding clearer. I think in that case, this can be declined as you've suggested but before that, I think @daniel should see your comment (maybe has something to say?) before declining the task (as it's clearly 2 topics being discussed here now so far)? I'll file another RfC about adding a new field (wl_addedtimestamp) in the watchlist table for tracking when entries are being added to watchlist.

For wl_notificationtimestamp, we set it to NULL when the user views the latest revision of the page, and when the page is edited we set all non-NULL entries to the time of the edit. This allows us to trivially find all watchlist entries where the page has/hasn't been edited since the last view: "has been edited" has wl_notificationtimestamp IS NOT NULL, while "hasn't been edited" has wl_notificationtimestamp IS NULL.

Oh, I assumed that the timestamp was the one of the last revision the user viewed! I see now that this is entirely incompatible.

Based on that information, I agree that this proposal should be declined.

Having information about when a watchlist item was added in the database however is still a reasonable with. It was indeed proposed before, as part of T100508: Watchlist expiry: Watch pages for a specified time frame (2013), which is currently stalled. The use case presented here should perhaps be mentioned on that task.

Retracted per @D3r1ck01's comment.

Another reason why NULL is used here: the timestamp is only updated the first time the page is edited. For subsequent edits, it is not updated. This way we know which edits the user has and hasn't seen, and we can offer them things like a link to the diff of all the changes they haven't seen, and markers on the history page for all new edits.

This could also be accomplished by setting the timestamp to the timestamp of the edit they last viewed, but then as @Anomie said it wouldn't be as easy to find pages that have unseen changes.

Thanks a lot everyone for your comments. Honestly, I've really learnt a lot about this table attribute and as @daniel suggested, I'll write up a proper RfC for requesting a new attribute (wl_addedtimestamp suggested by @Anomie) in the watchlist table and link the use cases in the task mentioned (T100508) in upcoming days. I'm grateful :)