[RFC] Expiring watch list entries
Open, NormalPublic

Description

This is a request for comments on implementing expiring watchlist entries in MediaWiki.

Background

This feature request appears on both the German Technical Wishlist from 2014 and the WMF Community Tech team wishlist for 2015, at position #12. A bug for this feature, T8964, has existed since 2006.

WMF Community Wishlist proposal and votes: https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey/Watchlists#Watchlist_timed_expiry

Proposal

It should be possible to watch a page and have it be removed from your watchlist after a custom timeframe.
Users should initially be able to change the expiry time of a watchlist entry using the API.
Users should also be able to set a number of days for how long a page should be watched or make watchlist entries never expire on the Special:EditWatchlist page.

The initial implementation would not include any changes to watching a page directly from an article page (i.e. clicking the star button on the toolbar). All entries added to the watchlist by clicking the star button would still initially have an unlimited expiry date.

The feature would be first offered as a beta feature.

The initial proposal is to add a single new field to the watchlist table containing an expiry timestamp.
All selects on the watchlist table would be updated to only select watchlist entries that have not yet expired.
Users would be able to set an expiry when watching a page using action=watch by specifying a parsable expiry (similar to protection expiry through the API).
The initial implementation would also have a maintenance script to remove expired entries.
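
The proposal above can be sketched roughly as follows. This is only an illustration in Python against an in-memory SQLite database, not MediaWiki code: the column name wl_expiry, the use of NULL for "never expires" and both helper functions are assumptions made for the example.

```python
import sqlite3

# Hypothetical schema: wl_expiry is the proposed new column (name assumed).
# NULL means the entry never expires. Timestamps use MediaWiki's 14-digit
# YYYYMMDDHHMMSS string form, which sorts chronologically as plain text.
SCHEMA = """
CREATE TABLE watchlist (
    wl_user      INTEGER NOT NULL,
    wl_namespace INTEGER NOT NULL,
    wl_title     TEXT    NOT NULL,
    wl_expiry    TEXT    NULL
);
"""


def active_entries(db, user_id, now):
    """Select only the entries of one user that have not yet expired."""
    rows = db.execute(
        "SELECT wl_title FROM watchlist"
        " WHERE wl_user = ? AND (wl_expiry IS NULL OR wl_expiry > ?)"
        " ORDER BY wl_title",
        (user_id, now),
    )
    return [title for (title,) in rows]


def purge_expired(db, now):
    """What a maintenance script might do: delete rows whose expiry passed."""
    cur = db.execute(
        "DELETE FROM watchlist WHERE wl_expiry IS NOT NULL AND wl_expiry <= ?",
        (now,),
    )
    return cur.rowcount


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.executemany(
    "INSERT INTO watchlist VALUES (?, ?, ?, ?)",
    [
        (1, 0, "Main_Page", None),                        # watched forever
        (1, 0, "Recently_vandalised", "20160210000000"),  # still active
        (1, 3, "Some_user", "20160101000000"),            # already expired
    ],
)
```

Because the select already filters on the expiry, the list a user sees is the same before and after the purge; the purge only keeps the table from accumulating dead rows.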

Backend Refactoring

The WatchedItem class currently contains methods such as doDuplicateEntries, duplicateEntries, removeWatch, addWatch, batchAddWatch, resetNotificationTimestamp & load all of which do not belong in this class. They should be moved to a WatchedItemStore or something similar.
Looking at the usage of these methods, if they were moved the only extension that would need updating would be Flow, which uses the duplicateEntries, removeWatch & addWatch methods in production code and tests.

Methods could then be added to this store such as loadWatchedItemsForUser, loadUnwatchedItems and perhaps loadUsersWatchingPage which would remove the spread of SQL that would need to be touched by this proposal.
The SQL that would need to change is currently distributed through the following classes: InfoAction, ApiQueryInfo, ApiQueryUserInfo, ApiQueryWatchlist, ApiQueryWatchlistRaw, EmailNotification, SpecialEditWatchlist, SpecialRecentchanges, SpecialRecentchangeslinked, SpecialUnwatchedpages, SpecialWatchlist.
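
To make the idea of a single store more concrete, here is a minimal sketch in Python with SQLite. It is not the actual WatchedItemStore design: the method names are snake_case renderings of the ones suggested above, and the table layout is simplified for the example.

```python
import sqlite3


class WatchedItemStore:
    """Single place that talks to the watchlist table (illustrative only)."""

    def __init__(self, db):
        self.db = db
        db.execute(
            "CREATE TABLE IF NOT EXISTS watchlist ("
            " wl_user INTEGER NOT NULL,"
            " wl_namespace INTEGER NOT NULL,"
            " wl_title TEXT NOT NULL,"
            " PRIMARY KEY (wl_user, wl_namespace, wl_title))"
        )

    def add_watch(self, user_id, namespace, title):
        # INSERT OR IGNORE keeps repeated watches of the same page idempotent.
        self.db.execute(
            "INSERT OR IGNORE INTO watchlist VALUES (?, ?, ?)",
            (user_id, namespace, title),
        )

    def remove_watch(self, user_id, namespace, title):
        # Returns True only if a row was actually removed.
        cur = self.db.execute(
            "DELETE FROM watchlist"
            " WHERE wl_user = ? AND wl_namespace = ? AND wl_title = ?",
            (user_id, namespace, title),
        )
        return cur.rowcount > 0

    def load_watched_items_for_user(self, user_id):
        rows = self.db.execute(
            "SELECT wl_namespace, wl_title FROM watchlist"
            " WHERE wl_user = ? ORDER BY wl_namespace, wl_title",
            (user_id,),
        )
        return list(rows)

    def load_users_watching_page(self, namespace, title):
        rows = self.db.execute(
            "SELECT wl_user FROM watchlist"
            " WHERE wl_namespace = ? AND wl_title = ? ORDER BY wl_user",
            (namespace, title),
        )
        return [user for (user,) in rows]


store = WatchedItemStore(sqlite3.connect(":memory:"))
store.add_watch(1, 0, "Main_Page")
store.add_watch(2, 0, "Main_Page")
store.add_watch(1, 0, "Main_Page")  # duplicate, silently ignored
```

With all queries behind one class, a later change such as an expiry condition would only need to be made here rather than in each special page and API module.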

Applications / User stories

Expiring watchlist entries could be useful for the following reasons:

  • Watch a talk page of a user that you message for a response for a limited time
  • Watch a page for a specified amount of time after a page protection expires
  • Watch a page for a short amount of time after reverting vandalism on the page.
  • Watch a time boxed discussion page for the length of the timebox

Considerations

  • It might make sense to refactor access to watchlist items before or potentially after trying to implement this with the goal of having a single location that makes calls to the watchlist table (currently queries are spread between multiple locations)
  • The expiry field of the watchlist table may need an index
  • As mentioned in T8964 to cover further expansion to the watchlist table a more general properties field may be preferable, although this would likely mean selecting non expired watchlist items would be harder
  • It might make sense to also allow adjusting the expiry date of watchlist entry in the raw watchlist editing mode (Special:EditWatchlist/raw). One possible way to do this would be expanding the raw watchlist format so that each row could also include a number of days to watch the page. Such a change should be backwards-compatible so that importing and exporting of already existing watchlist would not require any actions from the user.
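
One way the backwards-compatible raw format from the last point could look is sketched below in Python. The "Title|days" syntax is purely an assumption for illustration (the pipe is chosen because it is not a legal character in page titles); lines without it are treated exactly like today's raw watchlist lines.

```python
def parse_raw_watchlist(text):
    """Parse raw watchlist text into (title, days) pairs.

    days is None for entries that should never expire, which is also
    what every line of an existing (old-format) export yields.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        title, sep, days = line.partition("|")
        title = title.strip()
        days = days.strip()
        if not title:
            continue
        if sep and days.isdecimal() and int(days) > 0:
            entries.append((title, int(days)))
        else:
            # Old-format line, or an unusable suffix: watch indefinitely.
            entries.append((title, None))
    return entries


def format_raw_watchlist(entries):
    """Inverse of parse_raw_watchlist; unlimited entries stay plain titles."""
    lines = []
    for title, days in entries:
        lines.append(title if days is None else f"{title}|{days}")
    return "\n".join(lines)
```

An export produced before the change contains only plain titles, so it round-trips unchanged and the user does not need to do anything.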

Key Questions

  • Should the expiry date be its own field or should a properties field be introduced?
  • Is a more automatic way of keeping the watchlist table clean needed or will a maintenance script do?
  • Will indexes be needed on the expiry column in order for this to scale?
  • Should the refactoring talked about actually happen and if so should it happen before or after the changes?
  • Should custom expiry dates be considered in the import/export process (using Special:EditWatchlist/raw), or should all imported and exported watchlist entries always have unlimited watching duration?

See also

After initial implementation

  • Api modules that output info about watched items such as ApiQueryInfo, ApiQueryWatchlist & ApiQueryWatchlistRaw should display watchlist entry expiry times
  • Some sort of automatic removal of expired items may be desired
daniel moved this task from Inbox to Request IRC meeting on the ArchCom-RfC board.Jan 27 2016, 9:54 PM

Hi @Addshore, ArchCom is eager to resolve the key questions. We've tentatively scheduled an IRC discussion to cover your key questions. E138: Expiring watch list entries (RFC Meeting, 2016-02-03)

Vituzzu added a subscriber: Vituzzu.Feb 1 2016, 2:37 PM

Couldn't this be better achieved by adding a created field and managing the expiry elsewhere e.g. in a user preference?

A created date would be useful for other things especially when you start thinking about the user lists rfc (Requests for comment/Support for user-specific page lists in core) an expiry date field only has one use case...

Nemo_bis added a subscriber: Nemo_bis.EditedFeb 1 2016, 3:42 PM

to cover further expansion to the watchlist table a more general properties field may be preferable

Strong +1. It should be a generic field a bit like log_search, but in a format simple enough to be modified via a simple text area like Special:EditWatchlist/raw.

An obvious next candidate for usage of such a field would be T2738, which in its most basic solution needs storage of a string per watched page (the header name).

Couldn't this be better achieved by adding a created field and managing the expiry elsewhere e.g. in a user preference?

No. The feature request that people have demanded for years is to be able to expire *some* watched pages after a while, not an unconditional expiry.

daniel added a comment.Feb 1 2016, 3:42 PM

@Jdlrobson the idea is not to have the same expiry for all pages watched by one user. The default watch duration might be a preference. But for any user it should at least be possible to switch between "forever" and some limited period of time on demand, on a page by page basis.

daniel added a comment.Feb 1 2016, 3:45 PM

@Nemo_bis I don't understand that proposal. If we add a general purpose field, it can only be used for one thing per entry, right? Or do you imagine you could put multiple values into a single field? You can do that, but then you can no longer use it for efficient searching/filtering. Well, you can for word-by-word full text matches, but for the expiry date, we need to match anything smaller than a given value. That's only possible with a dedicated field.
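
The difference can be illustrated with a small Python/SQLite sketch. Everything here is assumed for the example: the column names wl_expiry and wl_props, and JSON as the blob format. The point is only that a dedicated column lets the database do the "smaller than a given value" comparison, while a packed field has to be decoded row by row.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE watchlist ("
    " wl_title TEXT PRIMARY KEY,"
    " wl_expiry TEXT NULL,"    # dedicated column (assumed name)
    " wl_props TEXT NOT NULL"  # generic properties blob, JSON here (assumed)
    ")"
)
db.execute("CREATE INDEX wl_expiry_idx ON watchlist (wl_expiry)")

rows = [("A", "20160101000000"), ("B", "20160301000000"), ("C", None)]
for title, expiry in rows:
    props = json.dumps({"expiry": expiry} if expiry else {})
    db.execute("INSERT INTO watchlist VALUES (?, ?, ?)", (title, expiry, props))


def expired_by_column(now):
    # A plain range comparison that the database can evaluate itself.
    cur = db.execute(
        "SELECT wl_title FROM watchlist WHERE wl_expiry < ? ORDER BY wl_title",
        (now,),
    )
    return [title for (title,) in cur]


def expired_by_blob(now):
    # Every row has to be fetched and decoded before it can be compared.
    scanned, hits = 0, []
    for title, props in db.execute("SELECT wl_title, wl_props FROM watchlist"):
        scanned += 1
        expiry = json.loads(props).get("expiry")
        if expiry is not None and expiry < now:
            hits.append(title)
    return sorted(hits), scanned
```

Both functions return the same titles, but the blob variant reports that it had to touch every row in the table to get there.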

@Nemo_bis after a short discussion with @daniel earlier this implementation may end up needing both an expiry field and a more general properties field (maybe in the form of a blob..)

The timestamp expiry field will probably stay to enable efficient removal of expired entries.

As we start to think about more user interaction a blob to store the initial length of the expiry for example may be useful in the case that on an edit of a watched page the user would want to reset the time the item will continue to be watched for.
Of course there are also many other use cases of such a blobby field

Luke081515 added a subscriber: Luke081515.

When a watched page is changed, and the watcher has enabled e-mail notifications, they are informed that no more notifications will be sent to them unless they visit the page while logged in. They may forget to do that. If it is possible to fix the problem by allowing users to note, generally or on a per-watchlist-entry basis, that they still want to be notified regardless, or after some time, that would be helpful. Since the problem looks similar on the surface, it might be possible to fix it in one go. Otherwise, call it a request of its own.

Anomie added a subscriber: Anomie.Feb 1 2016, 5:52 PM

Backend Refactoring

I like the idea of backend refactoring. I'm glad https://gerrit.wikimedia.org/r/#/c/266524/ seems to have moved away from trying to use TitleValue.

Should the expiry date be its own field or should a properties field be introduced?

For expiry to actually be workable, I can't see how we'd be able to not have a dedicated field for it. A generic "properties" field is ok for things that aren't going to be used as filters in the SQL query, but expiry does need to be filterable, mainly so that it doesn't take a full table scan to clean up expired rows or (lacking cleanup) so that the watchlist+recentchanges join doesn't have to select thousands of expired rows and filter them client-side.

Is a more automatic way of keeping the watchlist table clean needed or will a maintenance script do?

Does MediaWiki have built-in periodic jobs, or would people need to wire the maintenance script up to cron to avoid cruft building up?

Will indexes be needed on the expiry column in order for this to scale?

I'd think so. Chances are the maintenance script will need one to be able to do its job halfway efficiently, and depending on how many expired entries are allowed to build up it might not hurt to add the field to some of the other indexes that are being used by the modified queries.

Should the refactoring talked about actually happen and if so should it happen before or after the changes?

Probably, yes. I'd go for before.

Should custom expiry dates be considered in the import/export process (using Special:EditWatchlist/raw), or should all imported and exported watchlist entries always have unlimited watching duration?

Ideally yes, but that might take some extra thought to do well considering the existing interface on that special page.

So the latest patch set on https://gerrit.wikimedia.org/r/#/c/245881/ takes care of many things discussed so far: indexes have been added and a job has been created for purging old entries.

As for refactoring, I have started on that separately. It always makes sense to see code, so:

The second step will break a few extensions in very minor ways, see: https://gerrit.wikimedia.org/r/#/c/267682/ https://gerrit.wikimedia.org/r/#/c/267259/ https://gerrit.wikimedia.org/r/#/c/267259/
And that would, as far as I can tell, be the only changes needed to extensions in Gerrit. Also migration for other users not in Gerrit will be trivial.

Further refactoring would move more logic and DB stuff out of special pages and API modules into this WatchedItemStore thing in a similar style to https://gerrit.wikimedia.org/r/#/c/266572/

@Jdlrobson the idea is not to have the same expiry for all pages watched by one user. The default watch duration might be a preference. But for any user it should at least be possible to switch between "forever" and some limited period of time on demand, on a page by page basis.

I think this assumption is made on the fact that currently it is only possible to have one watchlist. In an ideal world I'd be able to define multiple lists with different expiry times. I think having expiry dates on a per page basis is wasteful and would much prefer to see a last modified/created by timestamp on those entries which has far more value and enables many possible use cases. I feel very strongly about this.

This should probably be opened as another ticket

Deskana added a subscriber: Deskana.Feb 1 2016, 6:46 PM

@Addshore Could you (or someone else) add some user stories for this task? The proposal is quite clear in explaining the requested functionality, but what is not clear is the why. It would be good if you could expand on this. Thank you!

The general applications for the expiring watch list entries are covered in the Applications section in the description of this task.
If you would like more formal stories I could probably clean them up (and possibly hunt down a few more)

I'm sorry, I missed that section! I guess my eyes glazed over when I saw the backend refactoring section. :-p

I think what you've written here is probably sufficient to communicate your point. Thanks. :-)

I'm pretty annoyed that T8964 was closed, particularly in such a way that I'm not subscribed to this task.

Applications

Expiring watchlist entries could be useful for the following reasons:

  • Watch a talk page of a user that you message for a response for a limited time

I think an expiry field is a really hackish way to implement this functionality. We already have Echo notifications so that you can ping a user. We should one day have a proper discussion system such as Flow. As a user, I care about seeing/watching replies to a discussion. These replies might come in a week or in a month, but I still want to see the replies.

  • Watch a page for a specified amount of time after a page protection expires
  • Watch a page for a short amount of time after reverting vandalism on the page.
  • Watch a time boxed discussion page for the length of the timebox

I'm not sure about using an expiry timestamp field as a solution to these problems. A lot of use-cases would be solved by knowing:

  • when a page was added to a list; and
  • why that page was added to the list.

When is solved not by storing expiry time, but by storing time of insertion into the list.

Why is probably solved by a semi-automated tagging or keywords system of some kind. Knowing that I'm watching a page because I hit rollback on it versus knowing that I'm watching a page because I added it via a raw watchlist edit versus knowing that I'm watching a page because I protected it in August 2013 is useful info that lets me, as the user, manage my watchlist. A variable expiration date on watchlist entries, by comparison, is a lot less flexible/useful.

I agree with @Jdlrobson that watchlists should be treated as a subset of a larger page lists system. User watchlists are a particular kind of page list: read-restricted and write-restricted. We need to move toward infinity in the zero–one–infinity paradigm so that we can support as many page lists as the user wants.

Architecturally, for MediaWiki core, if we're going to enhance and expand watchlists, I think the focus should be on refactoring the back-end to support more lists.

Most of the usability concerns here would probably be addressed by making watchlist items easier to remove. This is essentially what Facebook does in its feeds (there's a cog or arrow next to each item that lets you unsubscribe or unfollow or block or whatever). Watchlists likely need similar functionality.

There are also lessons to be learned from sites such as LinkedIn, where you have a network of people. LinkedIn stores when the connection was made and allows you to annotate the connection with private notes. We basically want equivalent functionality for MediaWiki page lists.

I worry quite a bit about building a sane UI for this. The watchlist system is already dreadfully over-loaded and poorly integrated with different bits, and this would not make it better…

I'm pretty annoyed that T8964 was closed, particularly in such a way that I'm not subscribed to this task.

It looks like you were subscribed to the task it was closed as a duplicate of however (T100508)

I think an expiry field is a really hackish way to implement this functionality.

As far as I can tell an expiry field is in-fact the only sane way to implement the functionality desired in the wish.

I agree with @Jdlrobson that watchlists should be treated as a subset of a larger page lists system. User watchlists are a particular kind of page list: read-restricted and write-restricted. We need to move toward infinity in the zero–one–infinity paradigm so that we can support as many page lists as the user wants.

It sounds like you are talking about something along the lines of T3492 and T9467

There are also lessons to be learned from sites such as LinkedIn, where you have a network of people. LinkedIn stores when the connection was made and allows you to annotate the connection with private notes. We basically want equivalent functionality for MediaWiki page lists.

Are there any tickets open about annotating watch lists?

Of course, adding an expiry field now would not stop future work on a params field to store arbitrary annotations / tags / creation dates.
It also doesn't feel as if it would get in the way of any work on multiple page lists or public page lists.

@Jdlrobson you wrote

In an ideal world I'd be able to define multiple lists with different expiry times.

On the database side, having multiple watchlists means adding a "list name" field to the watchlist table. That's orthogonal to adding an expiry column, and the two combine naturally. I don't see how the number of watchlists impacts the question of if and how we manage automatic expiry.

I think having expiry dates on a per page basis is wasteful

Even if you define the expiry period per watchlist, you still need to track the expiry timestamp for each page. So on the database side, it doesn't matter when and where you define for how long a page should be watched. That's purely a UI question.

and would much prefer to see a last modified/created by timestamp on those entries which has far more value and enables many possible use cases.

Last modified is easy, it's already there. Created is harder, but could be done easily enough by writing page creations to the logging table (why don't we?). No extra stuff in the DB needed, and you get the user, the timestamp, and any metadata you want. But all that is completely unrelated to the watchlist. File a feature request ;)

Addshore updated the task description. (Show Details)Feb 2 2016, 12:26 PM

From https://tools.wmflabs.org/meetbot/wikimedia-office/2016/wikimedia-office.2016-02-03-22.00.log.html.

Architecturally, for MediaWiki core, if we're going to enhance and expand watchlists, I think the focus should be on refactoring the back-end to support more lists.

@Platonides seems to agree, saying "while changing the watchlist, it may make sense to fully remake it and allow multiple watchlists."

Most of the usability concerns here would probably be addressed by making watchlist items easier to remove. This is essentially what Facebook does in its feeds (there's a cog or arrow next to each item that lets you unsubscribe or unfollow or block or whatever). Watchlists likely need similar functionality.

@daniel seems to agree, saying "i like the 'magic is bad' argument. pretty convincing. stuff shouldn't just vanish. it should just be easier to clean up."

Should this ticket now be closed? (As this is just the RFC)?

@Addshore no reason to close this after the first feedback session. I expect there will be another round. I'll move the ticket to "under discussion" on the board.

Addshore added a comment.EditedFeb 4 2016, 4:01 PM

Okay, so modifications to the proposal, and a possible order of attack:

  1. Refactoring...
  2. T125990 Add wl_id to the watchlist table, it looks like this will be useful in all possible outcomes
  3. T125991 Add wl_timestamp to the watchlist table. There have been many requests for this (or something similar to this) and this will be needed with the eventual goal of easy expiring of watchlist items.
    • It should be decided if this should be a timestamp of when the item was added to the watchlist or if this is the timestamp the watchlist item was last watched / touched? This will be important when tags are possible? When I add a tag should the timestamp stay the same or change? Page moves should also be considered.
  4. Add a watchlist properties table with the fields (wl_id, wlp_property, wlp_value). This would be used to store tags and possibly expiry times. It may also make sense to have a separate table solely for holding expiry times when set, so that the watchlist properties table could become purely a tags table.
    • Due to the possible size of this table (for example, there are 139 million watchlist items on enwiki), the number of tags per item should be restricted (perhaps 3 or 4).
    • The string length of a tag should also be limited.
    • The case sensitivity of tags in general should be decided. Alternatively, a list of predefined tags could be implemented, or a user could have to define a tag before using it (I think these last 2 options are a bad idea).
    • As said, it would be possible to store expiry times in this table. (We can call them whatever we want, but right now "expiry" still seems right.) As both an expiry and a tag have a limited length, the value field can be a VARCHAR and thus well indexed, and the PK for the table can be made of all 3 fields.
  5. Once all of the above has been done this can be exposed to users in a slow and controlled way.
    • With expiring watch list entries this does not necessarily have to be automatic expiry any more. Instead, a watchlist option should be "only show me things in my watchlist that have not passed the date I said to hide them". A user would also be provided with an easy way to remove anything that had passed this date. In my opinion this then still hits all stories linked to T100508.
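
A rough sketch of the properties table and the "hide, then remove on request" behaviour described in the list above, in Python with SQLite. The table name watchlist_props, the property names 'expiry' and 'tag', the VARCHAR lengths and both helper functions are assumptions for illustration; only the three field names come from the list.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE watchlist (
    wl_id    INTEGER PRIMARY KEY,
    wl_user  INTEGER NOT NULL,
    wl_title TEXT    NOT NULL
);
CREATE TABLE watchlist_props (
    wl_id        INTEGER      NOT NULL,
    wlp_property VARCHAR(32)  NOT NULL,
    wlp_value    VARCHAR(255) NOT NULL,
    PRIMARY KEY (wl_id, wlp_property, wlp_value)
);
""")
db.executemany(
    "INSERT INTO watchlist VALUES (?, ?, ?)",
    [(1, 7, "Main_Page"), (2, 7, "Vandalised_page"), (3, 7, "Old_discussion")],
)
db.executemany(
    "INSERT INTO watchlist_props VALUES (?, ?, ?)",
    [
        (2, "tag", "rollback"),
        (2, "expiry", "20160210000000"),
        (3, "expiry", "20160101000000"),
    ],
)


def visible_items(user_id, now):
    """Items with no 'expiry' property, or one whose date has not passed."""
    cur = db.execute(
        "SELECT w.wl_title FROM watchlist w"
        " LEFT JOIN watchlist_props p"
        "   ON p.wl_id = w.wl_id AND p.wlp_property = 'expiry'"
        " WHERE w.wl_user = ?"
        "   AND (p.wlp_value IS NULL OR p.wlp_value > ?)"
        " ORDER BY w.wl_id",
        (user_id, now),
    )
    return [title for (title,) in cur]


def remove_passed_items(user_id, now):
    """Explicit, user-triggered clean-up of everything past its date."""
    ids = [wl_id for (wl_id,) in db.execute(
        "SELECT w.wl_id FROM watchlist w JOIN watchlist_props p"
        "   ON p.wl_id = w.wl_id"
        " WHERE w.wl_user = ? AND p.wlp_property = 'expiry'"
        "   AND p.wlp_value <= ?",
        (user_id, now),
    )]
    for wl_id in ids:
        db.execute("DELETE FROM watchlist_props WHERE wl_id = ?", (wl_id,))
        db.execute("DELETE FROM watchlist WHERE wl_id = ?", (wl_id,))
    return len(ids)
```

Nothing disappears on its own in this variant: passed items are merely hidden by the view option until the user asks for them to be removed. Note that with a primary key over all three fields an item could carry several 'expiry' rows, which a real implementation would have to rule out or handle.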

This touches or fixes issues including T8964, T67187, T6354

In the far future what is discussed in T1352, T3492 would still be possible.
Of course with tags it would be technically possible to create multiple lists from a single watchlist (filter on tags).
Thus public lists etc may be considered a separate feature.

As there was a lot of push back from the main goal and usecase described in the initial RFC which is covered by the stories linked to T100508 it may also be possible to do the expiry things in an extension!

Anomie added a comment.Feb 4 2016, 4:26 PM
  • Due to the possible size of this table with for example 139 million watchlist items in enwiki the number of tags per item should be restricted (perhaps 3 or 4)

That seems a bit too restrictive, IMO.

In the far future what is discussed in T1352, T3492 would still be possible.
Of course with tags it would be technically possible to create multiple lists from a single watchlist (filter on tags).

I wonder whether tags would really be the best implementation of that, versus actually having multiple lists. I'd think that selecting the list IDs and then the items in those lists would be more efficient than selecting everything in the one list and filtering by tags; the disadvantage would be that having a page in multiple lists means recording the page namespace and title multiple times in the list-items table. Having separate props per list (if props are still useful) could be an advantage or a disadvantage.

daniel added a comment.Feb 4 2016, 6:23 PM

@Anomie do you mean to use separate database tables for the different lists? So all users would have the same lists available?

It would also make creating a new watchlist a database level maintenance job.

Anomie added a comment.Feb 4 2016, 6:41 PM

@Anomie do you mean to use separate database tables for the different lists? So all users would have the same lists available?

It would also make creating a new watchlist a database level maintenance job.

No, I was meaning a table of lists with columns for owner, list name, and so on,[1] and then the (probably differently named) watchlist table would refer to that table's ID instead of having a wl_user column.

[1]: I don't know whether it would be better to have a column in the lists table to indicate "this is the user's default watchlist", and other role-lists that people might come up with in the future, or if we'd want to store that separately so one list could have multiple roles.
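
As a sketch of what that could look like, again in Python with SQLite: all table and column names below (page_list, page_list_item, pl_*, pli_*) are made up for the example, and the "default watchlist" flag is modelled as a column, which is only one of the two options mentioned in the footnote.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE page_list (
    pl_id         INTEGER PRIMARY KEY,
    pl_owner      INTEGER NOT NULL,
    pl_name       TEXT    NOT NULL,
    pl_is_default INTEGER NOT NULL DEFAULT 0,
    UNIQUE (pl_owner, pl_name)
);
-- Replaces wl_user with a reference to the owning list.
CREATE TABLE page_list_item (
    pli_list      INTEGER NOT NULL REFERENCES page_list (pl_id),
    pli_namespace INTEGER NOT NULL,
    pli_title     TEXT    NOT NULL,
    PRIMARY KEY (pli_list, pli_namespace, pli_title)
);
""")
db.executemany(
    "INSERT INTO page_list VALUES (?, ?, ?, ?)",
    [(1, 7, "watchlist", 1), (2, 7, "vandalism", 0)],
)
db.executemany(
    "INSERT INTO page_list_item VALUES (?, ?, ?)",
    [(1, 0, "Main_Page"), (2, 0, "Main_Page"), (2, 0, "Some_article")],
)


def list_ids(owner, name=None):
    """First step: find the IDs of the owner's lists (optionally one list)."""
    sql = "SELECT pl_id FROM page_list WHERE pl_owner = ?"
    args = [owner]
    if name is not None:
        sql += " AND pl_name = ?"
        args.append(name)
    return [pl_id for (pl_id,) in db.execute(sql + " ORDER BY pl_id", args)]


def items_in_lists(ids):
    """Second step: fetch the distinct pages contained in those lists."""
    if not ids:
        return []
    marks = ",".join("?" * len(ids))
    cur = db.execute(
        "SELECT DISTINCT pli_namespace, pli_title FROM page_list_item"
        f" WHERE pli_list IN ({marks})"
        " ORDER BY pli_namespace, pli_title",
        ids,
    )
    return list(cur)
```

The example data also shows the disadvantage mentioned earlier: Main_Page sits in two lists, so its namespace and title are stored twice.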

Per my mail to wikitech-l ("Using assignees for RFC shepherd"), I'm going to be bold and assign this to Daniel as the shepherd for this task. @daniel, thanks for moving it to the proper column on the board!

Addshore added a subscriber: jcrespo.EditedFeb 5 2016, 4:19 PM

I have created T125990 and T125991 which I extracted from my comment above to be actively worked toward right now for adding an id field and a timestamp field to the watchlist table.

My main question right now is: what are people's thoughts on having a single table called something like watchlist_properties, or two tables, one called watchlist_tags and one called watchlist_expiries, as discussed in point 4 of T124752#1998193?
Having separate tables would no doubt decrease the DB stress when for example asking for all items tagged with X or asking for all items I wanted to get rid of after Y, and I don't really see this introducing much complexity.
@jcrespo, I saw you in the meeting but not in this ticket!

In the far future what is discussed in T1352, T3492 would still be possible.
Of course with tags it would be technically possible to create multiple lists from a single watchlist (filter on tags).

I wonder whether tags would really be the best implementation of that, versus actually having multiple lists. I'd think that selecting the list IDs and then the items in those lists would be more efficient than selecting everything in the one list and filtering by tags; the disadvantage would be that having a page in multiple lists means recording the page namespace and title multiple times in the list-items table. Having separate props per list (if props are still useful) could be an advantage or a disadvantage.

Indeed, we need not allow users to be able to filter by tags, but allowing them to do so allows easy cleaning of a watchlist. ie. removing all things in the list that were added over a year ago that were automatically added.
Tags and multiple page lists I expect can work side by side as with many of the other things discussed earlier.

No, I was meaning a table of lists with columns for owner, list name, and so on,[1] and then the (probably differently named) watchlist table would refer to that table's ID instead of having a wl_user column.

[1]: I don't know whether it would be better to have a column in the lists table to indicate "this is the user's default watchlist", and other role-lists that people might come up with in the future, or if we'd want to store that separately so one list could have multiple roles.

Again I have no doubt this will be possible after other things have been implemented.
I'm not trying to suggest that tags for watchlist items replace the idea of multiple lists of pages.
We are after all trying first to solve T8964 & T100508 here not T3492 (although we should make sure we are not stopping the idea in any way)

Per my mail to wikitech-l ("Using assignees for RFC shepherd"), I'm going to be bold and assign this to Daniel as the shepherd for this task. @daniel, thanks for moving it to the proper column on the board!

Thanks!

I'm a bit worried that this effort has drifted significantly from the original use case. In order to expire watchlist items, all we need is a timestamp field and an expiration field. I'm sure that watchlist_props (or watchlist_tags) is a great idea that will solve lots of bugs and feature requests, but I don't think it has anything to do with this RFC, the original bug, or the Community Wishlist request. Why would the obvious and simple solution not be the correct one here?

I do see what you mean.
I think in the initial RFC discussion the general feeling was that users did not actually want to set an expiry date for individual watchlist items but instead just wanted it to be easier to remove old items and keep a watchlist in check.

After more thinking and discussion with people at the hackathon I think the long term route forward should roughly be this, and of course all comments welcome.

  1. wl_id (already in progress)
  2. Refactoring (already in progress)
  3. watchlist_props table as described in T124752#1998193 (which would allow us to store an expiry time stamp). The benefit of having this in a separate table is that not all watch list items would have an expiry.
  4. watchlist_tags table (I feel this is better than multiple watchlists but essentially allows you to group watched items into separate lists).
  5. wl_timestamp field on the watchlist table (for cases where people don't want to set an expiry and instead want to remove all items that have been in their watchlist for X weeks)
  1. wl_id (already in progress)
  2. Refactoring (already in progress)
  3. watchlist_props table as described in T124752#1998193 (which would allow us to store an expiry time stamp). The benefit of having this in a separate table is that not all watch list items would have an expiry.

See T129486. This is IMO not blocked on anything right now. So we could go on with that, right?

  4. watchlist_tags table (I feel this is better than multiple watchlists but essentially allows you to group watched items into separate lists).
  5. wl_timestamp field on the watchlist table (for cases where people don't want to set an expiry and instead want to remove all items that have been in their watchlist for X weeks)

See T125991.

  3. watchlist_props table as described in T124752#1998193 (which would allow us to store an expiry time stamp). The benefit of having this in a separate table is that not all watch list items would have an expiry.

See T129486. This is IMO not blocked on anything right now. So we could go on with that, right?

Indeed

@Tobi_WMDE_SW @Addshore: I think this has to be discussed with product people at the WMF. One major concern that was raised during the discussion is that "expiry" is a concept we only have for blocks and page protection. It's problematic to have something happen to a user's watchlist without notice: things just disappear with no trace.

Additionally, it's unclear when, where, and how expiry for individual watchlist items can be defined or edited. The "bulk edit" mode for watchlists is particularly problematic.

Basically, the technical side of making watchlist entries expire is not hard. But we don't have good user stories for some of the less obvious use cases. And UI and UX are largely undefined. It seems to me like we first need to re-iterate with product managers and the community, decide on stories and requirements, and then decide on the technical solution.

@daniel please see T100508#2014479 for a clear user story, simple UI and UX idea without the expiry issue you are worried about.

It seems to me like we first need to re-iterate with product managers and the community, decide on stories and requirements, and then decide on the technical solution.

Sounds great.

The user story at T100508#2014479 is just one specific case (and I don't believe it is the most common case). It's a good starting point, but we need more user stories and more UI ideas.

Here is the user story that I would personally like to see supported:
As a vandalism fixer, I would like to add a page to my watchlist for 1 week immediately after I have reverted vandalism to it in order to make sure a page is not re-vandalized. I do not, however, have any long-term interest in the page. I would like to be able to do this without leaving the article itself, i.e. without visiting my watchlist page or watchlist editing interface.

And with all of the other stories that have been requested throughout the discussion of this RFC, all of the points in T124752#2209149 are needed.

  1. wl_id field - Being able to efficiently clear / maintain a watchlist
  2. watchlist_props table - Being able to expire an item after a given amount of time
  3. watchlist_tags table - Multiple watchlists / being able to tag items
  4. watchlist_timestamp field - Being able to remove really old items from a watchlist
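
As a rough illustration of how points 1 and 2 could fit together, here is a minimal sketch using Python's sqlite3 and a simplified schema. Only the names watchlist, wl_id and watchlist_props come from this discussion. The column names (wl_user, wl_title, wlp_wl_id, wlp_propname, wlp_value), the 'expiry' property name and the helper active_titles are assumptions made for the example, not the proposed MediaWiki schema.

```python
import sqlite3

# Hypothetical, simplified schema: all column names are assumptions.
SCHEMA = """
CREATE TABLE watchlist (
    wl_id    INTEGER PRIMARY KEY,
    wl_user  INTEGER NOT NULL,
    wl_title TEXT    NOT NULL
);
CREATE TABLE watchlist_props (
    wlp_wl_id    INTEGER NOT NULL REFERENCES watchlist(wl_id),
    wlp_propname TEXT    NOT NULL,
    wlp_value    TEXT    NOT NULL,
    PRIMARY KEY (wlp_wl_id, wlp_propname)
);
"""


def active_titles(conn, user_id, now):
    """Return titles the user watches whose expiry is absent or still in the future."""
    rows = conn.execute(
        """
        SELECT w.wl_title
        FROM watchlist AS w
        LEFT JOIN watchlist_props AS p
               ON p.wlp_wl_id = w.wl_id AND p.wlp_propname = 'expiry'
        WHERE w.wl_user = ?
          AND (p.wlp_value IS NULL OR p.wlp_value > ?)
        ORDER BY w.wl_title
        """,
        (user_id, now),
    ).fetchall()
    return [title for (title,) in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO watchlist VALUES (?, ?, ?)",
    [(1, 7, "Permanent_page"), (2, 7, "Reverted_page"), (3, 7, "Old_page")],
)
# 14-digit YYYYMMDDHHMMSS timestamps compare correctly as plain text.
# Only rows that actually have an expiry get a props row.
conn.executemany(
    "INSERT INTO watchlist_props VALUES (?, 'expiry', ?)",
    [(2, "20160425000000"), (3, "20160401000000")],
)

print(active_titles(conn, 7, "20160418000000"))
```

Because the filter is applied when the watchlist is read, an expired row simply stops showing up; deleting it can happen later.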

"tags" don't exactly give you multiple watchlists, though. They just let you filter your one watchlist. Consider these multiple-watchlist stories, for example:

  • A user wants to put page Example onto their "Actively watch for repeat vandalism" list with expiry one week, "Check occasionally" with expiry two months, and "I should read this someday" with no expiry.
  • A user has had page Example on their "I should read this someday" list for years. Then they add it to their "New research project" list today, because it's relevant to the new project they're starting. They want the date-added timestamp to be correct for each list.

Indeed, so:

  1. add a table containing list information (id, name, creation_date, etc.) and add a list_id field to the watchlist table.
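
A minimal sketch of what that extra table might look like, again in Python with sqlite3. The table name watchlist_list, the columns wl_list_id and wl_expiry, and the helper memberships are assumptions for illustration. For brevity the expiry is a nullable column here rather than a watchlist_props row, and the user column is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE watchlist_list (
    list_id       INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    creation_date TEXT NOT NULL
);
CREATE TABLE watchlist (
    wl_id        INTEGER PRIMARY KEY,
    wl_list_id   INTEGER NOT NULL REFERENCES watchlist_list(list_id),
    wl_title     TEXT NOT NULL,
    wl_timestamp TEXT NOT NULL,   -- when the page was added to this list
    wl_expiry    TEXT,            -- NULL means the entry never expires
    UNIQUE (wl_list_id, wl_title)
);
""")

conn.executemany(
    "INSERT INTO watchlist_list VALUES (?, ?, ?)",
    [
        (1, "I should read this someday", "20120101000000"),
        (2, "New research project", "20160418000000"),
        (3, "Actively watch for repeat vandalism", "20160418000000"),
    ],
)
# The same page sits on three lists, each with its own date added and expiry.
conn.executemany(
    "INSERT INTO watchlist (wl_list_id, wl_title, wl_timestamp, wl_expiry) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, "Example", "20120101000000", None),
        (2, "Example", "20160418000000", None),
        (3, "Example", "20160418120000", "20160425120000"),
    ],
)


def memberships(conn, title):
    """Return (list name, date added, expiry) for every list containing the title."""
    return conn.execute(
        """
        SELECT l.name, w.wl_timestamp, w.wl_expiry
        FROM watchlist AS w
        JOIN watchlist_list AS l ON l.list_id = w.wl_list_id
        WHERE w.wl_title = ?
        ORDER BY w.wl_timestamp
        """,
        (title,),
    ).fetchall()


for name, added, expiry in memberships(conn, "Example"):
    print(name, added, expiry)
```

With one row per (list, page) pair, both stories above work: each list keeps its own date-added timestamp and its own expiry for the same page.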

But as @kaldari said we are really getting away from the point of this RFC / the wish!

DannyH updated the task description. Apr 18 2016, 11:29 PM

Yes, to clarify: this work is being done because of requests from the community, made in the 2014 German Community Wishlist and the 2015 WMF Community Wishlist. Here's the proposal from 2015, contributed by User:Derek Andrews:

"I would like to be able to set an expiry time for watchlist items, of say one week or one month. There are many pages that I do maintenance on or repair vandalism that I would like to watch for a brief period of time, but have no long term interest in. The UI I envisage would just have additional tick boxes: watch this page indefinitely, watch for one week; watch for one month."

The proposal got 18 endorsements and 55 support votes, making it the #12 most-supported wish in the Wishlist Survey. You can see the votes and enthusiastic comments here:

https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey/Watchlists#Watchlist_timed_expiry

So I'm puzzled by the idea that we need more/better user stories than the one already provided, discussed and overwhelmingly approved in both the German and WMF surveys.

Discussed @Addshore's proposed DB plan. @daniel brought up the point that we probably wouldn't need a watchlist_tags table since the watchlist_props table could also be used for tagging. This is similar to how the page_props table is commonly used for tagging with the value field just set to 1 or empty string and the propname field representing the tag.
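
To make the tagging idea concrete, here is a small sketch in Python with sqlite3. The column names, the "tag:" prefix used to keep tags apart from other properties, and the helpers add_tag and rows_with_tag are all assumptions for illustration; nothing here is the agreed schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: one (watchlist row id, property name, value) triple per row.
conn.execute("""
    CREATE TABLE watchlist_props (
        wlp_wl_id    INTEGER NOT NULL,
        wlp_propname TEXT    NOT NULL,
        wlp_value    TEXT    NOT NULL DEFAULT '',
        PRIMARY KEY (wlp_wl_id, wlp_propname)
    )
""")


def add_tag(conn, wl_id, tag):
    """Tag a watchlist row: the property name carries the tag, the value stays empty."""
    conn.execute(
        "INSERT OR IGNORE INTO watchlist_props (wlp_wl_id, wlp_propname) VALUES (?, ?)",
        (wl_id, "tag:" + tag),
    )


def rows_with_tag(conn, tag):
    """Return the ids of watchlist rows carrying the given tag."""
    cur = conn.execute(
        "SELECT wlp_wl_id FROM watchlist_props "
        "WHERE wlp_propname = ? ORDER BY wlp_wl_id",
        ("tag:" + tag,),
    )
    return [wl_id for (wl_id,) in cur]


add_tag(conn, 1, "vandalism-check")
add_tag(conn, 2, "vandalism-check")
add_tag(conn, 2, "to-read")
add_tag(conn, 2, "to-read")  # tagging the same row twice is a no-op

print(rows_with_tag(conn, "vandalism-check"))
print(rows_with_tag(conn, "to-read"))
```

Note that a tag stored this way filters a single watchlist row, so it would not by itself give a page a separate expiry or date added per tag.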

I just realized the minutes from the IRC meeting are not on this ticket, but hidden in E138. Here they are now:

  • '''Expiring watch list entries | RFC meeting | Please note: Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE) | Logs: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/''' (TimStarling, 22:01:04)
    • ''LINK:'' https://phabricator.wikimedia.org/T124752 (TimStarling, 22:01:11)
    • re: expiry date: addshore believes it's basically been answered, though a properties field might be useful further down the line (robla, 22:05:34)
    • question discussed: is this solving the problem at the right level of generality? (robla, 22:07:24)
    • question discussed: what sort of database/maintenance overhead does this impose? will this require a maintenance script? (robla, 22:12:47)
    • addshore> So, the way the expiry is done in the patch is taken from the protection api currently (which also has expiries) (robla, 22:25:41)
    • question discussed: how quickly does expiry-based watchlist purging need to happen? does the feature need to rely on purging to work? (robla, 22:28:02)
    • questions discussed: is full watchlist cleanup automation required? would tags be helpful? do many people add all pages they edit to their watchlists? (robla, 22:44:32)
    • question: do we want an expiration date, or a watched-since timestamp? (DanielK_WMDE, 22:45:58)
    • <addshore> I guess with a combination of watched since and maybe a tag of number of days to expiry would actually work for most of the main cases for the expiry field (DanielK_WMDE, 22:54:01)

The full log can be found here: https://tools.wmflabs.org/meetbot/wikimedia-office/2016/wikimedia-office.2016-02-03-22.00.log.html
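
The last point in the minutes (a watched-since timestamp combined with a number of days to expiry) can be sketched in a few lines of Python. The entry layout and the names is_expired and purge are hypothetical; purge only stands in for whatever a maintenance script would do.

```python
from datetime import datetime, timedelta

TS_FORMAT = "%Y%m%d%H%M%S"  # 14-digit timestamp, e.g. 20160203220000


def is_expired(watched_since, expiry_days, now):
    """Return True once `expiry_days` have passed since `watched_since`.

    An `expiry_days` of None means the entry never expires.
    """
    if expiry_days is None:
        return False
    started = datetime.strptime(watched_since, TS_FORMAT)
    current = datetime.strptime(now, TS_FORMAT)
    return current >= started + timedelta(days=expiry_days)


def purge(entries, now):
    """Drop expired entries; a stand-in for a periodic cleanup job."""
    return [e for e in entries if not is_expired(e["since"], e["days"], now)]


entries = [
    {"title": "Reverted_page", "since": "20160203220000", "days": 7},
    {"title": "Long_term_page", "since": "20100101000000", "days": None},
]

print([e["title"] for e in purge(entries, "20160209000000")])
print([e["title"] for e in purge(entries, "20160211000000")])
```

If reads apply the same is_expired check, the feature does not depend on how quickly the purge runs; the cleanup only keeps the table from growing.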

@DannyH @kaldari @Addshore @daniel @Bmueller I think we should add the notes and outcomes of our meeting from last Monday as well.