[SPIKE] How could we implement Watchlist categorisation, tagging, or multiple Watchlists? [16H]
Closed, ResolvedPublic

Description

User story: As an editor, I want to view recent edits to specific sets of pages of interest, so that I can focus my attention on different areas of my Wikimedia project.

In the parent ticket, and the various links and other tickets documented therein, users have expressed a desire to view recent changes to specific sets of pages. This is in contrast to viewing recent edits for all pages on their watchlist - pages might have been watched for different reasons at different times. The most obvious solution to this problem would be to allow users to optionally define and attach 'tags' or 'categories' to each page on their watchlist. A given page would be able to have multiple, or zero, tags. Users would then have some kind of filtering system for limiting their Watchlist view to edits on pages which have a certain tag.

We would like to know if this is technically feasible to achieve within the 3 months the Moderator Tools team plans to work on Watchlist. Are there any obvious roadblocks or technical issues that we might foresee? Are there limitations, or performance concerns?

Past discussions
The following are technical discussions that we could find which have taken place before, that may be a helpful reference:

  • Meeting notes at E138#1369 for the Watchlist Expiry project contain some discussion of watchlist tags
  • T124752#1998193 onwards

It would likely be beneficial to chat with some of the folks involved in those previous conversations, in addition to the Community-Tech team, who developed Watchlist Expiry, and DBAs.

Event Timeline

Thinking about this a bit more, I suppose one consideration here is whether a page can be in multiple categories / have multiple tags, or if it must be placed in one category. I can imagine it being desirable for a page to be in multiple categories, but I imagine this increases the technical complexity, and almost certainly increases the UI complexity.

I had a quick chat about this on Discord and we came to the quick conclusion that a page being in multiple categories is clearly beneficial, for example "Pages I created" and "Current events" could easily overlap.

I think this lends itself to a kind of tagging model, where users have one Watchlist, and the pages on it can optionally have tags that they filter on individually (this doesn't necessarily need to be an RCFilters filter, it could be tabs or any other design language). This might help with introducing this as a new feature - no one's watchlist needs to change by default, but users can then start adding tags to pages they're watching.

Tangential: noting that Special:EditWatchlist has perf issues with large lists.

Scardenasmolinar renamed this task from [SPIKE] How could we implement Watchlist categorisation, tagging, or multiple Watchlists? to [SPIKE] How could we implement Watchlist categorisation, tagging, or multiple Watchlists? [16H]. Apr 1 2025, 3:13 PM
Scardenasmolinar moved this task from To be estimated to Estimated on the Moderator-Tools-Team board.
DMburugu triaged this task as Medium priority. Apr 1 2025, 4:07 PM
DMburugu moved this task from Estimated to Kanban on the Moderator-Tools-Team board.

Change #1137814 had a related patch set uploaded (by Kgraessle; author: Kgraessle):

[mediawiki/core@master] [SPIKE] How could we implement Watchlist categorisation, tagging, or multiple Watchlists? [16H]

https://gerrit.wikimedia.org/r/1137814

We would like to know if this is technically feasible to achieve within the 3 months the Moderator Tools team plans to work on Watchlist.
Maybe. The deciding factor is whether we can get metrics, a design, and a complete implementation within 3 months, covering both a form to CRUD (create, read, update, delete) tags and the interface to filter by those tags. If metrics are included, finishing within 3 months may be a stretch. How long does it take to create a new table and make it available for consumption? See the process here: https://wikitech.wikimedia.org/wiki/Creating_new_tables @Ladsgroup this is probably a question you can answer.

Are there any obvious roadblocks or technical issues that we might foresee?

  • Our ability to create a new table, something like watchlist_tags which would consist of just a string tag and a FK watchlist id.
  • The UX component for selecting created tags; is a prepopulated drop-down sufficient or do we want it to be more like a selectable autocomplete?
  • Same kind of question with the form, do we have all the UX components we need to create this form to add tags?
  • Performance concerns could crop up as we're introducing a new join to the watchlist query.
  • Though we already have performance concerns, since we neither limit the number of pages a user can add nor implement any kind of pagination.
  • We would need to design and create an interface for CRUD (create, read, update, delete) operations on the tags.
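
To make the table and join from the list above concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The simplified `watchlist` and `recentchanges` tables, the `wlt_*` column names, and the `changes_for_tag` helper are assumptions for illustration only; they are not MediaWiki's actual schema and not the contents of the spike patch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-ins for the real tables (assumed columns).
CREATE TABLE watchlist (
    wl_id    INTEGER PRIMARY KEY,
    wl_user  INTEGER NOT NULL,
    wl_title TEXT NOT NULL
);
CREATE TABLE recentchanges (
    rc_id    INTEGER PRIMARY KEY,
    rc_title TEXT NOT NULL
);
-- Hypothetical new table: just a string tag and an FK to the watchlist row.
-- The composite key lets one page carry several tags, each at most once.
CREATE TABLE watchlist_tags (
    wlt_watchlist_id INTEGER NOT NULL REFERENCES watchlist(wl_id),
    wlt_tag          TEXT NOT NULL,
    PRIMARY KEY (wlt_watchlist_id, wlt_tag)
);
""")

conn.executemany("INSERT INTO watchlist VALUES (?, ?, ?)",
                 [(1, 1, "Foo"), (2, 1, "Bar"), (3, 1, "Baz")])
conn.executemany("INSERT INTO watchlist_tags VALUES (?, ?)",
                 [(1, "Pages I created"), (1, "Current events"),
                  (2, "Current events")])  # "Baz" has no tags
conn.executemany("INSERT INTO recentchanges VALUES (?, ?)",
                 [(1, "Foo"), (2, "Bar"), (3, "Baz"), (4, "Foo")])


def changes_for_tag(conn, user_id, tag):
    """Return (rc_id, rc_title) for changes to the user's pages with this tag."""
    rows = conn.execute(
        """
        SELECT rc.rc_id, rc.rc_title
        FROM recentchanges AS rc
        JOIN watchlist AS wl ON wl.wl_title = rc.rc_title
        JOIN watchlist_tags AS wlt ON wlt.wlt_watchlist_id = wl.wl_id
        WHERE wl.wl_user = ? AND wlt.wlt_tag = ?
        ORDER BY rc.rc_id
        """,
        (user_id, tag),
    )
    return rows.fetchall()
```

The extra `JOIN watchlist_tags` is the new join the performance bullet refers to. On a toy dataset it is trivial, so this sketch says nothing about how the query would behave at production scale.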

Are there limitations, or performance concerns?
We would want to limit the number of tags a user can create. This could easily be a limit we set in the config and then check against when a user attempts to create a new tag that would push them over the limit.
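
A rough sketch of that limit check, in Python for brevity. The `MAX_TAGS_PER_USER` value, the `TagLimitExceeded` exception, and the `create_tag` helper are hypothetical names for illustration, not existing MediaWiki configuration or code.

```python
# Hypothetical config value; the real limit would come from site configuration.
MAX_TAGS_PER_USER = 3


class TagLimitExceeded(Exception):
    """Raised when creating a tag would push the user over the limit."""


def create_tag(existing_tags, new_tag, limit=MAX_TAGS_PER_USER):
    """Return the user's tag set with new_tag added, enforcing the limit."""
    if new_tag in existing_tags:
        # Re-using an existing tag never counts against the limit.
        return set(existing_tags)
    if len(existing_tags) >= limit:
        raise TagLimitExceeded(
            f"cannot create {new_tag!r}: limit of {limit} tags reached"
        )
    return set(existing_tags) | {new_tag}
```

The check runs only when a genuinely new tag is created, so attaching an existing tag to more pages is never blocked.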
As far as performance is concerned, there's no way to test this additional join in production since the added table would be a newly created one. We would need to put it behind some kind of feature flag.
Are there any current performance concerns we'd be adding to?
Yes, users can currently add too many things to their watchlist causing timeouts. This could exacerbate that. @Ladsgroup I know you cautioned us against adding any new columns to RecentChanges, are there any concerns for adding additional joins to the watchlist query? Or adding a new table titled watchlist_tags? See spike patch here.

After talking through this in engineering weekly, it seems we're blocked because the join ultimately joins with recent changes. This would mean we cannot complete this within the 3 months the Moderator Tools team plans to work on Watchlist.

Refer to T307328: Scalability issues of recentchanges table for more details about the limitations.