Implement "actual watchers" count into MediaWiki's info action
Open, Public

Description

Add an "active watchers" count to MediaWiki's info action.

Not quite sure how we'd do this, but it seems potentially useful. On older wikis, the number of page watchers stat can quickly become meaningless without further context (i.e., a number in a vacuum doesn't mean much). If we limited the count to "active" users (defined by having made an edit or action in the past 30 days, I suppose), it might be more helpful.
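
A minimal sketch of the "active" definition suggested above, assuming we already have each watcher's most recent edit or logged action as a datetime. The function name, the dictionary input, and the 30-day default are illustrative assumptions, not existing MediaWiki code.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional


def count_active_watchers(
    last_action_by_watcher: Dict[str, Optional[datetime]],
    now: datetime,
    window_days: int = 30,
) -> int:
    """Count watchers whose last edit or action falls within the window.

    A value of None means the watcher has no recorded action at all
    (hypothetical convention for this sketch).
    """
    cutoff = now - timedelta(days=window_days)
    return sum(
        1
        for last_action in last_action_by_watcher.values()
        if last_action is not None and last_action >= cutoff
    )
```

The count depends only on site-wide activity, not on whether the watcher looked at this particular page.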

From a suggestion here: https://en.wikipedia.org/w/index.php?title=User_talk:MZMcBride&oldid=559641049#Number_of_watchers.


Version: 1.22.0
Severity: enhancement

bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz49506.
MZMcBride created this task. Via Legacy, Jun 12 2013, 11:29 PM
Schnark added a comment. Via Conduit, Jun 13 2013, 9:49 AM

Just for reference: Dispenser's toolserver tool does this, too: https://toolserver.org/~dispenser/cgi-bin/watcher.py?page=de:Wikipedia:Hauptseite

Whatamidoing-WMF added a comment. Via Conduit, Jul 12 2013, 3:00 AM

IMO the older the public wiki projects get, the more important this will become.

Nemo_bis added a comment. Via Conduit, Jan 21 2014, 8:22 AM

I was going to file this today, :) I think it's a high-impact feature which should be high priority.
I hope it will be rather easy to implement: with [[mw:Manual:$wgShowUpdatedMarker]] enabled, we know exactly which revision each watching user last visited.

Requirements:

  • count the "watching watchers", i.e. users with the page in watchlist who visited it in the last 30 days;
  • add the count as "Number of actual watchers" under "Number of page watchers" in action=info;
  • hide it if it's lower than 30, unless the user has "unwatchedpages" permission.
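
The display rule in the last requirement could be sketched as follows. This is only an illustration: the function name and the constant are hypothetical, and the threshold of 30 is taken from the requirement above rather than from any existing configuration variable.

```python
from typing import Iterable, Optional

# Hypothetical constant mirroring "hide it if it's lower than 30".
MIN_DISPLAYED_COUNT = 30


def displayed_actual_watchers(
    count: int, user_rights: Iterable[str]
) -> Optional[int]:
    """Return the number to show in action=info, or None to hide the row."""
    if count >= MIN_DISPLAYED_COUNT or "unwatchedpages" in set(user_rights):
        return count
    return None
```

For example, a page with 12 actual watchers would show nothing to an anonymous reader but would show 12 to a user holding the "unwatchedpages" permission.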

We can adjust the numbers in later bugs, possibly reusing an existing config setting if an appropriate one is found ($wgRCMaxAge is not suitable: it can be set to years on wikis without database constraints).

John_of_Reading added a comment. Via Conduit, Oct 3 2014, 8:48 PM

@Nemo - If user A is watching page B, I don't think it should matter whether user A has *visited* page B recently. What's important is whether user A is still active at the site, and therefore is likely to notice edits to page B when they show up via the watchlist.

I have many pages in my watchlist that I don't visit, but if they show up in my watchlist I'll check their recent history using popups.

Does the software track the last time that each user displayed his/her watchlist?

Nemo_bis added a comment. Via Conduit, Oct 3 2014, 8:55 PM

(In reply to john_of_reading from comment #4)

> @Nemo - If user A is watching page B, I don't think it should matter whether
> user A has *visited* page B recently. What's important is whether user A is
> still active at the site, and therefore is likely to notice edits to page B
> when they show up via the watchlist.

Sure. And I think this likelihood correlates more with actual visits.

> I have many pages in my watchlist that I don't visit, but if they show up in
> my watchlist I'll check their recent history using popups.

And what does this say about the extent to which you notice edits there? There is another bug about making action=history visits count as visits btw.

> Does the software track the last time that each user displayed his/her
> watchlist?

No.

Whatamidoing-WMF added a comment. Via Conduit, Oct 3 2014, 11:06 PM

The problem with counting only people who visited the page in the last 30 days is that many pages aren't edited every 30 days. As an active user keenly interested in such a page, I will definitely see each and every change ever made to it, possibly within minutes, yet there may have been no reason at all for me to visit that low-traffic, low-edit page in the last year (much less in the last 30 days).

I've got many pages on my watchlist that average one or two edits per year. The fact that they rarely appear in my watchlist does not mean that I would not notice them being edited.

Nemo_bis added a comment. Via Conduit, Oct 4 2014, 6:12 AM

That's easy to fix: step 1 in comment 3 becomes "check for recent unvisited edits and, if there are any, how old and how many they are". We could discuss the most sensible filter for 60 more comments, but the reality is that, if done, this will at first be done with the simplest possible filter for performance reasons and then improved in later steps.

Nemo_bis awarded a token. Via Web, Dec 12 2014, 8:21 AM
gerritbot added a subscriber: gerritbot. Via Conduit, Mon, Mar 2, 4:17 PM

Change 193838 had a related patch set uploaded (by Nemo bis):
Attempt to count actual watchers in the info action

https://gerrit.wikimedia.org/r/193838

gerritbot added a project: Patch-For-Review. Via Conduit, Mon, Mar 2, 4:17 PM
Ricordisamoa added a subscriber: Ricordisamoa. Via Web, Mon, Mar 2, 4:30 PM
Nemo_bis added a comment. Edited. Via Web, Mon, Mar 2, 4:37 PM

No other implementation was proposed yet, so I went ahead and implemented what I had mentioned above.

Given the concerns above that one may not visit the page despite seeing the edit summary, for now I used 6 months of "absence" from the page as the threshold: someone who doesn't visit an updated page for that long is very unlikely to be following it. It would be simpler to just use $wgRCMaxAge though, because 1) the "recent editors" count and others do the same, and 2) when one waits more than $wgRCMaxAge to check an edit, that edit disappears from recent changes and the watchlist and is unlikely ever to be seen.
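
Either threshold boils down to a cutoff timestamp in the 14-digit YYYYMMDDHHMMSS form that the watchlist table stores. A rough sketch, assuming the age is given in seconds (as $wgRCMaxAge is) and approximating "6 months" as 180 days; the function name and the constant are hypothetical:

```python
from datetime import datetime, timedelta, timezone

# Assumption for this sketch: "6 months" approximated as 180 days.
SIX_MONTHS_SECONDS = 180 * 24 * 3600


def cutoff_timestamp(now: datetime, max_age_seconds: int) -> str:
    """Return the oldest timestamp still counted, as YYYYMMDDHHMMSS."""
    cutoff = now - timedelta(seconds=max_age_seconds)
    return cutoff.strftime("%Y%m%d%H%M%S")


# Example: the cutoff as of the date of this comment, in UTC.
example_now = datetime(2015, 3, 2, 16, 37, 0, tzinfo=timezone.utc)
example_cutoff = cutoff_timestamp(example_now, SIX_MONTHS_SECONDS)
```

Passing the value of $wgRCMaxAge instead of the 180-day constant would give the simpler variant described above.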

That sort of query is very simple and on a page with about a hundred watchers it takes 600 ms on translatewiki.net (thanks Nikerabbit):

MariaDB [translatewiki_net]> explain SELECT count(*) FROM bw_watchlist WHERE wl_namespace = 0 AND wl_title = 'Support' and wl_notificationtimestamp <= '20150101000000';
+------+-------------+--------------+------+-----------------+-----------------+---------+-------------+------+------------------------------------+
| id   | select_type | table        | type | possible_keys   | key             | key_len | ref         | rows | Extra                              |
+------+-------------+--------------+------+-----------------+-----------------+---------+-------------+------+------------------------------------+
|    1 | SIMPLE      | bw_watchlist | ref  | namespace_title | namespace_title | 261     | const,const |   66 | Using index condition; Using where |
+------+-------------+--------------+------+-----------------+-----------------+---------+-------------+------+------------------------------------+
1 row in set (0.00 sec)
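
To make the counting concrete, here is a self-contained sketch that runs the same shape of query against an in-memory SQLite table. It is not the patch itself: the unprefixed table name `watchlist`, the sample rows, and the helper names are made up for illustration. It assumes that wl_notificationtimestamp is NULL when the watcher has no unvisited edits and otherwise holds the time of the first edit they have not yet seen, so rows at or before the cutoff are the long-absent watchers.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory stand-in for the watchlist table (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE watchlist ("
        " wl_user INTEGER, wl_namespace INTEGER, wl_title TEXT,"
        " wl_notificationtimestamp TEXT)"
    )
    rows = [
        (1, 0, "Support", None),              # no unvisited edits
        (2, 0, "Support", "20140601000000"),  # unvisited edit older than cutoff
        (3, 0, "Support", "20150215000000"),  # unvisited edit newer than cutoff
        (4, 0, "Other", "20140101000000"),    # different page, not counted
    ]
    conn.executemany("INSERT INTO watchlist VALUES (?, ?, ?, ?)", rows)
    return conn


def count_watchers(conn, namespace, title, cutoff):
    """Return (total, stale, remaining) watcher counts for one page."""
    page = "wl_namespace = ? AND wl_title = ?"
    total = conn.execute(
        f"SELECT COUNT(*) FROM watchlist WHERE {page}", (namespace, title)
    ).fetchone()[0]
    # Same condition as the query above; a NULL timestamp never matches "<=".
    stale = conn.execute(
        f"SELECT COUNT(*) FROM watchlist WHERE {page}"
        " AND wl_notificationtimestamp <= ?",
        (namespace, title, cutoff),
    ).fetchone()[0]
    return total, stale, total - stale
```

Subtracting the stale rows from the total keeps watchers with a NULL timestamp in the remaining count, which a plain "greater than cutoff" condition would silently drop.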
