Basic XML Feed support for watchlist (implementation idea: token)
Closed, Resolved (Public)

Description

Author: foenyx

Description:
Add basic XML feed support for the user's watchlist.
Each item of the user's watchlist becomes an item of the feed, with the date,
the author and the URL.


Version: unspecified
Severity: enhancement
URL: http://meta.wikimedia.org/wiki/Syndication_feeds

Details

Reference
bz471

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 7:08 PM
bzimport set Reference to bz471.
bzimport added a subscriber: Unknown Object (MLST).

foenyx wrote:

a basic implementation of the XML Feed for watch list

It's my first work on mediawiki, so I guess it's not bug-free. But it worked
quite well in my local mediawiki.

The main limitation is that the user needs to be logged into his account to access
it, but if the feed aggregator is built into the browser it should work.

Attached:

The tricky part with the watchlist is authentication; this patch doesn't seem to
address that yet.

*** Bug 1265 has been marked as a duplicate of this bug. ***
*** Bug 1303 has been marked as a duplicate of this bug. ***

zigger wrote:

*** Bug 2120 has been marked as a duplicate of this bug. ***

*** Bug 2293 has been marked as a duplicate of this bug. ***

Support; I like the idea.

The following amendment would be very useful and shorten the RSS feed:

I propose to have a special RSS feed for only those items in one's watchlist
which have UNSEEN changes. These are the changed (watched) pages for which an
enotif has been sent to the watching user; the information is already available
in the Enotif versions (1.5) in the watchlist table, for each user.

rowan.collins wrote:

Just for ease of reference:

  • One previous discussion:
    http://mail.wikipedia.org/pipermail/wikitech-l/2004-December/thread.html#26558

  • Previous discussions have generally come to the conclusion that the most
widely usable solution would be to let the user opt in in preferences to make
their watchlist public, and then generate a secret random token that has to go
in the URL to view it. (Noting that this provides only imperfect protection,
since the random string will be transmitted "in the clear" through any number of
proxies, logs, etc, and of course stored by any aggregator services used) This
way all RSS readers will be able to view the watchlist, people who don't use RSS
can have their watchlist completely private, and even people who *do* can have
it "mostly private".
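
For illustration only, the opt-in token idea above could be sketched in Python roughly as follows. This is not taken from any patch attached to this bug; the function names and the 128-bit token length are assumptions made for the example.

```python
import hmac
import secrets
from typing import Optional


def new_feed_token() -> str:
    """Return a fresh random token: 16 random bytes as 32 hex characters."""
    return secrets.token_hex(16)


def token_is_valid(stored: Optional[str], supplied: str) -> bool:
    """Compare the token from the URL with the one stored for the user.

    A user who never opted in has no stored token, so nothing matches.
    compare_digest avoids leaking the match position through timing.
    """
    if not stored:
        return False
    return hmac.compare_digest(stored.encode("utf-8"), supplied.encode("utf-8"))
```

Because the token is independent of the account password, leaking it through a proxy log or an aggregator exposes only the watchlist, and the user can replace it at any time.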

triddle wrote:

Perhaps it could be done both ways; they are not mutually exclusive.

rowan.collins wrote:

(In reply to comment #9)

Perhaps it could be done both ways; they are not mutually exclusive.

Um, both of which ways? Do you mean Foenyx's "hope they already have a cookie"
approach and my "opt in and have a pseudo-secret URL" approach? Yes, I guess you
could easily have both, although I do think the former is likely to be somewhat
less useful.

triddle wrote:

Yes, those are the two ways I was referring to. Since the pseudo-secret URL isn't as graceful (but obviously is one of
the few ways to get non-cookie auth to work) as a browser with built-in RSS abilities, it seems wise to support both.
Safari and the new IE support RSS; I don't think it's a stretch to assume that Mozilla and the other browsers will be
following suit soon. Is it a good idea to short-change those browsers to make up for the deficiencies in other RSS
readers? Granted, the ability to view some random person's watchlist contents is of little value for nefarious activities,
but is it hard to support them both? The watchlist code will have to be modified to output RSS or HTML anyway; the
pseudo-secret URL is simply another authentication mechanism that is localized to read-only access to the RSS
watchlist. It seems to me that once you solve the pseudo-secret URL you have already solved most of the RSS cookie
problem.

Tyler

rowan.collins wrote:

Ah, fair enough. Just to be clear, Firefox supports RSS feeds out-of-the-box
too; there's probably extensions for Mozilla Suite, and who knows what the new
SeaMonkey will end up with. But anyway, the sense in which that approach is kind
of hacky is that it's not really a "deficiency in other RSS readers" - they're
not web browsers, so they don't support rendering and submitting an HTML form
(currently the only way of logging in). Who knows whether or not they'd support
cookies in general, but the question is how to do the authentication in the
first place.

The point of the pseudo-secret URL is that you don't ever need to go through a
username and password challenge inside the RSS reader - if your reader's a
browser, you can just use the existing login form fine (as Foenyx's patch
already does) so you don't have to do anything special beyond rendering the XML.

But, to get back to the point - no, I see no reason not to make it so that if
the Watchlist-feed is requested without the magic URL parameter, it sees if
there's a login cookie set and shows the logged in user's watchlist, just like
the non-feed version would. But it would be nice to have the more flexible
version ready before we enable it, so that it can be available to everyone.
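
As a rough sketch of the combined behaviour described in this comment (token in the URL if present, login cookie otherwise), one could imagine something like the Python below. The parameter names `user` and `token`, the function name, and the choice to refuse a bad token outright instead of falling back to the cookie are all assumptions for the example, not something settled in this discussion.

```python
import hmac
from typing import Mapping, Optional


def resolve_feed_user(
    params: Mapping[str, str],
    cookie_user: Optional[str],
    stored_tokens: Mapping[str, str],
) -> Optional[str]:
    """Return the user whose watchlist should be served, or None to refuse."""
    user = params.get("user")
    token = params.get("token")
    if user is not None or token is not None:
        # Token route: both parameters must be present and the token must match.
        stored = stored_tokens.get(user) if user is not None else None
        if stored and token is not None and hmac.compare_digest(
            stored.encode("utf-8"), token.encode("utf-8")
        ):
            return user
        return None  # a bad token is refused; no fallback to the cookie
    # No magic URL parameter: behave like the non-feed watchlist.
    return cookie_user
```

A stand-alone feed reader would always take the first branch, while a browser with built-in feed support could take either.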

(In reply to Brion's comment #2)

The tricky part with the watchlist is authentication

Implementation idea:
Brion, perhaps you can easily re-use some of the lines you wrote for
"Confirmemail", I mean the token-method:

  • any user can request a token (rss/rdf-link incl. the token) to be sent to his
    e-mail address
  • link example:
    http://server/wike/index.php/Special:Watchlist/feed=rss/unseen/e45e0e1a1a2148b5719437c00d92fe41

Only watchlisted page titles, page edit time and edit summaries of pages having
unseen changes (*) are then delivered as an RSS feed.
(*) these are the bolded page titles in watchlists of mediawiki >= 1.5 versions

How do you like that ?

avarab wrote:

I don't see why email addresses need to get involved, just displaying the token
in the user preferences should be enough.

rowan.collins wrote:

Discussion about this has broken out on the mailing list again; future readers
should read this section of the archive:
http://mail.wikimedia.org/pipermail/wikitech-l/2005-September/thread.html#31577
[And please *do* read the previous discussions, to avoid raising points which
have already been discussed]

(In reply to Rowan's comment #14)

I don't see why email addresses need to get involved, just displaying the token
in the user preferences should be enough.

Yes, of course, there is no need to _mail_ a pseudo-secret RSS feed link. It
will be sufficient and handy to show the pseudo-secret key for example on the
Special:Watchlist of the current user.

Example:
http://server/wike/index.php/Special:Watchlist/feed=rss/unseen/e45e0e1a1a2148b5719437c00d92fe413428124566

Rowan: I have this on my to-do list.

mike wrote:

I've reworked http://meta.wikimedia.org/wiki/Syndication_feeds in the hope of
rallying further toward implementation. This bug is linked along with other
existing conversations. Let me know if there is any other way I can help ;)

*** Bug 3167 has been marked as a duplicate of this bug. ***

http://meta.wikimedia.org/wiki/Syndication_feeds --> very useful overview page,
thanks. I have added a link on my Enotif "Wishlist" page which is
http://meta.wikimedia.org/wiki/Email_notification_to-do_list

Wikinaut

Is anyone working on an implementation of that ? Would be nice to have....

*** Bug 5188 has been marked as a duplicate of this bug. ***

zach wrote:

Erm yes, I really should have posted the patch here rather than filing a new bug.
Terribly sorry about that...

All: I posted a new patch and discussed some thoughts about this at bug 5188.
I'd really like to see this feature happen, so I'd appreciate your comments.

adziura+wiki wrote:

I think that will be very useful.
Gmail has an Atom feed of new mail for each user, and privacy is preserved.

Hear, hear! For the pseudo-secret URL :). That would be cool. The URL could be
changed monthly or on demand, to give more security.

*** Bug 6370 has been marked as a duplicate of this bug. ***

Or the URL could have a user-specifiable password, SEPARATE from the main
password, such as:

http://server/wiki/Special:Watchlist/User?watchpw=secret&feed=atom

(If the user has a login cookie, the following should also be able to be used:)

http://server/wiki/Special:Watchlist?feed=atom

Similar for rss.

jay.corrales wrote:

I would like to propose an out-of-the-box idea.

What about having the RSS reader do all the work? When someone wants to "watch
a page", why couldn't they just add the history page as an RSS link? Would it
be easier to have the wiki software create an RSS-format history page for every
entry, maybe this can also be what the wiki uses to display the history page, or
maybe it can just be a mirror page with less information and a link to the edits?

I suppose going further, you could have your own RSS feeder that aggregates all
your watched wikipedia entries along with your other feeds. Most RSS feeders
also allow you to categorize your feeds (such as bloglines) so you could make a
wikipedia category in which you could add/subtract your wikipedia history feeds.

I am mainly interested in the feature for watching articles on Wikipedia, but I
figure this may also be helpful for other wikis.

This is already available - you can watch every single article like that when
you click on its history. But having hundreds of channels in an RSS reader... I
think I would have to pass on that - besides, it's too much trouble adding them anyway...

But it would be cool if an RSS feed would be available for linked articles -
this way I could watch whole categories with RSS, which would be really cool :).

logixoul wrote:

(In reply to comment #28)

This is already available - you can watch every single article like that when
you click on its history. But having hundreds of channels in an RSS reader...

... I have no problems with that, and I don't see how anybody would. So I'm in favor of marking this bug INVALID.

conti wrote:

(In reply to comment #29)

... I have no problems with that, and I don't see how anybody would. So I'm in
favor of marking this bug INVALID.

When you have about 300 articles in your watchlist (nothing unusual for a lot of
Wikipedians), you have to add 600 (articles + talk pages) feeds; that's not a very
good alternative solution. So I'd still like to see this feature.

jay.corrales wrote:

I guess it would come down to the efficiency of being able to add a feed to your
news feeder and if your news feeder allows you to organize links. Now that you
mention it, I don't think newsfeeders are really there yet.

If it were as simple as the click of a button to add the feed, that would be
great, but then wikipedia would have to add all the different types of newsfeed
buttons to make that work. Copying the link and adding it to an add news feed
dialog box or web form is a bit more of a pain.

If the news feeder allows organization of rss feeds into groups or folders, you
could group all the feeds into a category named Wikipedia or Watchlists, but I
agree that it would be a pain if you are watching a bunch of articles and your
newsfeeder doesn't allow categories or if it is cumbersome to add them, which I
think describes most.

If wikipedia adds newsfeed functionality, it would certainly make it easier for
a wikipedian if they can just add one feed named "Wikipedia Watchlist".
Wouldn't it also cut down on bandwidth use? If a wikipedian loads a feed
instead of the whole website to check their watchlist, I would think it would
cut down on load.

There could also be links in the RSS feed directly to the article, which would
make it so the user wouldn't have to load the main page and type in the article
name.

wmahan_04 wrote:

Implementation

I implemented this as described in comment #8.
Cookie-based authentication as discussed in comment #11
is also supported.

With the patch, when $wgSyndicateWatchlists is true,
users who enable an option in their preferences can
share their watchlists using pseudo-secret URLs like:

http://wiki.example.com/wiki/index.php?title=Special:Watchlist&feed=rss&user=Wmahan&token=906d1d49f1f624775c33d033038181fe&limit=250

Once created, a watchlist token never expires but may
be disabled or reset to a new pseudorandom value using
the preferences interface.

Feeds showing only unseen changes as proposed in
comment #7 are supported; append "&unseen=1" to the
URL.
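
To make the URL layout concrete, here is a small Python sketch that assembles a feed URL of the shape shown above. The query parameters (`title`, `feed`, `user`, `token`, `limit`, `unseen`) come from this comment; the helper name and its defaults are made up for the example and are not part of the patch.

```python
from urllib.parse import urlencode


def watchlist_feed_url(base, user, token, feed="rss", limit=250, unseen=False):
    """Build a pseudo-secret watchlist feed URL from its parts."""
    query = {
        "title": "Special:Watchlist",
        "feed": feed,
        "user": user,
        "token": token,
        "limit": limit,
    }
    if unseen:
        query["unseen"] = 1  # only pages with unseen changes
    # safe=":" keeps the namespace colon readable instead of encoding it as %3A.
    return base + "?" + urlencode(query, safe=":")


url = watchlist_feed_url(
    "http://wiki.example.com/wiki/index.php",
    "Wmahan",
    "906d1d49f1f624775c33d033038181fe",
)
```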

Notes:

  • The patch adds a column to the "user" table called
    "user_watchlist_token" to store the user's
    pseudo-secret token.

  • At the moment the feeds do not support the standard
    watchlist filtering options (show edits in the last X
    days, hide minor edits, etc.). That wouldn't be too
    hard to add, but I think feed aggregators can probably
    provide some of the same functionality.

  • The feeds are not cached; as I suggested at bug 4182,
    that could be done in Feed.php to reduce duplicated
    code.

  • I started this before realizing there is already a
    patch at bug 5188. Unlike that one, this doesn't
    create a new special page, and it doesn't add a DB
    column and other infrastructure to rate-limit requests;
    I was planning on handling that by caching.

attachment 471.diff ignored as obsolete

wmahan_04 wrote:

Implementation, including schema change

Attached:

dto wrote:

*** Bug 7316 has been marked as a duplicate of this bug. ***

rowan.collins wrote:

I notice that the current work on an API includes a working implementation of
watchlist-as-rss, using login session tokens from cookies or an explicit login.
See http://meta.wikimedia.org/wiki/API and check it out at
http://en.wikipedia.org/w/api.php?action=feedwatchlist

Obviously, this doesn't solve the central issue under discussion here, of how to
have a stateless authentication for that one action, but it might be worth
considering if there's merit in implementing it as part of that API rather than
through the standard access route...

Regarding Wil's solution: can it be applied to MediaWiki proper? It would go best with a side of warm milk and a section of the user preferences (or MediaWiki: namespace strings) to define what sort of/how much diff and editor information to include in the RSS update. One nice thing about a proper feed is that there is room to include more information than the single line the watchlist gives.

Wiki.Melancholie wrote:

You might be interested in manually making a watchlist page that will have an XML feed then!

See https://bugzilla.wikimedia.org/show_bug.cgi?id=5220#c21 - bug 5220 - for that.

Removing bot-interface keyword as the API now provides a bot interface for watchlists.

Changing component to "Watchlist"

ezyang wrote:

*** Bug 18570 has been marked as a duplicate of this bug. ***

ezyang wrote:

Patch is three years old and doesn't apply anymore, removing patch status.

ezyang wrote:

Initial implementation comments:

  • There has never been a syndicated Special page in MediaWiki's codebase before, and the existing code assumes that the corresponding feed is accessible at Special:Watchlist?feed=rss, which is not the case (you have to use the feedwatchlist API URL). There are two methods of going about solving this: one is generating a fake request from Special:Watchlist code to feedwatchlist and returning the code directly, and using the regular syndication link calculation code, and the other is special-casing Special:Watchlist in OutputPage. I would prefer the former, but both are fairly hacky.
  • My preferred implementation approach is to assume that the user is using a feedreader in their browser. If an unauthenticated user hits the RSS feed, we publish an item explaining to them that they are not logged in, and give them instructions on how to enable the "public" (should be phrased carefully) feed that they can directly give to their feedreader. This means that the discovery cost is minimal: a user can use the usual mechanism for subscribing to a feed, without having to have had twiddled a preference beforehand.
  • The token to be used can be cast as either a watchlist token, or a read token: that is, a token that can be used to read any private data on MediaWiki (which is really just watchlist). I prefer the latter.
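
A minimal sketch of the second point above (serving a single explanatory item to an unauthenticated reader) might look like this in Python. It is only an illustration of the idea: the function names, the notice wording and the example.com base URL are assumptions, and real MediaWiki feed code is structured differently.

```python
import xml.etree.ElementTree as ET


def render_rss(title, link, items):
    """Serialize a minimal RSS 2.0 document with the given items."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = title
    for entry in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
        ET.SubElement(item, "description").text = entry["description"]
    return ET.tostring(rss, encoding="unicode")


def watchlist_feed(user, changes, base="http://wiki.example.com/wiki/"):
    """Return the watchlist feed, or a one-item notice if nobody is logged in."""
    link = base + "Special:Watchlist"
    if user is None:
        notice = {
            "title": "You are not logged in",
            "link": base + "Special:Preferences",
            "description": "Log in, or enable the token-protected feed in your "
                           "preferences and subscribe to that URL instead.",
        }
        return render_rss("Watchlist", link, [notice])
    return render_rss("Watchlist for " + user, link, changes)
```

The notice still parses as a valid feed, so the usual subscribe mechanism works and the reader shows the instructions as an ordinary entry.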

ezyang wrote:

convert watchlist to class

This initial patch converts watchlist into a class. This will make implementing Special:Watchlist?feed=rss simpler.

Attached:

ayg wrote:

(In reply to comment #42)

  • There has never been a syndicated Special page in MediaWiki's codebase
before, and the existing code assumes that the corresponding feed is accessible
at Special:Watchlist?feed=rss, which is not the case (you have to use the
feedwatchlist API URL). There are two methods of going about solving this: one
is generating a fake request from Special:Watchlist code to feedwatchlist and
returning the code directly, and using the regular syndication link calculation
code, and the other is special-casing Special:Watchlist in OutputPage. I would
prefer the former, but both are fairly hacky.

Ideally this would be broken out into nice, clean, reusable code. I ran into this problem too when I was trying to make page history feeds available on article view instead of just page history view.

  • My preferred implementation approach is to assume that the user is using a
feedreader in their browser. If an unauthenticated user hits the RSS feed, we
publish an item explaining to them that they are not logged in, and give them
instructions on how to enable the "public" (should be phrased carefully) feed
that they can directly give to their feedreader. This means that the discovery
cost is minimal: a user can use the usual mechanism for subscribing to a feed,
without having to have had twiddled a preference beforehand.

Assuming that the feed reader supports logins seems like a really bad idea, TBH. A ton of people use feed readers like Google Reader or whatnot instead of their browser, and this will break horribly AFAICS, unless I'm missing something. I'm not even sure a majority of users use their browser for feed reading -- I don't, anyway.

(In reply to comment #42)

[...]

  • My preferred implementation approach is to assume that the user is using a
feedreader in their browser. If an unauthenticated user hits the RSS feed, we
publish an item explaining to them that they are not logged in, and give them
instructions on how to enable the "public" (should be phrased carefully) feed
that they can directly give to their feedreader. This means that the discovery
cost is minimal: a user can use the usual mechanism for subscribing to a feed,
without having to have had twiddled a preference beforehand.
[...]

What happens when the authentication cookie or whatever of
a user expires? It should be ensured (and tested :-)) that
in this case the items already received are not affected and
that afterwards, "new" items are added even if they are
older than the "not logged in" message.

(In reply to comment #45)

(In reply to comment #42)

[...]

  • My preferred implementation approach is to assume that the user is using a
feedreader in their browser. If an unauthenticated user hits the RSS feed, we
publish an item explaining to them that they are not logged in, and give them
instructions on how to enable the "public" (should be phrased carefully) feed
that they can directly give to their feedreader. This means that the discovery
cost is minimal: a user can use the usual mechanism for subscribing to a feed,
without having to have had twiddled a preference beforehand.
[...]

What happens when the authentication cookie or whatever of
a user expires? It should be ensured (and tested :-)) that
in this case the items already received are not affected and
that afterwards, "new" items are added even if they are
older than the "not logged in" message.

Any sane implementation will have a &from= parameter such that only results on or after the timestamp in from= are listed. If the client keeps track of the timestamp of the last edit it received, it can use this to get all results (common RSS clients don't actually do this, AFAIK, but it should be possible). *Never*, *ever*, should the cutoff time be based on cookie or login age; nobody suggested that BTW.
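
As an illustration of the &from= idea, the cutoff can be applied purely from the request parameter, with no reference to cookie or login age. The Python below is a sketch under assumptions: the 14-digit YYYYMMDDHHMMSS UTC timestamp format, the helper names and the shape of the change records are all chosen for the example.

```python
from datetime import datetime, timezone


def parse_from(value):
    """Parse a 14-digit YYYYMMDDHHMMSS timestamp as UTC (an assumed format)."""
    return datetime.strptime(value, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def changes_since(changes, from_value=None):
    """Keep changes made on or after from_value; with no from_value keep all."""
    if from_value is None:
        return list(changes)
    cutoff = parse_from(from_value)
    return [c for c in changes if c["timestamp"] >= cutoff]
```

A client that remembers the timestamp of the last item it received can pass it back as from= and will not miss edits made while it was unable to authenticate.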

ezyang wrote:

Should feed-readers Do The Right Thing(TM) when feed items have properly unique IDs attached to them?

Anyway, I'm blocking on getting the Watchlist file converted. I suppose I could make the patch w/o having Watchlist as a class.

(In reply to comment #46)

Any sane implementation will have a &from= parameter such that only results on
or after the timestamp in from= are listed. If the client keeps track of the
timestamp of the last edit it received, it can use this to get all results

The watch list has (the watched pages have) "last visited" timestamps (for each user), thus no client action is needed.

ayg wrote:

I've committed a fix in r53703. It's token-based; you set a magic value in your preferences. Currently there's no UI that actually exposes the link location (I'm sure that will be added by someone shortly), but if you set the token, the links look like this:

api.php?action=feedwatchlist&list=watchlist&wluser=Simetrical&wltoken=91c1ef18279f9c24ccf67a79e899ae4d2a3201bc

[[Special:Version]] shows that the current MediaWiki revision live on the English Wikipedia is r54757; I assume, therefore, that this is already available. However, I can't find any option in my preferences that allows me to set the token as specified in the previous comment. How should this be done? Or was this feature disabled for some reason?

(In reply to comment #50)

[[Special:Version]] shows that the current MediaWiki revision live on the
English Wikipedia is r54757; I assume, therefore, that this is already
available. However, I can't find any option in my preferences that allows me to
set the token as specified in the previous comment. How should this be done? Or
was this feature disabled for some reason?

No, it's not been deployed yet. You can no longer use the revision number to determine which software is running on the cluster (look at the wmf-deployment branch).

Reopening, as there's still no UI for doing this which makes it nigh-useless. :)

*** Bug 20840 has been marked as a duplicate of this bug. ***

ayg wrote:

Well, the summary does say "basic". :) We could just file separate bugs for improvements beyond this, IMO. But it doesn't make much difference.

I can see a "Watchlist token" in the watchlist section of my preferences. I suppose that can be considered a basic UI. The link to the feed is not shown, though, and after entering it manually in the url following Simetrical's example, I get an error: "Error (wlnotloggedin): You must be logged-in to have a watchlist". Even though I am logged in to enwiki, I tried doing api.php?action=login&lgname=user&lgpassword=password, but I get the error "The login module requires a POST request". How is this supposed to work?

In theory, we should be showing the "feed" icons on the watchlist at the very least, and adding <link rel> tags for them.

Unfortunately, the way that feeds were implemented was incredibly short-sighted, and assumes that the only URL that you could possibly think of for a feed would be with a suffix on the existing URL.

What we need to do first is to overhaul this feed system (see bug 20692). This is something I could possibly look into.

ayg wrote:

Couldn't you manually generate the <link>s? Surely the feed class is only necessary for generating the feeds, not outputting links to them.

(In reply to comment #57)

Couldn't you manually generate the <link>s? Surely the feed class is only
necessary for generating the feeds, not outputting links to them.

It's the outputting of the <link>s by the skin, and the display of them in the sidebar, that is at issue, not the generation of the feeds.

This is the mechanism that sucks.

ayg wrote:

You could get <link>s in the header without putting them in the sidebar easily enough, though, by just using one of our several bazillion scattered, poorly-named methods for directly injecting stuff into the <head> in various confusing and massively overlapping ways. Ideally we'd have a single feed per page in most cases, IMO, with the RC feed (currently on every page IIRC) only available on Special:RecentChanges, and some way to get both the icons and the <link>s working in a unified fashion, via a rewrite of the feed system . . . but for now, you can just have multiple <link rel="alternate">s on the page pointing to unrelated feeds. That's permitted, AFAIK, or anyway we already do it for history pages.
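
For reference, emitting several <link rel="alternate"> elements that point at unrelated feeds is mechanically simple. The Python sketch below only shows the markup being discussed; it is not MediaWiki's skin code, and the function name and tuple layout are invented for the example.

```python
from html import escape

# Standard MIME types for the two feed formats.
FEED_TYPES = {"rss": "application/rss+xml", "atom": "application/atom+xml"}


def feed_link_tags(feeds):
    """feeds: iterable of (format, title, url) tuples; returns one tag per line."""
    tags = []
    for fmt, title, url in feeds:
        tags.append(
            '<link rel="alternate" type="{}" title="{}" href="{}" />'.format(
                FEED_TYPES[fmt], escape(title, quote=True), escape(url, quote=True)
            )
        )
    return "\n".join(tags)
```

Note that ampersands in the feed URL have to be escaped as &amp; inside the href attribute, which matters for query-string URLs like the feedwatchlist one.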

(In reply to comment #59)

You could get <link>s in the header without putting them in the sidebar easily
enough, though, by just using one of our several bazillion scattered,
poorly-named methods for directly injecting stuff into the <head> in various
confusing and massively overlapping ways. Ideally we'd have a single feed per
page in most cases, IMO, with the RC feed (currently on every page IIRC) only
available on Special:RecentChanges, and some way to get both the icons and the
<link>s working in a unified fashion, via a rewrite of the feed system . . .
but for now, you can just have multiple <link rel="alternate">s on the page
pointing to unrelated feeds. That's permitted, AFAIK, or anyway we already do
it for history pages.

I'd much rather do the job properly, and spend the half hour rewriting the feed exposure system.

ayg wrote:

I doubt it will only take half an hour, but if you're really going to do it soon, then certainly that would be a better solution.

Feed URLs are now exposed on the watchlist page, if the user has set a token (r57119).

Would be nice if we could auto-set the token at some point instead of requiring that strange setup. I'm thinking that on loading the watchlist we could set it if it hasn't already been set.
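
The auto-set idea amounts to lazy initialisation of the token. A Python sketch, assuming preferences behave like a dictionary and using a hypothetical key name `watchlisttoken` (the actual change may store and generate the token differently):

```python
import secrets


def ensure_watchlist_token(prefs):
    """Return the user's watchlist token, creating one on first use."""
    token = prefs.get("watchlisttoken")
    if not token:
        token = secrets.token_hex(20)  # 20 random bytes, 40 hex characters
        prefs["watchlisttoken"] = token
    return token
```

Calling this when the watchlist page loads means the feed URL can always be shown, while a token the user set by hand is left untouched.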

(In reply to comment #62)

Would be nice if we could auto-set the token at some point instead of requiring
that strange setup. I'm thinking that on loading the watchlist we could set it
if it hasn't already been set.

That should be very straightforward -- I endorse this!

(In reply to comment #63)

(In reply to comment #62)

Would be nice if we could auto-set the token at some point instead of requiring
that strange setup. I'm thinking that on loading the watchlist we could set it
if it hasn't already been set.

That should be very straightforward -- I endorse this!

Done in r57124.