Page MenuHomePhabricator

Allow Comments to be caught by AbuseFilter
Closed, ResolvedPublicFeature

Description

I hope this has not already been requested in the past and declined.

Feature summary (what you would like to be able to do and where): Allow comments made to be caught by the AbuseFilter extension. (It would just have to be for logging purposes, as I guess 'warn' and 'disallow' could be harder to implement)

Use case(s) (list the steps that you performed to discover that problem, and describe the actual underlying problem which you want to solve. Do not describe only a solution): Quite obvious

Benefits (why should this be implemented?) : In order to prevent preventing users from circumventing AbuseFilters by using the comments extension which is not caught by the AF.

Event Timeline

I hope this has not already been requested in the past and declined.

Not that I'm aware of, no. If it were, that'd be horribly silly because surely it makes sense from a developer as well as an end-user POV for anti-spam filters to be consistent. :)

Allow comments made to be caught by the AbuseFilter extension.

👍

(It would just have to be for logging purposes, as I guess 'warn' and 'disallow' could be harder to implement)

I think it makes sense to do this properly, because it's what most people using AF + Comments concurrently would desire: the software to not special-case comments and, for the purposes of filtering them, treating them no differently than regular page text (or edit summaries, etc.)

Furthermore some (very rudimentary) checks are already in place in includes/api/CommentSubmitAPI.php: if comment text is not empty, it is then checked for whether it

  1. was (not) submitted by a user with the commentadmin user right and it matches known definitions of spam, as determined by CommentFunctions::isSpam -- this currently checks for:

1.1) any and all functions hooked to Comments::isSpam hook called by the method (currently the only extension using this hook is SpamRegex)
1.2) $wgSpamRegex
1.3) $wgSummarySpamRegex

  1. contains any links and that the user is allowed to post links (=they have the commentlinks user right)

The Comments::isSpam hook has pretty much been designed for this use case: it provides access to &$text (user-submitted comment text contents) and &$retVal (boolean value answering the question "is this spam?").

That said, judging from prior experience (d8fe3acd7c747ff1fc912bae4c02f726cdb16493) with ArticleFeedbackv5 there might be something I'm missing here, so cc'ing @Daimona as the local AF guru for further input on how to best handle this situation.

Unfortunately, I'm not familiar with Ext:Comments at all and cannot really help with the specific problem at hand. However, in general, I'd say:

  • If the process of adding a comment is very similar to making edits (i.e. it happens on a wiki page, there's a summary, there's added and removed text etc.), it could be handled by AF as a normal edit, as long as the EditFilterMergedContent is fired when the comment is being added, and the extension hooks into whatever is necessary to compute the variables correctly. Aside for normal edits, this is suitable e.g. for edits to slots other than the main one, even though you have to manually run the EditFilterMergedContent hook, which I recognize is very suboptimal (see T266464).
  • Otherwise, it's more complicated. AFTv5 and Flow trigger AF for their actions using an action other than edit, and do so by manually calling the deprecated AbuseFilter::filterAction. That's not ideal, too, and I'm not sure if the AF code will remain stable for such use cases in the future.

Unfortunately, I'm not familiar with Ext:Comments at all and cannot really help with the specific problem at hand.

Thanks for the reply, Daimona! Unfortunately for Comments it's indeed closer to the latter than the former case, I fear - though comments do have to be "associated" with a page (i.e. a page has to contain <comments /> for you to be able to comment on it; thus it logically follows that special pages and nonexistent (not created yet, deleted, ...) pages cannot be commented on) and this page_id value is indeed stored in Comments' main DB table, Comments, in the Comment_Page_ID column, that's pretty much where the similarities to "normal" edits end, I'd say.

Comments nowadays (since last year, in fact) requires the end-user to have JS enabled in their browser; the JS submits a POST request with the appropriate headers (CSRF protection, etc.) set to the api.php endpoint action=commentsubmit, handled by the PHP code in includes/api/CommentSubmitAPI.php.
The code here does various permission and sanity checks and eventually passes the user-submitted data down to Comment::add if everything is OK; that method does the main INSERT and is also responsible for updating social stats (if SocialProfile is installed) and pinging @mentioned users (if Notifications is installed).

I suppose the "cleanest" solution (as of right now) would be...to invoke AF manually in includes/api/CommentSubmitAPI.php after the regular anti-spam/anti-abuse checks are passed but before calling Comment::add? (I took a look at the AF<-->AFTv5 integration code and it looks mostly straightforward, save for the custom Consequence class, I'm not quite sure how that works...)

I suppose the "cleanest" solution (as of right now) would be...to invoke AF manually in includes/api/CommentSubmitAPI.php after the regular anti-spam/anti-abuse checks are passed but before calling Comment::add?

Yes, calling AF manually would be the best approach here. As I said, I've never really focused on that use case, so I recognize that the AF public interface for custom callers is not very pretty. Nonetheless, it should work.

(I took a look at the AF<-->AFTv5 integration code and it looks mostly straightforward, save for the custom Consequence class, I'm not quite sure how that works...)

You probably don't need custom consequences for that, just add the variables. What you may want instead, is to define a separate group of filters. You can take a look at how Flow does that: this is the class which runs AF, and you will also need to implement some hooks; anything AF-related in the Flow hooks is probably necessary.

Change 761904 had a related patch set uploaded (by Jack Phoenix; author: Jack Phoenix):

[mediawiki/extensions/Comments@master] AbuseFilter interoperability

https://gerrit.wikimedia.org/r/761904

Change 761904 merged by jenkins-bot:

[mediawiki/extensions/Comments@master] AbuseFilter interoperability

https://gerrit.wikimedia.org/r/761904

ashley claimed this task.

Thanks @Daimona for your help & code review in making this a reality! 😄 👍

Thanks for such a quick solution!