Page MenuHomePhabricator

Add AbuseFilter integration to Extension:Newsletter
Open, MediumPublic

Description

When creating or editing a Newsletter, call out to AbuseFilter::filterAction if installed and have it evaluate rules under a new "newsletter" group. This will prevent adding abusive content to newsletter names and descriptions.

Details

Related Gerrit Patches:
mediawiki/extensions/Newsletter : masterFix error handling and abuse filter integration during edit
mediawiki/extensions/Newsletter : master[Draft] Add AbuseFilter integration to Newsletter

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptApr 7 2016, 4:54 AM
Qgil added subscribers: csteipp, Qgil.Apr 7 2016, 6:40 AM

./includes/specials/SpecialNewsletterCreate.php

  • No spam filter integration. Even without wikitext, users can still put abusive stuff in the names/descriptions.

I presume this involves AbuseFilter integration? This can probably be non-blocking (maybe I'm wrong), because I imagine on WMF wikis the newsletter-create privilege will not be given out to just anybody, similar to gadgets.

As far as I remember, currently this permission is available to registered (autoconfirmed?) users. The idea behind this openness is that creating a new newsletter entry is no different than creating a new wiki page, and the harm you can cause is related to the number of subscribers to your newsletter, which would be high only if your newsletter is interesting.

The good point is that spammers / vandals could spam / vandalize the list of newsletters. But spamming/vandalizing users through Echo notifications is not a realistic use case, unless someone goes berzerk after months of proper newsletter publishing.

But spamming/vandalizing users through Echo notifications is not a realistic use case

It is. Oh yes it is.

Qgil added a comment.Apr 15 2016, 2:55 PM

What I mean is that only users that have subscribed to a newsletter can be spammed with notifications by the authors of that newsletter. Since there is no possibility for authors to subscribe users, there is no possibility for them to spam 1000 users unless these users have subscribed voluntarily to that newsletter, and 1000 users will do that only when the content of the newsletter is interesting to them.

This breaks the usual patterns of spam and vandalism.

Qgil triaged this task as Medium priority.Apr 15 2016, 2:55 PM

It would still be possible to add vandalism (including suppressable content) to logs by vandals.

<Glaisher> Oh.. Maybe we could emit an event/hook whenever there is a newsletter change and then add a handler in AbuseFilter
<Glaisher> and have a separate group for newsletters like in flow
<Glaisher> and introduce newsletter_* variables
<Glaisher> Oh.. Maybe we could emit an event/hook whenever there is a newsletter change and then add a handler in AbuseFilter
<Glaisher> and have a separate group for newsletters like in flow
<Glaisher> and introduce newsletter_* variables

That seems like a plan. Let's see if it works out!

Tinaj1234 removed Tinaj1234 as the assignee of this task.Jun 9 2016, 8:43 AM
Tinaj1234 added a subscriber: Tinaj1234.
Qgil added a comment.Jun 9 2016, 8:57 AM

FYI

./includes/specials/SpecialNewsletterCreate.php

  • No spam filter integration. Even without wikitext, users can still put abusive stuff in the names/descriptions.

I presume this involves AbuseFilter integration? This can probably be non-blocking (maybe I'm wrong), because I imagine on WMF wikis the newsletter-create privilege will not be given out to just anybody, similar to gadgets.

Can we make T132022: Add AbuseFilter integration to Extension:Newsletter not a blocker for deployment indeed, please? AbuseFilter integration is undocumented (T135668) and despite of this fact we have been trying to implement it, looking at how other projects are doing it and asking to developers. However, those who could know seem to be very busy and we still haven't got this to work.
As @Parent5446 points out, we could start by granting the permission to create newsletters to a more restricted user group (currently is autoconfirmed), and then relax it again when we succeed implementing AbuseFilter integration.

Glaisher added a comment.EditedJun 9 2016, 9:41 AM

Adding hooks is probably not necessary. I was just looking at Flow code and it looks like this will be simpler than I imagined. The basic idea would be something like this whenever a user does something.

$vars = new AbuseFilterVariableHolder;
$vars->addHolders( AbuseFilter::generateUserVars( $context->getUser() ) ); // User variables
$vars->setVar( 'action', $type ); // $type can be 'announce', 'modify', etc.
// Set newsletter-specific variables
// We can probably add other "meta" variables like length/diff later
$status = AbuseFilter::filterAction( args );
if ( !$status->isOK() ) {
 // Show error
}

And introduce our own group. I haven't looked at the code for doing that.

Maybe take a look at the AbuseFilter integration in Flow (https://github.com/wikimedia/mediawiki-extensions-Flow/blob/master/includes/SpamFilter/AbuseFilter.php)

First, you'll have to add the "newsletter" group:

$wgAbuseFilterValidGroups[] = 'newsletter';

And define when a "newsletter" filter should be deactivated (that's a safeguard so that when a filter is incorrect and matches an abnormal proportion of items, it'll automaticalle be disabled)

$wgAbuseFilterEmergencyDisableThreshold['newsletter'] = .1; // disable filters that hit > 10% of articles
// we'll want the filters to automatically disable if they match >10% of articles, but
// we need to make sure it has tested at least 50 articles in the last day (=86400 seconds)
$wgAbuseFilterEmergencyDisableCount['newsletter'] = 50;
$wgAbuseFilterEmergencyDisableAge['newsletter'] = 86400;

Up to the code, then! How do we make sure all "newsletter" filters are executed against our content?
First, we make sure that the filters have all the variables they would need (e.g. new_wikitext, which contains the new content, where the filters should be matched against; or new_size, or added_lines, or user_blocked, or ...)

Here's some basic variables, but if others are useful in this usecase you can always add more!

// get a default set of variables based on article title & submitting user
// $title is a \Title object
// $user is a \User object
$vars = \AbuseFilter::getEditVars( $title );
$vars->addHolders( \AbuseFilter::generateUserVars( $user ), \AbuseFilter::generateTitleVars( $title , 'ARTICLE' ) );

The main variables we're missing at this point are new_wikitext and old_wikitext.
You could assign these like this:

// $newWikitext is a string that contains the wikitext of the current revision being tested
// $oldWikitext is a string that contains the wikitext of the previous revision (or an empty string if there is no previous revision)
$vars->setVar( 'new_wikitext', $newWikitext );
$vars->setVar( 'old_wikitext', $oldWikitext );

If you don't happen to already have the current & previous wikitext at hand (e.g. you have to fetch it from database, or run it through Parsoid), you're better of not calling ->setVar() directly, but assign them using a callback. That way, they'll only be fetched when they're actually being used! Like this:

// register a callback method, which will only be executed if the variable is actually used
// we can pass additional data to the callback function; in this case, we'll pass it a $revision
// object (something very Flow-specific, so that will definitely be different for other implementations)
$vars->setLazyLoadVar( 'new_wikitext', 'my_custom_content_callback', array( 'revision' => $revision ) );

// and now that we've registered that variable as a callback function, we need to define that
// callback function, which can be done inside one of AbuseFilter's hooks:
$wgHooks['AbuseFilter-computeVariable'][] = 'MyHooks::onAbuseFilterComputeVariable';
class MyHooks {
    /**
     * @param string $method: Method to generate the variable
     * @param AbuseFilterVariableHolder $vars
     * @param array $parameters Parameters with data to compute the value
     * @param mixed &$result Result of the computation
     * @return bool
     */
    public static function onAbuseFilterComputeVariable( $method, \AbuseFilterVariableHolder $vars, $parameters, &$result ) {
        $methods = array(
            'my_custom_content_callback' => function ( $parameters ) {
                // now you can use $parameters to go fetch whatever data is needed
                // $parameters is the array that was passed to `->setLazyLoadVar()`, which
                // in this example was: array( 'revision' => $revision )

                // so this all depends exactly on your implementation; for Flow, we could do this:
                return $parameters['revision']->getContentInWikitext();
            };

            if ( !isset( $methods[$method] ) ) {
                return true;
            }

            $result = $methods[$method]( $parameters );
            return false;
        );
    }
}

Now that AbuseFilter knows out filter group & all of the variables that can be used in the filters, we're almost there!

// $vars is the object we've been creating in the code above
// $title is still the \Title object referencing the wiki page we're testing for
// 'newsletter' is the group whose filters we'll want to test
// AbuseFilter will execute all of these filters and return a \Status object
$status = \AbuseFilter::filterAction( $vars, $title, 'newsletter' );

// now the \Status object can be used to check if AbuseFilter found violations in the content:
$status->isOK();

// or get the error message
$status->getMessage();

// ...
`

Good luck!

@matthiasmullie Thanks a ton! Couldn't have explained it any better :)

Change 293625 had a related patch set uploaded (by Tinaj1234):
[Draft] Add AbuseFilter to Extension:Newsletter

https://gerrit.wikimedia.org/r/293625

Qgil added a comment.Dec 28 2016, 10:47 AM

Could someone post an update about the situation of this task? I could not tell by reading the comments in the patch. I am not even sure about how the whole ContentHandler move affects this task.

In theory content handler should make abuse filter integration autoamtic. We should test to see if that actually works properly

In theory content handler should make abuse filter integration autoamtic. We should test to see if that actually works properly

Looks like that is wrong... Abusefilter didn't actually work.

It looks like abusefilter stops the edit, but not the custom newsletter table edits

Sorry for the offtopic but does Newsletter have CheckUser integration? If not, it must have it.

Change 371743 had a related patch set uploaded (by Brian Wolff; owner: Brian Wolff):
[mediawiki/extensions/Newsletter@master] Fix error handling and abuse filter integration during edit

https://gerrit.wikimedia.org/r/371743

Sorry for the offtopic but does Newsletter have CheckUser integration? If not, it must have it.

Yep check user is fine (One of the benefits of content handler is that happens automatically)

Change 371743 had a related patch set uploaded (by Brian Wolff; owner: Brian Wolff):
[mediawiki/extensions/Newsletter@master] Fix error handling and abuse filter integration during edit
https://gerrit.wikimedia.org/r/371743

It should be emphasized that this patch only covers editing an existing newsletter case. I have not tested the create a newsletter case, and the code path diverges so its reasonable to assume abuse filter may not work correctly there (It might block the edit, but not stop newsletter db entries, and not show an error)