Maniphest T120741

Suggesting AbuseFilter by machine learning
Open, Needs Triage, Public

Assigned To

None

Authored By

	• DannyH
	Dec 7 2015, 10:55 PM

Description

Currently, we have to write Extension:AbuseFilter patterns (string match, regex, etc.) by hand. This is a very hard task for non-technical users, and it consumes technical users' time.

I propose a machine learning approach that would suggest patterns to be used for AbuseFilter; this would reduce both the difficulty and the development time. For example, when I mark some revisions or users, I could get a suggested pattern generated by machine learning, which extracts what the specified revisions or the users' contributions have in common.

I don't have any concrete methods or implementations because I specialize in neither machine learning nor natural language processing. But I have heard it's not impossible.

--aokomoriuta (talk) 23:56, 9 November 2015 (UTC)

This card tracks a proposal from the 2015 Community Wishlist Survey: https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey

This proposal received 0 support votes, and was ranked last out of 107 proposals. https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey/Moderation_and_admin_tools#Suggesting_AbuseFilter_by_machine_learning

Related Objects

Mentioned In: T291018: Temporarily disable article editing by anonymous users on fawiki

Event Timeline

• DannyH created this task. Dec 7 2015, 10:55 PM

• DannyH raised the priority of this task from to Needs Triage.

• DannyH updated the task description.

• DannyH added a project: Community-Wishlist-Survey-2015.

• DannyH moved this task to Wishlist 51-on on the Community-Wishlist-Survey-2015 board.

• DannyH subscribed.

Restricted Application added subscribers: StudiesWorld, Aklapper. Dec 7 2015, 10:55 PM

Reedy added a project: AbuseFilter. Dec 8 2015, 8:51 AM

Reedy set Security to None.

Billinghurst added a project: ORES. Dec 8 2015, 9:34 AM

@Halfak about integrating ORES with AbuseFilter

Legoktm subscribed. Dec 13 2015, 8:47 AM

• DannyH moved this task from Wishlist 51-on to Wishlist 51-on on the Community-Wishlist-Survey-2015 board. Dec 15 2015, 8:56 PM

• DannyH moved this task from Wishlist 51-on to Wishlist 51-on on the Community-Wishlist-Survey-2015 board. Dec 15 2015, 9:17 PM

IMPORTANT: If you are a community developer interested in working on this task: The Wikimedia Hackathon 2016 (Jerusalem, March 31 - April 3) focuses on #Community-Wishlist-Survey projects. There is some budget for sponsoring volunteer developers. THE DEADLINE TO REQUEST TRAVEL SPONSORSHIP IS TODAY, JANUARY 21. Exceptions can be made for developers focusing on Community Wishlist projects until the end of Sunday 24, but not beyond. If you or someone you know is interested, please REGISTER NOW.

Restricted Application added a subscriber: JEumerus. Jan 21 2016, 2:51 PM

• DannyH updated the task description. Feb 6 2016, 12:35 AM

Vlkyrie subscribed. Mar 5 2016, 12:40 PM

Amire80 moved this task from Backlog to Filtering features on the AbuseFilter board. May 8 2016, 9:13 AM

Ricordisamoa awarded a token. May 8 2016, 11:58 AM

BethNaught subscribed. Jan 28 2017, 1:28 PM

Restricted Application added a project: Machine-Learning-Team. Jan 28 2017, 1:28 PM

Huji updated the task description. Jan 28 2017, 4:57 PM

MarcoAurelio awarded a token. Jan 28 2017, 9:19 PM

MarcoAurelio subscribed.

Bad-Words-Detection-System sounds very similar to this. We use BWDS to find content that is added in revisions that get reverted, but is also uncommon in revisions that are not reverted. We'd probably want to put some sort of user-interface on top of it in order to make it work ad-hoc.
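To make the idea in the comment above concrete, here is a minimal sketch of that kind of selection: keep tokens that recur in text added by reverted revisions but are rare in text added by revisions that were kept. This is not BWDS's actual code or API. The function names, the thresholds `min_reverted` and `max_kept_rate`, and the plain word tokenization are all assumptions made for illustration.

```python
import re
from collections import Counter

TOKEN_RE = re.compile(r"\w+")


def tokens(text):
    """Return the set of lowercased word tokens in one piece of added text."""
    return set(TOKEN_RE.findall(text.lower()))


def suggest_terms(reverted, kept, min_reverted=2, max_kept_rate=0.1):
    """Pick tokens that appear in at least `min_reverted` reverted additions
    and in at most a `max_kept_rate` share of kept additions.
    Both thresholds are hypothetical defaults, not values from BWDS."""
    reverted_counts = Counter(t for text in reverted for t in tokens(text))
    kept_counts = Counter(t for text in kept for t in tokens(text))
    n_kept = max(len(kept), 1)  # avoid division by zero on an empty sample
    picked = [
        term
        for term, count in reverted_counts.items()
        if count >= min_reverted and kept_counts[term] / n_kept <= max_kept_rate
    ]
    return sorted(picked)


def to_pattern(terms):
    """Build a case-insensitive whole-word regex alternation, or None if empty."""
    if not terms:
        return None
    return r"(?i)\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b"
```

A reviewer would still need to read the suggested terms before putting the resulting pattern into a filter, since common words can pass the thresholds on a small sample.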

Halfak edited projects, added Bad-Words-Detection-System; removed ORES. Feb 2 2017, 3:30 PM

Halfak moved this task from Unsorted to Ideas on the Machine-Learning-Team board.

It seems a lot easier to just provide better editing tools (for example, something regex101-ish for the regular expressions, and Blockly or something similar for the general program structure).

@Tgr, I think this task is about discovering textual patterns in diffs that are worth flagging in AbuseFilter. I'm not sure how better editing tools would make it easier to discover these patterns.

Part of the problem statement was that regexes are hard to use for non-technical users. I was just suggesting that writing an AI might not be the most effective approach to fix that. (For one thing, they still need to be able to evaluate whatever regex suggestions a machine learning system would come up with.)

Oh! That's a good point. Seems like we're looking at two separate problems here: (1) regex syntax is hard, and (2) discovering patterns that should exist in AbuseFilter is hard.

Perhaps instead of suggesting patterns, some kind of score could be passed to filters.

Halfak edited projects, added artificial-intelligence; removed Machine-Learning-Team. Mar 16 2017, 2:41 PM

He7d3r awarded a token. Aug 30 2017, 1:17 PM

He7d3r subscribed.

Liuxinyu970226 awarded a token. Nov 19 2017, 12:24 PM

Liuxinyu970226 rescinded a token.

Liuxinyu970226 awarded a token.

Liuxinyu970226 subscribed.

Making ORES scores available to AbuseFilter as variables would be a relatively easy way to partially fix this.
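As a rough illustration of the "score as a filter variable" idea, the sketch below combines a model score with an optional regex condition. It is written in Python purely for illustration. It is not AbuseFilter rule syntax and it does not call any ORES API. The `Edit` class, the `damaging_score` field, and the `min_score` threshold are hypothetical names chosen for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class Edit:
    added_text: str
    damaging_score: float  # hypothetical model score in [0, 1]


def filter_matches(edit, pattern=None, min_score=0.8):
    """Return True when the score reaches `min_score` and, if a pattern
    is given, the added text also matches that pattern."""
    if edit.damaging_score < min_score:
        return False
    if pattern is None:
        return True
    return re.search(pattern, edit.added_text) is not None
```

With a variable like this, a filter author would tune one threshold instead of writing a pattern, or use the score to narrow an existing pattern that produces too many false positives.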

MGChecker added a project: ORES. Feb 10 2018, 7:30 PM

Capankajsmilyo subscribed. Apr 18 2018, 2:21 PM

Agreed with MGChecker. There's no need to write from scratch a new AI for detecting vandalism. We already have ORES, let's keep improving it and make it communicate with AF. It'd be a huge advantage to have its score available in AF. I just don't know if it's feasible.

Galobtter subscribed. Jan 27 2019, 5:56 PM

Ahmad252 subscribed. Oct 23 2020, 1:53 PM

Strainu mentioned this in T291018: Temporarily disable article editing by anonymous users on fawiki. Oct 10 2021, 7:47 PM
