
AbuseFilter should expose matched text to warning messages
Open, Needs Triage, Public

Description

Sometimes a filter tests if a given edit matches any values in a list (or a given regex) and if so, warns the user about it. In those cases it would be useful to be able to show the matched text as part of the warning message. Examples:

This edit introduces the template "$1", which was deprecated after consensus (...)

or

This edit contains the phrase "$1", which may be considered offensive (...)

This would help a lot when reviewing filter hits for false positives, because the matched text would be available directly in the message.

See also: pt:Wikipédia:Café dos programadores#Etiqueta: Inserção de predefinição obsoleta

Event Timeline

He7d3r created this task. Aug 30 2017, 1:29 PM
Huji added a subscriber: Huji. Aug 30 2017, 1:36 PM

If implemented at all, this should definitely be *optional*. For filters that focus on improper language, outing, etc., we actually do NOT want to show the matched phrase, so that the user cannot easily game the filter by modifying that phrase.

Samtar added a subscriber: Samtar. Aug 30 2017, 1:40 PM

We very often design filters on the English Wikipedia to target a single LTA. This would need to be optional; otherwise we would be giving our long-term abuse cases a simple way of bypassing a filter.

Huji added a comment. Sep 2 2017, 3:01 PM

The more I think about it, the less realistic this task is. What if one filter checks for multiple patterns, and more than one is matched? Not every filter is as simple as "match one pattern against the changed text". I am resisting the urge to mark this as Declined, though I cannot see this ever being addressed.

In that case it could show the first pattern which matches (or a list of the first N matches). E.g.: a filter checks for multiple syntax errors such as

added_lines irlike '<(div|s|center|td|small|font|span)/>'

and a user adds

<s>Never mind<s/>. Got it.

We should be able to tell the user that the problem comes from "<s/>".
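The idea of reporting the first match (or the first N matches) can be sketched in Python as follows. This is only an illustration, not AbuseFilter code: `first_matches` is a hypothetical helper, and it assumes that `irlike` behaves like a case-insensitive regex search, which Python's `re.IGNORECASE` approximates.

```python
import re
from itertools import islice

# Hypothetical helper for illustration; AbuseFilter itself is not written in Python.
def first_matches(pattern, text, limit=1):
    """Return up to `limit` substrings of `text` matched by `pattern` (case-insensitive)."""
    hits = re.finditer(pattern, text, flags=re.IGNORECASE)
    return [m.group(0) for m in islice(hits, limit)]

# The pattern and the edit text are the ones quoted in the comment above.
pattern = r'<(div|s|center|td|small|font|span)/>'
added_lines = '<s>Never mind<s/>. Got it.'

print(first_matches(pattern, added_lines))  # prints ['<s/>']
```

The opening `<s>` is not reported because the pattern requires the `/>` ending, so only the malformed `<s/>` is returned and could be substituted into the warning message.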

Huji added a comment. Dec 2 2017, 3:36 AM

Think of a complex filter like this:

...
&
(
  added_lines irlike 'pattern1'
  |
  (
    user_name irlike '(pattern2|pattern3)pattern5(pattern6|pattern7|pattern8)'
    &
    added_lines irlike 'pattern10'
  )
)
...

Of course the example is made up, but we do have complex filters that combine pattern matching through Boolean operators. In cases like this, returning a matched string is not good enough; the user also needs to know which variable was matched (username, added lines, removed lines, ...). Again, thinking about simple examples is not enough to solve this task.
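One way to picture what such a filter would have to report is an evaluator that returns, alongside the Boolean result, the variable name and matched text for each condition that contributed to it. The Python sketch below is a thought experiment under stated assumptions, not a proposal for the actual AbuseFilter parser: the `Cond`, `And`, `Or`, `Match` classes and `evaluate` are hypothetical, the patterns and variable values are invented stand-ins for the placeholders above, and `|` is assumed to short-circuit so that only the first successful branch is reported.

```python
import re
from dataclasses import dataclass
from typing import Union

@dataclass
class Match:
    variable: str  # which filter variable matched, e.g. "user_name"
    text: str      # the substring that matched

@dataclass
class Cond:
    variable: str
    pattern: str

@dataclass
class And:
    left: "Node"
    right: "Node"

@dataclass
class Or:
    left: "Node"
    right: "Node"

Node = Union[Cond, And, Or]

def evaluate(node, variables):
    """Return (result, matches) for a rule tree evaluated against `variables`."""
    if isinstance(node, Cond):
        value = variables.get(node.variable, "")
        m = re.search(node.pattern, value, flags=re.IGNORECASE)
        return (True, [Match(node.variable, m.group(0))]) if m else (False, [])
    left_ok, left_hits = evaluate(node.left, variables)
    if isinstance(node, And):
        if not left_ok:
            return False, []
        right_ok, right_hits = evaluate(node.right, variables)
        return (True, left_hits + right_hits) if right_ok else (False, [])
    # Or: report only the first branch that succeeds (assumed short-circuit).
    if left_ok:
        return True, left_hits
    return evaluate(node.right, variables)

# Invented patterns with the same shape as the made-up filter above.
rule = Or(
    Cond("added_lines", r"foo"),
    And(Cond("user_name", r"(spam|test)bot\d+"),
        Cond("added_lines", r"buy now")),
)
variables = {"user_name": "SpamBot42", "added_lines": "Please buy now!"}
ok, hits = evaluate(rule, variables)
for hit in hits:
    print(f"{hit.variable}: {hit.text}")
```

Even in this toy version a single `$1` is not enough: the result is a list of (variable, text) pairs, and the warning message would need some convention for presenting them.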