AbuseFilter should expose matched text to warning messages
Open, Needs Triage, Public

Assigned To

None

Authored By

	He7d3r
	Aug 30 2017, 1:29 PM

Description

Sometimes a filter tests if a given edit matches any values in a list (or a given regex) and if so, warns the user about it. In those cases it would be useful to be able to show the matched text as part of the warning message. Examples:

This edit introduces the template "$1", which was deprecated after consensus (...)

This edit contains the phrase "$1", which may be considered offensive (...)

This would help a lot when reviewing filters for false positives, because the matched text would be available directly in the message.

Related Objects

Mentioned In: T265163: Create a system to encode best practices into editing experiences
T266380: Remove ContentTranslation code that emulates AbuseFilter, because it's hard to maintain
T72152: AbuseFilter doesn't highlight the match cases at abuse log
T216001: Allow abuse filter variables to be injected to warning template
Mentioned Here: T216001: Allow abuse filter variables to be injected to warning template

Event Timeline

He7d3r created this task. Aug 30 2017, 1:29 PM

Restricted Application added a subscriber: Aklapper. Aug 30 2017, 1:29 PM

If implemented at all, this should definitely be *optional*. For filters that focus on improper language, outing, etc., we actually do NOT want to show the matched phrase, so that the user cannot easily game the filter by modifying that phrase.

We very often design filters on the English Wikipedia to target a single LTA. This would need to be optional; otherwise we're giving our long-term abuse cases a simple way of bypassing a filter.

The more I think about it, the less realistic this task is. What if one filter checks for multiple patterns, and more than one is matched? Not every filter is as simple as "match one pattern against the changed text". I am resisting the urge to mark this as Declined, though I cannot see this ever being addressed.

In that case it could show the first pattern which matches (or a list of the first N matches). E.g.: a filter checks for multiple syntax errors such as

added_lines irlike '<(div|s|center|td|small|font|span)/>'

and a user adds

<s>Never mind<s/>. Got it.

We should be able to tell the user that the problem comes from "<s/>".
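A minimal Python sketch of the "first match, or first N matches" idea, just to make the behaviour concrete. The helper name `first_matches` is hypothetical and not part of AbuseFilter; Python's `re` with `IGNORECASE` stands in for `irlike` here.

```python
import re

# Hypothetical helper (not an AbuseFilter function): return up to `limit`
# substrings of `text` matched by `pattern`, ignoring case like irlike.
def first_matches(pattern, text, limit=1):
    found = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        found.append(m.group(0))
        if len(found) >= limit:
            break
    return found

pattern = r"<(div|s|center|td|small|font|span)/>"
added_lines = "<s>Never mind<s/>. Got it."

# The opening "<s>" does not match; only the malformed "<s/>" does.
print(first_matches(pattern, added_lines))
```

With `limit` set higher, the same helper would return the list of the first N matches instead of only the first one.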

Silent awarded a token. Dec 1 2017, 11:41 AM

Think of a complex filter like this:

...
&
(
  added_lines irlike 'pattern1'
  |
  (
    user_name irlike '(pattern2|pattern3)pattern5(pattern6|pattern7|pattern8)'
    &
    added_lines irlike 'pattern10'
  )
)
...

Of course the example is made up, but we do have complex filters that combine pattern matching through Boolean operators. In cases like this, returning a matched string is not good enough; the user also needs to know what was matched (username, added lines, removed lines, ...). Again, thinking only about simple examples is not enough to solve this task.
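To illustrate the point, here is a rough Python sketch in which every pattern test reports which variable it matched along with the matched text. All names (`irlike`, `all_of`, `any_of`, the sample values) are made up for the illustration and say nothing about how AbuseFilter evaluates rules; in particular, this version evaluates every branch eagerly instead of short-circuiting.

```python
import re

def irlike(variable, value, pattern):
    # Return [(variable_name, matched_text)] on a match, else an empty list.
    m = re.search(pattern, value, flags=re.IGNORECASE)
    return [(variable, m.group(0))] if m else []

def all_of(*results):
    # Conjunction: succeeds only if every operand matched; keeps all hits.
    if all(results):
        return [hit for r in results for hit in r]
    return []

def any_of(*results):
    # Disjunction: collects the hits of every operand that matched.
    return [hit for r in results for hit in r]

# Made-up variable values for one edit.
edit = {
    "user_name": "pattern2pattern5pattern7",
    "added_lines": "foo pattern10 bar",
}

hits = any_of(
    irlike("added_lines", edit["added_lines"], "pattern1"),
    all_of(
        irlike("user_name", edit["user_name"],
               "(pattern2|pattern3)pattern5(pattern6|pattern7|pattern8)"),
        irlike("added_lines", edit["added_lines"], "pattern10"),
    ),
)

for variable, text in hits:
    print(variable, "matched", repr(text))
```

Even in this toy case three hits come back from two different variables, so a single `$1` placeholder would not tell the user much on its own.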

Daimona mentioned this in T216001: Allow abuse filter variables to be injected to warning template. Feb 13 2019, 8:50 AM

DannyS712 subscribed. Feb 17 2019, 7:58 AM

Proposal: implement add_warning_params and, instead of

added_lines irlike '<(div|s|center|td|small|font|span)/>'

do

matches := get_matches('<(div|s|center|td|small|font|span)/>', added_lines);
matches[1] !== false & add_warning_params(matches[1])

add_warning_params will always return true and could be variadic. Any argument passed to this function will be made available to the warning message as $3, $4, ... It must always be treated as raw HTML.
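A rough Python model of the proposal, only to show the intended flow. Assumptions: `get_matches` returns the whole match at index 0, capturing group n at index n, and false for anything unmatched; `$1` and `$2` are already taken by the filter description and ID (which is why new parameters start at `$3`). `WarningContext` and `render` are invented for this sketch, and no escaping is done here, so the "raw HTML" remark is not modelled.

```python
import re

def get_matches(needle, haystack):
    # Assumed semantics: [whole match, group 1, ...], False where unmatched.
    compiled = re.compile(needle)
    m = compiled.search(haystack)
    if m is None:
        return [False] * (compiled.groups + 1)
    return [m.group(0)] + [g if g is not None else False for g in m.groups()]

class WarningContext:
    # Hypothetical holder for the parameters of one warning message.
    def __init__(self, description, filter_id):
        self.params = [description, str(filter_id)]  # assumed $1 and $2

    def add_warning_params(self, *values):
        # Variadic; appends the values as $3, $4, ... and always returns True.
        self.params.extend(str(v) for v in values)
        return True

    def render(self, template):
        # Replace each $n with the n-th parameter; leave unknown $n untouched.
        def substitute(m):
            index = int(m.group(1)) - 1
            if 0 <= index < len(self.params):
                return self.params[index]
            return m.group(0)
        return re.sub(r"\$(\d+)", substitute, template)

ctx = WarningContext("Broken self-closing tag", 123)
matches = get_matches(r"<(div|s|center|td|small|font|span)/>",
                      "<s>Never mind<s/>. Got it.")
triggered = matches[1] is not False and ctx.add_warning_params(matches[1])
message = ctx.render('Filter $2 ($1) matched the tag name "$3".')
print(message)
```

Passing `matches[0]` instead of `matches[1]` would expose the whole matched string rather than only the first capturing group.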

In T174554#5887960, @matej_suchanek wrote:
Proposal: implement add_warning_params and instead of
added_lines irlike '<(div|s|center|td|small|font|span)/>'
do
matches := get_matches('<(div|s|center|td|small|font|span)/>', added_lines);
matches !== false & add_warning_params(matches[1])
add_warning_params will always return true and could be variadic. Any argument added through this function will be added to the warning message as $3, $4, ... It must always be treated as raw HTML.

This would also work for T216001, right?

I would say yes.

matej_suchanek added a project: User-Daimona. Aug 29 2020, 1:07 PM

matej_suchanek moved this task from Backlog to Ideas for overhaul on the User-Daimona board.

What should happen when there are several invocations of add_warning_params? Appending arguments? Use the last one?

Appending arguments? Use the last one?

Good question. I believe the former. Alternatively, we could name it set_warning_params and implement the latter.
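The two options differ only in what a second invocation does to the parameter list. A tiny Python illustration, with plain lists standing in for whatever the extension would actually store (both function bodies are assumptions, not existing code):

```python
def add_warning_params(params, *values):
    # "Append" semantics: every call adds to what is already there.
    params.extend(values)
    return True

def set_warning_params(params, *values):
    # "Last one wins" semantics: every call replaces earlier values.
    params[:] = values
    return True

appended = []
add_warning_params(appended, "first")
add_warning_params(appended, "second")

replaced = []
set_warning_params(replaced, "first")
set_warning_params(replaced, "second")

print(appended)
print(replaced)
```

With appending, the position of a parameter ($3, $4, ...) depends on which branches of the filter ran; with replacing, only the last call is visible to the message.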

Reconsidering after the architecture review, which might make this easier to implement.

MusikAnimal subscribed. Oct 21 2020, 7:51 PM

matej_suchanek mentioned this in T72152: AbuseFilter doesn't highlight the match cases at abuse log. Nov 22 2020, 9:57 AM

Daimona mentioned this in T266380: Remove ContentTranslation code that emulates AbuseFilter, because it's hard to maintain. Dec 2 2020, 4:49 PM

Pginer-WMF mentioned this in T265163: Create a system to encode best practices into editing experiences. Apr 11 2022, 8:47 AM

Daimona merged a task: T72152: AbuseFilter doesn't highlight the match cases at abuse log. Jun 19 2022, 8:59 PM

Daimona merged a task: T310956: en.wp abuseFilter for new references to predatory OA journal/publisher should highlight which DOIs were matched/detected.

Daimona added subscribers: Yamaha5, Klein, Tgr and 7 others.

Daimona added subscribers: Prototyperspective, RhinosF1.

TheresNoTime removed a subscriber: RhinosF1. Dec 15 2022, 11:36 PM

Antanana subscribed. May 14 2024, 10:01 PM
