Variable added_lines is now an array instead of a string
OpenPublic

Description
  1. Create a page with "{{this is}}"
  2. Edit the page and replace its content by "*THIS IS A BUG ON ABUSE FILTER!":

https://pt.wikibooks.org/wiki/Project:Caixa_de_areia?diff=258499

  1. Open [[Special:AbuseFilter/test]], which in my case was

https://pt.wikibooks.org/wiki/Special:AbuseFilter/test?uselang=en

  1. Type ------------------------------------------------------------------------

length(added_lines) <= 2 & contains_any( added_lines, "THIS IS A BUG" )

  1. Click "Test" and bang! The edit is detected! If it is of any help, here is the link provided to "examine" the edit:

https://pt.wikibooks.org/w/index.php?title=Especial:Filtro_de_abusos/examine/262161&testfilter=length%28added_lines%29+%3D%3D+1+%26+contains_any%28+added_lines%2C+"THIS+IS+A+BUG"+%29%0A&uselang=en

The filter shouldn't detect it, because "added_lines" obviously has a lot more than 2 characters. Nonetheless, if I replace "<= 2" by "== 1" it is still detected.

How can a string of 31 characters have length 1? This is plain wrong...


Version: unspecified
Severity: major
URL: https://pt.wikibooks.org/w/index.php?title=Especial:Filtro_de_abusos/examine/262161&testfilter=length%28added_lines%29+%3C%3D+15+%26+summary+%3D%3D+%22%2ATHIS+IS+A+BUG+ON+ABUSE+FILTER%21%22
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=40478

bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz50107.
He7d3r created this task.Via LegacyJun 24 2013, 3:40 PM
He7d3r added a comment.Via ConduitJun 25 2013, 1:07 PM

Ok, so I made one more test at
https://pt.wikibooks.org/w/index.php?title=Wikilivros%3ACaixa_de_areia&diff=258547&oldid=258546

and used Special:AbuseFilter/test to check the following filter:

length(added_lines) == 3 & contains_any( added_lines, "Lorem" )

and the edit was detected.

It seems abuse filter's "added_lines" is an array and then "length(added_lines)" is the number of items on this array.

Such a behavior is not documented at
https://www.mediawiki.org/wiki/Extension:AbuseFilter/Rules_format
and it is not obvious from
https://pt.wikibooks.org/w/index.php?title=Especial:Filtro_de_abusos/examine/262215&testfilter=length%28added_lines%29+%3D%3D+3+%26+contains_any%28+added_lines%2C+%22Lorem%22+%29%0A

that the value of "added_lines" is an array (not an string).

He7d3r added a comment.Via ConduitJul 3 2013, 3:17 PM

A similar problem happens when I use "user_groups":
For a user ,

The test

"confirmed" in user_groups

matches a user which is in the group "autoconfirmed" but not in the group "confirmed". But the documentation[1] says "in" applies only to "strings", so why when I execute

length(user_groups)=4

I also get a match? (the user in question is in exactly 4 groups)

If "user_groups" is a string, its length should be at least the length of "confirmed" (which is 9), not 4.

[1] http://www.mediawiki.org/wiki/Extension:AbuseFilter/Rules_format#Keywords

He7d3r added a comment.Via ConduitJul 5 2013, 4:21 PM

Here is another test which returns 1 (which means "true"):

(list:= ["a", "b", "c"];) & contains_any(list,"a\n")

I think the real bug is we have no documentation about these variables being arrays and how they are converted to strings (notice the "\n" in the above example).

He7d3r added a comment.Via ConduitJul 5 2013, 4:26 PM

See also:

  • [[mw:Extension_talk:AbuseFilter/Rules_format#Please_add_description_of_variable_values]]
He7d3r added a comment.Via ConduitJul 14 2013, 7:51 PM

According to
https://ru.wikipedia.org/wiki/Special:AbuseFilter/history/99/diff/prev/2152
the
the variable "added_lines" was changed to an array somewhere from 13 February 2013 and as such "length (added_lines)" now gets the number of rows, and this changed the conditions under which existing filters were triggered.

He7d3r added a comment.Via ConduitJul 14 2013, 8:07 PM

Is this working like this since 2009??? I found the following on r49221:

  • Use lists instead of implode()d strings in built-in variables wherever it's possible

ATTENTION! This may break filters that rely on "added_lines contains 'bla-bla'" syntax. They'll need to be replaced with "string(added_lines) contains 'bla-bla'"

Aklapper added a comment.Via ConduitJul 15 2013, 10:50 AM

No need for "[regression]" in subject if you also set the keyword.
However, not sure if I would call it a regression if it's been like this for four years.

He7d3r added a comment.Via ConduitJul 15 2013, 5:48 PM

Another example of the confusion caused by this (from 17 June 2009):

  • [[Special:AbuseFilter/history/172/diff/prev/2121]]
  • [[Wikipedia talk:Edit filter/Archive 3#Filter stopped working - strange testing results ..]]
Ciencia_Al_Poder added a comment.Via ConduitAug 16 2013, 1:13 PM

(In reply to comment #9)

See also:
https://www.mediawiki.org/wiki/Extension:AbuseFilter/Rules_format#Conditions

(added and documented by Helder)

Whoops, ignore that, wrong bug!

Aklapper added a project: Regression.Via WebDec 5 2014, 2:52 PM
werdna removed a subscriber: werdna.Via WebDec 10 2014, 5:14 PM
Aklapper added a project: Documentation.Via WebJan 7 2015, 3:31 PM

Add Comment

Column Prototype
This is a very early prototype of a persistent column. It is not expected to work yet, and leaving it open will activate other new features which will break things. Press "\" (backslash) on your keyboard to close it now.