Page MenuHomePhabricator

Find a better way to compute variables when testing filter syntax
Closed, ResolvedPublic

Description

While checking filter syntax (i.e. before saving a new filter, or anywhere else), edit variables are all set to null. This may sound like a natural choice, but can produce unwanted results. For instance, if I try to save a filter with

length(string(added_lines)) > 100 &
5 / length(string(added_lines)) > 2

I get the error "abusefilter-exception-dividebyzero". This shouldn't happen for two reasons: first, because there's a condition right above on the length of added_lines (but I don't expect the parser to be so smart to understand it). Second, because even without the first condition this is not always guaranteed to be a division by zero.
Probably it's not a good idea to save a filter with only the second condition, but I'm not sure if it's a good idea to completely prevent it.

Note: arrays with variable length should also be allowed: the syntax check doesn't allow accessing an array element if such array is generated at runtime (for instance it is added_lines), although it makes sense to do so. To reproduce the problem you may try checking added_lines[0], although that gives the wrong error (T198531). Solving this problem would also allow us to greatly simplify the logic behind get_matches, see comments in https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/AbuseFilter/+/470987/.

Details

Related Gerrit Patches:
mediawiki/extensions/AbuseFilter : masterStart using AFPData::DNONE

Event Timeline

Daimona created this task.Sep 18 2018, 8:44 AM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptSep 18 2018, 8:44 AM
Daimona added a comment.EditedNov 14 2018, 5:08 PM

Note: arrays with variable length should also be allowed: the syntax check doesn't allow accessing an array element if such array is generated at runtime (for instance it is added_lines), although it makes sense to do so. To reproduce the problem you may try checking added_lines[0], although that gives the wrong error (T198531). Moved to task description.

Random idea: add a new AFPData type, e.g. DNONE, to tell between actual null variables, and non-initialized ones; make the parser be permissive when it encounters this data type. I wrote a stub-patch for this some weeks ago, but stopped working on it as new problems came out... I'm also not that confident in making such a breaking change to the parser, although I think it's worth investigating this idea.

Daimona updated the task description. (Show Details)Dec 5 2018, 12:17 PM

Change 478424 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] [WIP] Add a new data type for non-initialized stuff

https://gerrit.wikimedia.org/r/478424

Daimona claimed this task.Dec 8 2018, 6:31 PM

Change 478424 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] [WIP] Add a new data type for non-initialized stuff

https://gerrit.wikimedia.org/r/478424

Change 478424 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Start using AFPData::DNONE

https://gerrit.wikimedia.org/r/478424

Daimona closed this task as Resolved.Aug 3 2019, 1:31 PM
Daimona removed a project: Patch-For-Review.

This was resolved by introducing DNONE, which allows everything mentioned in the task description. We can still improve the syntax check, e.g. by really just checking the syntax instead of trying to parse the filter, but that's T214643. Plus, I'm not going to simplify get_matches (yet), as the fixed-length format is less error prone.