Page MenuHomePhabricator

Change the syntax for non-decimal numbers
Open, Needs TriagePublic


The current syntax for non-decimal numbers is awful, undocumented and weak. We should devise a deprecation process and replace it with, for instance, the one used in PHP. I think there won't be many use cases, but we still need to be careful in order to avoid breaking existing filters.


  • Check production use with info logging
  • Introduce new syntax as experimental
  • Ensure that no filter in production is already unintendedly using the new syntax
  • Bump logging for old syntax to deprecated, in MW version X
  • Remove old path, leave only the new one, in MW version X+1

Event Timeline

Daimona created this task.Dec 31 2018, 4:46 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptDec 31 2018, 4:46 PM

Some data has started to flow to logstash for group0, and we have another example: testwiki filter 176, using the variable "db", evaluated to the number "b" in base 2. So well, we really need this change.

I see there are quite a lot of notices on logstash. However, I have to say that not all logged cases are actually recognized as numbers, because we check if the digits are OK. So for instance, "b" won't be considered valid in base 2 (while it would be valid hex).
Now I'm going to move the logging to a place where we already know if what we found is a valid number.

Change 527849 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Move non-decimal numbers deprecation logging

Daimona claimed this task.Aug 3 2019, 4:58 PM

Change 527849 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Move non-decimal numbers deprecation logging

Huji added a subscriber: Huji.Aug 3 2019, 8:29 PM

Obviously, keeping this open despite the merged patch, as the patch only helps with better logging of events and the problem persists.

Indeed. I'll examine logs again next week, hoping to find some useful data without all of those false positives. I still don't know how to proceed at that point, but anyway I'd like to reimplement non-decimal numbers like in PHP.
I'll probably add an alternate code path that will try to parse a number using this new syntax, report whether it succeeds or not, and again check the logs. If no filter was using non-decimal numbers (intendedly) and no filter is already using the new syntax for something else, then we may raise the logging to WARNING level, formally deprecating the old syntax. And then we'll leave the new syntax only in the next MW version.

No occurrences on logstash since wmf.17 reached production. I'll proceed as described in T212730#5391054.

Change 529475 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Add new number syntax as experimental

Daimona moved this task from Future to Under review on the User-Daimona board.Aug 13 2019, 3:20 PM

Change 529475 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Add new number syntax as experimental

Daimona updated the task description. (Show Details)Sat, Aug 31, 6:38 PM

Change 534903 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Parse numbers with the new syntax, too