Condition limit is broken: consumes 100,000+ conditions limit is set to 1000
Open, HighPublic

Description

For a extreme example, consider this filter:
https://en.wikipedia.org/wiki/Special:AbuseFilter/473

How can it consumes 23296 conditions if the condition limit is set to 1000?

Why does that value changes every second?
(reloading the page a few times I got: 51172, then 3986, then 7859, 14690, and so on...)

This is very confusing for users trying to debug and optimize the existing filters...

See Also:
T47045: Condition limit outrageously high after filter is disabled
T43691: Abuse filter inconsistent matching between testing and real checking
T57459: Include "condition limit consumed" column in AbuseFilter list

Details

Reference
bz51294
bzimport set Reference to bz51294.
bzimport added a subscriber: Unknown Object (MLST).
He7d3r created this task.Jul 13 2013, 4:14 PM

I don't know if this is a problem, but I noticed the function getFilterProfile uses three keys:
wfMemcKey( 'abusefilter', 'profile', $filter, 'count' );
wfMemcKey( 'abusefilter', 'profile', $filter, 'total' );
wfMemcKey( 'abusefilter', 'profile-conds', 'total' );

But the resetFilterProfile only resets the first two. Shouldn't it also reset the key profile-conds?

[1] https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FAbuseFilter.git/HEAD/AbuseFilter.class.php#L535

(In reply to comment #0)

For a extreme example, consider this filter:
https://en.wikipedia.org/wiki/Special:AbuseFilter/473

How can it consumes 23296 conditions if the condition limit is set to 1000?

Where exactly can I see that (position in the UI)?

There is a line like this:

Statistics: Of the last 147 actions, this filter has matched 4 (2.72%). On average, its run time is 1.77 ms, and it consumes 13,105 conditions of the condition limit.

(the text comes from [[MediaWiki:abusefilter-edit-status]])

It really doesn't known what a "limit" is: I just saw it saying filter 473 consumes 112,686 conditions (!)

werdna removed a subscriber: werdna.Dec 10 2014, 5:18 PM

This is really three separate bugs that all lead to per filter stats corruption:

  1. Failure to reset properly when a filter is edited.
  2. Failure to count properly when evaluation is aborted due to the condition limit being reached.
  3. A race condition that causes near simultaneous instances of the AbuseFilter to corrupt each others stats.

The third problem is the dominant one for the highly active wikis and can lead to really silly results. Both per filter condition counts and run times are affected, though the nonsensical condition counts are more noticeable.

I am nearly done with a patch that addresses this bug.

Change 201104 had a related patch set uploaded (by Dragons flight):
Overhaul AbuseFilter internal profiling system

https://gerrit.wikimedia.org/r/201104

matej_suchanek assigned this task to Dragons_flight.
matej_suchanek set Security to None.
He7d3r edited the task description. (Show Details)Jan 6 2016, 9:31 PM
He7d3r removed a subscriber: wikibugs-l-list.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJan 6 2016, 9:31 PM
matmarex added a subscriber: werdna.Apr 8 2016, 5:44 PM
  1. A race condition that causes near simultaneous instances of the AbuseFilter to corrupt each others stats.

The third problem is the dominant one for the highly active wikis and can lead to really silly results.

I don't think this can reasonably account for the reported values being wrong by a factor of ten or more.

I'm pretty sure that the number of conditions is counted globally, for all filters together, then treated it as if it was the per-filter number, which obviously results in off-the-wall values. I wonder what @werdna was smoking when committing rSVN52738. ;)

Change 282399 had a related patch set uploaded (by Bartosz Dziewoński):
Mostly unbreak profiling of number of conditions used by filters

https://gerrit.wikimedia.org/r/282399

  1. A race condition that causes near simultaneous instances of the AbuseFilter to corrupt each others stats.

The third problem is the dominant one for the highly active wikis and can lead to really silly results.

I don't think this can reasonably account for the reported values being wrong by a factor of ten or more.

I'm pretty sure that the number of conditions is counted globally, for all filters together, then treated it as if it was the per-filter number, which obviously results in off-the-wall values. I wonder what @werdna was smoking when committing rSVN52738. ;)

The race condition can give rise to very large errors. The condition counts and times are all stored as ratios, e.g. (total time elapsed) / (number of runs recorded). Because the numerator and denominator are in separate variables they are retrieved and saved to memcache separately. Under race conditions it is possible for one instance to update and save the numerator while a different instance is overwriting the denominator, leading to possible states that are out of sync. This is especially severe is one of the instances in the process of reseting the variables. If the denominator gets reset to zero but the numerator gets stuck somewhere else, the ratio blows up leading to large errors.

That's not saying there aren't other profiling errors. It is a mess actually, but getting to reports like 100,000 conditions exhausted is mostly caused by the fact the memcache numerators and denominators can get out of sync with each other.

Most of the memcache calls need to be eliminated for both practical and performance reasons and replaced with a small number of calls using structured data.

Change 282399 merged by jenkins-bot:
Mostly unbreak profiling of number of conditions used by filters

https://gerrit.wikimedia.org/r/282399

If the denominator gets reset to zero but the numerator gets stuck somewhere else, the ratio blows up leading to large errors.

Hmm, I didn't think of this. That is more plausible.

Add Comment