
Remove average run time and consumed conditions to make editing faster
Closed, ResolvedPublic

Description

Currently, each filter shows statistics like this:

"On average, its run time is $4 ms, and it consumes $5 conditions of the condition limit."

This is very slow, as there can easily be hundreds of filters, each doing 6 memcached queries. Xenon flamegraphs show a lot of time spent in this method, slowing down editing.
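For context, this is roughly the shape of per-filter profiling that produces that much cache traffic. It is a hypothetical reconstruction, not the actual removed code: evaluateFilter(), the key names, and the variables are invented for illustration.

```php
// Illustrative sketch only -- not the exact removed code. With hundreds of
// active filters, a pattern like this issues several memcached round trips
// per filter on every edit, so cache traffic scales with the filter count.
global $wgMemc;

foreach ( $filters as $filterId => $filter ) {
	$countKey = wfMemcKey( 'abusefilter', 'profile', $filterId, 'count' );
	$totalKey = wfMemcKey( 'abusefilter', 'profile', $filterId, 'total' );
	$condsKey = wfMemcKey( 'abusefilter', 'profile', $filterId, 'conds' );

	$startTime = microtime( true );
	$condsUsed = 0;
	$matched = evaluateFilter( $filter, $vars, $condsUsed ); // hypothetical evaluation step
	$timeTaken = microtime( true ) - $startTime;

	// Three reads plus three writes per filter: this is what adds up.
	$wgMemc->set( $countKey, (int)$wgMemc->get( $countKey ) + 1, 3600 );
	$wgMemc->set( $totalKey, (float)$wgMemc->get( $totalKey ) + $timeTaken, 3600 );
	$wgMemc->set( $condsKey, (int)$wgMemc->get( $condsKey ) + $condsUsed, 3600 );
}
```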

Event Timeline

He7d3r assigned this task to aaron.
He7d3r raised the priority of this task to Needs Triage.
He7d3r updated the task description. (Show Details)
He7d3r subscribed.

Change 211951 merged by jenkins-bot:
Removed filter profiling using $wgMemc

https://gerrit.wikimedia.org/r/211951

I wonder if this could backfire. Without information on slowness, admins might unwittingly make filters slower than they currently are. Perhaps stats should be calculated every now and then.

It will backfire.
It makes our work of debugging filters harder than it already is.

Aren't those stats broken anyway? I think it's better to fix them before restoring this. There's also an open patch to improve it.

Just going to write this here for reference... On enwiki we had our friend [[User:Dragons flight]] explain best practices for reducing runtime and condition count. As long as you follow those and combine filters when possible, I think you should be OK without the stats.

Here's the link: https://en.wikipedia.org/wiki/Wikipedia_talk:Edit_filter/Archive_6#Condition_limit_3

Thanks. Best practices should be documented in https://www.mediawiki.org/wiki/Extension:AbuseFilter subpages. :)

Aren't those stats broken anyway? I think it's better to fix them before restoring this. There's also an open patch to improve it.

What patch? Is this tracked somewhere?

Ah yes, I actually forgot I had already added it there; it's at https://www.mediawiki.org/wiki/Extension:AbuseFilter/Conditions

Notice that Extension:AbuseFilter/Conditions#General counting just transcludes the table I created after carefully examining the stats that were removed.
I would not have documented that in the first place if these statistics had not been available.

Well, we've got it documented now :) And kudos for making that table! As long as we're disciplined about authoring the filters, we should be fine until improved stats are available.

We might even consider having an on-demand statistics tool. Maybe I could check a checkbox to "track performance of this filter" and, once I see that it is doing well, remove the performance tracking. There would be some notice making it clear that performance tracking also adds overhead when saving edits, so that authors know not to leave it on indefinitely.

This is an interesting idea. The tracking could disable itself after a certain number of hits (or checked actions). Care to file a feature request?
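A minimal sketch of what such self-disabling, opt-in tracking could look like. The function, key name, sample cap, and stats fields are all hypothetical, not existing AbuseFilter code: enabling the checkbox would seed the cache key, and the hook that runs the filters would call something like this per filter.

```php
// Hypothetical sketch of opt-in profiling that switches itself off after a
// fixed number of samples. afProfileOnDemand() and the key name are invented
// for illustration.
function afProfileOnDemand( $filterId, $timeTaken, $condsUsed ) {
	global $wgMemc;
	$maxSamples = 1000; // after this many hits, tracking turns itself off

	$key = wfMemcKey( 'abusefilter', 'ondemand-profile', $filterId );
	$stats = $wgMemc->get( $key );
	if ( !is_array( $stats ) ) {
		return; // tracking was never enabled (or already expired) for this filter
	}

	$stats['count'] += 1;
	$stats['total-time'] += $timeTaken;
	$stats['total-conds'] += $condsUsed;

	if ( $stats['count'] >= $maxSamples ) {
		// Enough data collected: a real version would surface the final numbers
		// to the filter editor before dropping the key.
		$wgMemc->delete( $key );
		return;
	}
	$wgMemc->set( $key, $stats, 86400 );
}
```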

The ability to have it run somehow, somewhere, to get an idea of whether or not your filter is effective is useful. Just turning it off is not helpful for those of us who write filters but are not developers and just follow others' examples as a means to stop the crap.

I believe the initial version of this (now removed) was reading three entries from memcache and then setting three updated entries per filter (as well as a few global keys that were updated per edit rather than per filter).

I already had a version in gerrit that reduced this to one read and one write per filter. If memcache requests are the bottleneck, we could easily go further and reduce it to one read and one write of structured data to memcache per edit rather than per filter. Aside from reading/writing memcache, I rather doubt that any of the rest of the internal profiling steps are generating meaningful lag. Counting conditions is already required for the condition limiter, and reading the system clock to track run time is pretty trivial, so I assume that nearly all of the delay is associated with the way memcache was being used. Though if someone wants to argue otherwise, I'd be curious to see the evidence.

Personally, I think removing all AF profiling is a very bad idea. We already have trouble with people understanding the performance of the filters, which was not aided by the profiling being buggy. In fact, the present profiling doesn't go far enough. We've had filters deployed (and subsequently removed) that had good average run times, but which could literally produce tens of seconds of lag against certain types of edits. We need to track not just average run time but also identify filters that occasionally produce extremely slow run times. (I also had a version in gerrit that added tracking of max run time among recent edits, but that depended upon the profiling existing.) For the same reason, I think on-demand profiling is probably not a good approach if we want to catch filters that are only occasionally slow. At least some level of performance capture needs to be always on or admins are going to be poking about blindly trying to find problems.

I'd suggest that reducing the memcache hits to once per edit (rather than several per filter) and possibly sampling (e.g. only profiling every Nth edit) would eliminate most of the lag while also being more useful and effective than on demand profiling.

I could work on those things myself down the line, but I'm busy in the near term. And frankly, given that all my previous AF edits in gerrit were ignored for months I'm not feeling much enthusiasm for working on this.
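To make the shape of those suggestions concrete, here is a hedged sketch combining them: a single structured memcached value per edit (one read, one write regardless of filter count), a sampling gate so only every Nth edit pays the cost, and a per-filter maximum run time tracked alongside the totals. All names and values are illustrative; this is not the gerrit patch.

```php
// Illustrative sketch only -- not the gerrit patch. $timings and $condsUsed are
// assumed to have been collected in memory while the filters ran, e.g.
// [ filterId => seconds ] and [ filterId => conditions used ].
function afRecordProfile( array $timings, array $condsUsed ) {
	global $wgMemc;

	$sampleRate = 50; // profile roughly 1 in 50 edits (illustrative value)
	if ( mt_rand( 1, $sampleRate ) !== 1 ) {
		return; // most edits skip profiling entirely
	}

	$key = wfMemcKey( 'abusefilter', 'profile-all' );
	$profile = $wgMemc->get( $key ) ?: []; // one read per profiled edit

	foreach ( $timings as $filterId => $timeTaken ) {
		if ( !isset( $profile[$filterId] ) ) {
			$profile[$filterId] = [
				'count' => 0,
				'total-time' => 0.0,
				'max-time' => 0.0,
				'total-conds' => 0,
			];
		}
		$profile[$filterId]['count'] += 1;
		$profile[$filterId]['total-time'] += $timeTaken;
		$profile[$filterId]['max-time'] = max( $profile[$filterId]['max-time'], $timeTaken );
		$profile[$filterId]['total-conds'] += $condsUsed[$filterId];
	}

	$wgMemc->set( $key, $profile, 3600 ); // one write, however many filters ran
}
```

Concurrent edits can still lose updates in the get/set window on the shared key; with sampling that loss is probably tolerable, but a stricter version could use something like BagOStuff::merge() or per-time-window keys.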

I'd suggest that reducing the memcache hits to once per edit (rather than several per filter) and possibly sampling (e.g. only profiling every Nth edit) would eliminate most of the lag while also being more useful and effective than on demand profiling.

This seems very sensible. @ori and @aaron, are you open to such a "partial revert" of your flat-out removal?

That's basically a suggestion I made in an email as a possibility (also using DeferredUpdates). I'd still *prefer* on-demand though.
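For reference, a minimal sketch of that deferral, assuming the batched $profile array from the earlier sketch has already been built in memory while the filters ran:

```php
// Sketch: push the single profiling write out of the critical path so it adds
// no latency to saving the edit itself. DeferredUpdates runs the callback
// after the main work of the request, outside the save path.
DeferredUpdates::addCallableUpdate( function () use ( $profile ) {
	global $wgMemc;
	$wgMemc->set( wfMemcKey( 'abusefilter', 'profile-all' ), $profile, 3600 );
} );
```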

We really do need some sort of profiling, especially with some of the very complex filters.