On-wiki consultation about surfacing AbuseFilter perf. measurements to Edit filter managers
Closed, Resolved (Public)

Description

Topics to discuss with Edit filter managers on ENWP and Meta:

  • How can we surface data about the slowest filters? (Currently logged in Logstash, which is permissioned)
  • How can we surface data about the number of conditions being hit in the Special:AbuseFilter interface? (Currently in Grafana, not permissioned.)

Event Timeline

@kaldari @MusikAnimal @SPoore @dbarratt @dmaza — I want to post this on the Edit filter noticeboard. Is it all accurate? Am I walking into any traps? Am I overcommitting what AHT should be responsible for?

-----

Hello Edit filter managers;

Over the past few months the Anti-Harassment Tools team at the Wikimedia Foundation has added several new forms of performance measurement to AbuseFilter to better understand the impact it has on users’ edits.

Now that this data is being measured, I’d like to talk about how we can make it more easily available to you as you manage filters. We can keep this very unintrusive (e.g. just add links), or we could make some more significant changes with your approval and permission. Here are a few ideas to start the discussion:

  • Expand the “Of the last N actions…” sentence to include the average condition count.
  • Add a “Statistics” section on Special:AbuseFilter and show all the existing statistics as well as new data from Grafana and Logstash.
  • Display an icon by the slow filters on the All filters list
  • On the Filter parameters page, indicate if the filter is taking over 700 milliseconds.
  • …something else?

Would any of these be helpful to you? Or do you have any suggestions? We won’t make any changes without your consensus, but I want to make our team available if these changes (or similar) would be helpful.

Thank you! — ~~~~

Looks great, and the new features you are proposing are very exciting!

One thing I've been meaning to say... I'm not sure how helpful the per-filter stats have been. I believe @dmaza said it might just be reporting the last run time, not the "average" as it says it does? I say this because I've noticed that filters that regularly show up on the AbuseFilterSlow dashboard still have very low average run times. We might want to look into that first, just to be sure?

Either way, bullets #3 and 4 (surfacing slow filters within the interface) would be a wondrous improvement. I've been working with edit filter managers to improve their filters, and currently they are counting on me to let them know if changes we made helped with performance, since they are unable to tell from the "average" run time.

If this is true then we may want to remove it entirely as it might be doing more harm than good. @dmaza what do you remember about the accuracy of the profiling?

I don't remember, I'll have to check again.

I talked to @dmaza on IRC, and I think we've confirmed it is showing an average run time, and is probably mostly correct. The issue is there might be just one edit every minute or so for which the filter takes a really long time to run -- but the average is still really low. So it's not that it's inaccurate, it's just sometimes misleading. Hence why we need to surface the slow abuse filter data within the Special:AbuseFilter interface. Otherwise the author may never know the filter has any performance issues.

Anyway, the run time average is still useful, so let's not remove it :) If it takes, say, over 2 ms on average, that's a sign the filter is probably not very efficient. I'm actually going to propose a guideline for this on enwiki.
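
To illustrate why a low average can hide an occasional very slow run, here is a minimal Python sketch. It is not AbuseFilter code: the `summarize` helper and the sample data are made up for illustration, and the two thresholds are simply the 700 ms and 2 ms figures mentioned in this discussion, used here as assumptions.

```python
from statistics import mean

# Hypothetical thresholds borrowed from the discussion, not from AbuseFilter itself.
SLOW_RUN_MS = 700.0      # a single run slower than this counts as "slow"
AVG_GUIDELINE_MS = 2.0   # rule-of-thumb ceiling for the average run time


def summarize(run_times_ms):
    """Return the mean, the worst run, and how many runs exceeded SLOW_RUN_MS."""
    avg = mean(run_times_ms)
    slow_runs = [t for t in run_times_ms if t > SLOW_RUN_MS]
    return {
        "mean_ms": avg,
        "max_ms": max(run_times_ms),
        "slow_runs": len(slow_runs),
        "over_avg_guideline": avg > AVG_GUIDELINE_MS,
    }


# Made-up sample: 999 fast runs at 0.5 ms plus one pathological 900 ms run.
sample = [0.5] * 999 + [900.0]
stats = summarize(sample)
print(stats)
```

With this sample the mean stays under the 2 ms guideline even though one run took 900 ms, so a display that shows only the average would not reveal the problem. Counting runs over a threshold, or showing the worst run, would.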

Great! I'll clarify that bullet and post on wiki today.

MusikAnimal responded, but there was no interest from other Edit filter managers.

Marking this ticket as closed.