
Enable AbuseFilter per-filter profiling on Portuguese Wikipedia & monitor if there is a performance impact
Closed, Resolved | Public | 1 Estimated Story Points

Description

Enable on PTWP. We will then monitor Grafana to ensure that performance is not affected.

Event Timeline

Change 383858 had a related patch set uploaded (by Dmaza; owner: Dmaza):
[operations/mediawiki-config@master] Enable $wgAbuseFilterProfile on ptwiki

https://gerrit.wikimedia.org/r/383858

@Dereckson who can I ping to get a +1 from ops?

@kaldari I was told that since this "could" affect performance, I need a +1 from someone in "ops" or, I imagine, performance (?)
Who can I ping about this?

Change 383858 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable $wgAbuseFilterProfile on ptwiki

https://gerrit.wikimedia.org/r/383858

Mentioned in SAL (#wikimedia-operations) [2017-10-19T18:28:31Z] <thcipriani@tin> Synchronized wmf-config/abusefilter.php: SWAT: [[gerrit:383858|Enable $wgAbuseFilterProfile on ptwiki]] T177641 (duration: 00m 50s)

I'm going to calculate the percentage change between the periods before and after this profiling was enabled at 2017-10-19T18:28:31Z.
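The comparison described here could be sketched roughly as follows. This is only an illustration, assuming hourly samples given as (timestamp, milliseconds) pairs and a split at 18:00 UTC on 2017-10-19, the hour containing the deployment. The `split_means` helper and the sample values are hypothetical and are not the real ptwiki export.

```python
from datetime import datetime, timezone
from statistics import mean

# Assumption: split at the start of the hour that contains the deployment.
CUTOFF = datetime(2017, 10, 19, 18, 0, tzinfo=timezone.utc)


def split_means(samples, cutoff=CUTOFF):
    """Return (mean_before, mean_after) for (timestamp, milliseconds) pairs."""
    before = [ms for ts, ms in samples if ts < cutoff]
    after = [ms for ts, ms in samples if ts >= cutoff]
    return mean(before), mean(after)


# Hypothetical hourly values, NOT the real ptwiki data.
samples = [
    (datetime(2017, 10, 19, 16, 0, tzinfo=timezone.utc), 100.0),
    (datetime(2017, 10, 19, 17, 0, tzinfo=timezone.utc), 110.0),
    (datetime(2017, 10, 19, 18, 0, tzinfo=timezone.utc), 112.0),
    (datetime(2017, 10, 19, 19, 0, tzinfo=timezone.utc), 118.0),
]

before_ms, after_ms = split_means(samples)
print(f"before: {before_ms:.2f} ms, after: {after_ms:.2f} ms")
```

A plain mean over hourly buckets weights every hour equally regardless of edit volume, which may or may not be what is wanted for a latency comparison.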

Attached: PTWP October to-date, hourly. I'll be crunching the numbers now. 🏋

Based on the data I exported from Grafana (attached to my previous comment), I calculated the average for all the hours before and after 2017-10-19 at 18:00. The data is in milliseconds.

| | Before (ms) | After (ms) | Difference (ms) | % Change |
| 99th percentile | 109.61 | 117.59 | +7.98 | +7.28% |
| 75th percentile | 54.47 | 55.79 | +1.32 | +2.4% |
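As a quick arithmetic check, the percentage change can be recomputed from the before/after averages reported above. This is a minimal sketch; the `percent_change` helper is a name introduced here for illustration only.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Before/after averages in milliseconds, taken from the figures above.
rows = {
    "99th percentile": (109.61, 117.59),
    "75th percentile": (54.47, 55.79),
}

results = {name: round(percent_change(b, a), 2) for name, (b, a) in rows.items()}

for name, pct in results.items():
    print(f"{name}: {pct:+.2f}%")
```

To two decimal places the 75th percentile comes out at +2.42%, consistent with the rounded +2.4% reported.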

The 75p change looks inconsequential to me. A delay of ~8 ms at 99p doesn't appear to be that significant, and it could be attributed to the fact that User:Silent modified 31 filters (!!!) shortly before or since October 19. Judging purely visually, there are more spikes after October 18, which could be attributed to one or more bad filters:

Screen Shot 2017-10-26 at 1.12.02 PM.png (445×635 px, 72 KB)

My initial thoughts are that re-enabling the per-filter profiling didn't affect the AbuseFilter to a noticeable degree at 75p and the evidence is too fuzzy at 99p, so it is low-risk to re-enable profiling on another wiki, which has more stable filter modification, to monitor and calculate if there is a similar change. I propose this be English Wikipedia.

@dmaza @dbarratt @kaldari @MusikAnimal @He7d3r @aaron — your thoughts?

> My initial thoughts are that re-enabling the per-filter profiling didn't affect the AbuseFilter to a noticeable degree at 75p and the evidence is too fuzzy at 99p, so it is low-risk to re-enable profiling on another wiki, which has more stable filter modification, to monitor and calculate if there is a similar change. I propose this be English Wikipedia.

Based on your findings, I agree. Also because enwiki really wants the per-filter profiling! :)

Thanks to the new logstash reports of slow filters, we've already made a lot of performance improvements. I don't think we'll end up worse than where we were before those improvements, and the per-filter profiling will only make this process easier.

If we enable this, I'd suggest adding the "Slow Filter" data to the per-filter profiling and no longer logging the slow filters to Logstash. That way it is easy for filter editors to find such filters and improve them.

OK, let's do this. We'll take T179323 into the next AHT sprint.

> If we enable this, I'd suggest adding the "Slow Filter" data to the per-filter profiling and no longer logging the slow filters to Logstash. That way it is easy for filter editors to find such filters and improve them.

I've made T179604: Move AbuseFilter slow filters data from Logstash to per-filter profiling to do this.