Page MenuHomePhabricator

Investigate caching of serialisation of StopForumSpam IPSet
Closed, ResolvedPublic

Description

Benchmark: https://gerrit.wikimedia.org/r/824223 and https://gerrit.wikimedia.org/r/824228

Benchmark output:

Running PHP version 8.1.9 (x86_64) on Linux 5.15.0-46-generic #49-Ubuntu SMP Thu Aug 4 18:03:25 UTC 2022

MediaWiki\StopForumSpam\DenyListManager::getIpDenyList('recache')
   count: 10
    rate:      0.5/s
   total: 18196.83ms
    mean:  1819.68ms
     max:  1844.38ms
  stddev:    17.51ms
Current memory usage: 62.00 MiB
   Peak memory usage: 80.09 MiB

MediaWiki\StopForumSpam\DenyListManager::getIpDenyList()
   count: 10
    rate:      1.6/s
   total:  6394.81ms
    mean:   639.48ms
     max:   650.78ms
  stddev:     4.34ms
Current memory usage: 64.00 MiB
   Peak memory usage: 80.09 MiB

MediaWiki\StopForumSpam\DenyListManager::isIpDenyListed('127.0.0.1')
   count: 10
    rate:      6.1/s
   total:  1651.56ms
    mean:   165.16ms
     max:  1651.48ms
  stddev:   522.24ms
Current memory usage: 188.00 MiB
   Peak memory usage: 192.00 MiB

Proposed improvement:
https://gerrit.wikimedia.org/r/c/mediawiki/extensions/StopForumSpam/+/822727

Running PHP version 8.1.9 (x86_64) on Linux 5.15.0-46-generic #49-Ubuntu SMP Thu Aug 4 18:03:25 UTC 2022

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet('recache')
   count: 10
    rate:      0.3/s
   total: 28868.37ms
    mean:  2886.84ms
     max:  2935.51ms
  stddev:    38.83ms
Current memory usage: 336.00 MiB
   Peak memory usage: 352.09 MiB

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet()
   count: 10
    rate: 460912.5/s
   total:     0.02ms
    mean:     0.00ms
     max:     0.01ms
  stddev:     0.00ms
Current memory usage: 336.00 MiB
   Peak memory usage: 352.09 MiB

MediaWiki\StopForumSpam\DenyListManager::isIpDenyListed('127.0.0.1')
   count: 10
    rate:  85948.9/s
   total:     0.12ms
    mean:     0.01ms
     max:     0.04ms
  stddev:     0.01ms
Current memory usage: 336.00 MiB
   Peak memory usage: 352.09 MiB

(Noted, tested on PHP 8.1; so possibly not completely representative on PHP 7.2 (or PHP 7.4) as for WMF prod).

Event Timeline

I'm curious why recache is seemingly quite a lot slower... And memory usage is increasing quite a bit...

Reedy renamed this task from Investigate serialisation of StopForumSpam IPSet to Investigate caching of serialisation of StopForumSpam IPSet.Aug 17 2022, 3:57 PM
Reedy added a subscriber: sbassett.

Change 824249 had a related patch set uploaded (by Addshore; author: Addshore):

[mediawiki/extensions/StopForumSpam@master] Benchmark using getIpDenyListSet

https://gerrit.wikimedia.org/r/824249

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/StopForumSpam/+/824246 PS3:

Running PHP version 8.1.9 (x86_64) on Linux 5.15.0-46-generic #49-Ubuntu SMP Thu Aug 4 18:03:25 UTC 2022

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet('recache')
   count: 10
    rate:      0.4/s
   total: 28388.12ms
    mean:  2838.81ms
     max:  3000.22ms
  stddev:    91.09ms
Current memory usage: 188.00 MiB
   Peak memory usage: 208.12 MiB

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet()
   count: 10
    rate: 665762.5/s
   total:     0.02ms
    mean:     0.00ms
     max:     0.01ms
  stddev:     0.00ms
Current memory usage: 188.00 MiB
   Peak memory usage: 208.12 MiB

MediaWiki\StopForumSpam\DenyListManager::isIpDenyListed('127.0.0.1')
   count: 10
    rate:  88862.4/s
   total:     0.11ms
    mean:     0.01ms
     max:     0.03ms
  stddev:     0.01ms
Current memory usage: 188.00 MiB
   Peak memory usage: 208.12 MiB

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/StopForumSpam/+/824249/2 PS2:

Running PHP version 8.1.9 (x86_64) on Linux 5.15.0-46-generic #49-Ubuntu SMP Thu Aug 4 18:03:25 UTC 2022

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet('recache')
   count: 10
    rate:      0.4/s
   total: 28488.23ms
    mean:  2848.82ms
     max:  2993.45ms
  stddev:    61.00ms
Current memory usage: 334.00 MiB
   Peak memory usage: 354.12 MiB

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet()
   count: 10
    rate: 511500.5/s
   total:     0.02ms
    mean:     0.00ms
     max:     0.01ms
  stddev:     0.00ms
Current memory usage: 334.00 MiB
   Peak memory usage: 354.12 MiB

MediaWiki\StopForumSpam\DenyListManager::isIpDenyListed('127.0.0.1')
   count: 10
    rate:  79437.6/s
   total:     0.13ms
    mean:     0.01ms
     max:     0.05ms
  stddev:     0.01ms
Current memory usage: 334.00 MiB
   Peak memory usage: 354.12 MiB

So that explains the memory difference

Benchmarking using a local list now, I believe previous benchmarks were using the internet download

benchmark of current master

Running PHP version 7.4.20 (x86_64) on Linux 5.10.16.3-microsoft-standard-WSL2 #1 SMP Fri Apr 2 22:23:49 UTC 2021

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet('recache')
   count: 10
    rate:      0.6/s
   total: 18011.41ms
    mean:  1801.14ms
     max:  2180.61ms
  stddev:   143.55ms
Current memory usage: 108.00 MiB
   Peak memory usage: 112.01 MiB

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet()
   count: 10
    rate: 146654.0/s
   total:     0.07ms
    mean:     0.01ms
     max:     0.02ms
  stddev:     0.00ms
Current memory usage: 108.00 MiB
   Peak memory usage: 112.01 MiB

MediaWiki\StopForumSpam\DenyListManager::isIpDenyListed('127.0.0.1')
   count: 10
    rate:   3292.5/s
   total:     3.04ms
    mean:     0.30ms
     max:     0.90ms
  stddev:     0.24ms
Current memory usage: 108.00 MiB
   Peak memory usage: 112.01 MiB

And with my alteration to caching

Running PHP version 7.4.20 (x86_64) on Linux 5.10.16.3-microsoft-standard-WSL2 #1 SMP Fri Apr 2 22:23:49 UTC 2021

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet('recache')
   count: 10
    rate:      0.5/s
   total: 19592.29ms
    mean:  1959.23ms
     max:  2405.75ms
  stddev:   199.78ms
Current memory usage: 184.00 MiB
   Peak memory usage: 186.00 MiB

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet()
   count: 10
    rate: 471270.1/s
   total:     0.02ms
    mean:     0.00ms
     max:     0.01ms
  stddev:     0.00ms
Current memory usage: 184.00 MiB
   Peak memory usage: 186.00 MiB

MediaWiki\StopForumSpam\DenyListManager::isIpDenyListed('127.0.0.1')
   count: 10
    rate:  30593.0/s
   total:     0.33ms
    mean:     0.03ms
     max:     0.10ms
  stddev:     0.03ms
Current memory usage: 184.00 MiB
   Peak memory usage: 186.00 MiB

Change 824249 merged by jenkins-bot:

[mediawiki/extensions/StopForumSpam@master] Benchmark using getIpDenyListSet

https://gerrit.wikimedia.org/r/824249

Running with the list used in WMF production, and 50 iterations

Current master:

Running PHP version 7.4.20 (x86_64) on Linux 5.10.16.3-microsoft-standard-WSL2 #1 SMP Fri Apr 2 22:23:49 UTC 2021

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet('recache')
   count: 50
    rate:      0.3/s
   total: 198666.90ms
    mean:  3973.34ms
     max:  6198.08ms
  stddev:   660.25ms
Current memory usage: 322.00 MiB
   Peak memory usage: 336.01 MiB

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet()
   count: 50
    rate: 183157.4/s
   total:     0.27ms
    mean:     0.01ms
     max:     0.07ms
  stddev:     0.01ms
Current memory usage: 322.00 MiB
   Peak memory usage: 336.01 MiB

MediaWiki\StopForumSpam\DenyListManager::isIpDenyListed('127.0.0.1')
   count: 50
    rate:  46613.7/s
   total:     1.07ms
    mean:     0.02ms
     max:     0.32ms
  stddev:     0.05ms
Current memory usage: 322.00 MiB
   Peak memory usage: 336.01 MiB

The alternative caching patch I made

Running PHP version 7.4.20 (x86_64) on Linux 5.10.16.3-microsoft-standard-WSL2 #1 SMP Fri Apr 2 22:23:49 UTC 2021

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet('recache')
   count: 50
    rate:      0.2/s
   total: 200493.45ms
    mean:  4009.87ms
     max:  6254.53ms
  stddev:   647.47ms
Current memory usage: 322.00 MiB
   Peak memory usage: 328.01 MiB

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet()
   count: 50
    rate: 493447.5/s
   total:     0.10ms
    mean:     0.00ms
     max:     0.01ms
  stddev:     0.00ms
Current memory usage: 322.00 MiB
   Peak memory usage: 328.01 MiB

MediaWiki\StopForumSpam\DenyListManager::isIpDenyListed('127.0.0.1')
   count: 50
    rate:  40912.1/s
   total:     1.22ms
    mean:     0.02ms
     max:     0.34ms
  stddev:     0.05ms
Current memory usage: 322.00 MiB
   Peak memory usage: 328.01 MiB

Master as of writing...

Running PHP version 8.1.9 (x86_64) on Linux 5.15.0-46-generic #49-Ubuntu SMP Thu Aug 4 18:03:25 UTC 2022

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet('recache')
   count: 10
    rate:      0.3/s
   total: 29219.55ms
    mean:  2921.95ms
     max:  3243.30ms
  stddev:   132.63ms
Current memory usage: 334.00 MiB
   Peak memory usage: 354.12 MiB

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet()
   count: 10
    rate: 616809.4/s
   total:     0.02ms
    mean:     0.00ms
     max:     0.01ms
  stddev:     0.00ms
Current memory usage: 334.00 MiB
   Peak memory usage: 354.12 MiB

MediaWiki\StopForumSpam\DenyListManager::isIpDenyListed('127.0.0.1')
   count: 10
    rate:  69672.8/s
   total:     0.14ms
    mean:     0.01ms
     max:     0.08ms
  stddev:     0.02ms
Current memory usage: 334.00 MiB
   Peak memory usage: 354.12 MiB

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/StopForumSpam/+/822727 PS12

Running PHP version 8.1.9 (x86_64) on Linux 5.15.0-46-generic #49-Ubuntu SMP Thu Aug 4 18:03:25 UTC 2022

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet('recache')
   count: 10
    rate:      0.3/s
   total: 29357.46ms
    mean:  2935.75ms
     max:  2989.53ms
  stddev:    31.30ms
Current memory usage: 332.00 MiB
   Peak memory usage: 352.12 MiB

MediaWiki\StopForumSpam\DenyListManager::getIpDenyListSet()
   count: 10
    rate: 487709.8/s
   total:     0.02ms
    mean:     0.00ms
     max:     0.01ms
  stddev:     0.00ms
Current memory usage: 332.00 MiB
   Peak memory usage: 352.12 MiB

MediaWiki\StopForumSpam\DenyListManager::isIpDenyListed('127.0.0.1')
   count: 10
    rate:  77672.3/s
   total:     0.13ms
    mean:     0.01ms
     max:     0.04ms
  stddev:     0.01ms
Current memory usage: 332.00 MiB
   Peak memory usage: 352.12 MiB

There doesn't seem to be much difference?

master

$ MW_INSTALL_PATH=/var/www/wiki/mediawiki/core php8.1 maintenance/updateDenyList.php
Starting update of SFS deny list in cache...
/var/www/wiki/mediawiki/extensions/StopForumSpam/includes/DenyListManager.php:166:
int(1033216)
Done! Loaded 66500 IPs.
Took 1.766371011734 seconds
$ git diff
diff --git a/includes/DenyListManager.php b/includes/DenyListManager.php
index 81f3b0e..867b0e1 100644
--- a/includes/DenyListManager.php
+++ b/includes/DenyListManager.php
@@ -163,6 +163,7 @@ class DenyListManager {
 
                        // Refill the local server cache if the list is not empty nor a placeholder
                        if ( $flatIpList !== '' ) {
+var_dump( strlen( $flatIpList ) );
                                $srvCache->set(
                                        $srvCacheKey,
                                        $flatIpList,

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/StopForumSpam/+/822727 PS12

$ MW_INSTALL_PATH=/var/www/wiki/mediawiki/core php8.1 maintenance/updateDenyList.php
Starting update of SFS deny list in cache...
/var/www/wiki/mediawiki/extensions/StopForumSpam/includes/DenyListManager.php:200:
int(3752062)
Done! Loaded 66500 IP addresses.
Set tree size: ipv4=566572 ipv6=178070
Took 2.7651309967041 seconds
$ git diff
diff --git a/includes/DenyListManager.php b/includes/DenyListManager.php
index 830497d..ead4b8b 100644
--- a/includes/DenyListManager.php
+++ b/includes/DenyListManager.php
@@ -197,6 +197,7 @@ class DenyListManager {
         * @param IPSet $set
         */
        private function setIPSetInSrvCache( IPSet $set ): void {
+var_dump( strlen( json_encode( $set->jsonSerialize() ) ) );
                $this->srvCache->set(
                        $this->getDenyListKey( $this->srvCache ),
                        $set->jsonSerialize(),
Before1033216
After3752062

So 3-4x the size in cache, in exchange for the speed improvement.
If desired I could probably rework this into a temporary switched mode? We could even try it in wmf production then too?

So 3-4x the size in cache, in exchange for the speed improvement.

There doesn't seem to be any noticeable (or significant) speed increase?

Yeah, today on my local machine I am not able to get the benchmark to show a speed improvement, however yesterday I did.

Looking at the profiling information we captured before trying to figure out why edits were slow in T309070#8097050

image.png (1×1 px, 176 KB)

This actually shows that IPSets are constructed 11 times during the call, i assume with the same list.
So perhaps reducing the number of times the IPSet is created would make more sense than caching the IPSet between calls?

Note, these profiles are for https://github.com/wbstack/mediawiki/commit/cfdc38ceacc4375e396f314629168089cf7f7995 which is 1.37 currently, not master etc.

Looking at this again I notice that SFS DenyListManager looks like it uses a singelton pattern, but it doesnt

	/**
	 * @todo use MediaWikiServices
	 * @return DenyListManager
	 */
	public static function singleton() {
		$services = MediaWikiServices::getInstance();

		$srvCache = $services->getLocalServerObjectCache();
		$wanCache = $services->getMainWANObjectCache();
		$http = $services->getHttpRequestFactory();
		$logger = LoggerFactory::getInstance( 'DenyList' );

		return new self( $http, $srvCache, $wanCache, $logger );
	}

So the in-process caching in DenyListManager mostly does nothing.

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/StopForumSpam/+/824415 Fix DenyListManager singleton [NEW] should bring a good improvement.

Change 824415 had a related patch set uploaded (by Addshore; author: Addshore):

[mediawiki/extensions/StopForumSpam@master] Fix DenyListManager singleton

https://gerrit.wikimedia.org/r/824415

Yeah, this seems to make sense. It's not going to make any difference to the Benchmarks, and that's fine. But I would definitely agree that it looks wrong, and this should be a welcome improvement.

Looking at the hooks alone, it would seem DenyListManager::singleton would get called potentially at least twice (AbuseFilter and permissions checking)... But we know permissions checking can happen many many many(!) times in one request, so that hook for onGetUserPermissionsErrorsExpensive almost certainly runs > 1 time

Change 824415 merged by jenkins-bot:

[mediawiki/extensions/StopForumSpam@master] Fix DenyListManager singleton

https://gerrit.wikimedia.org/r/824415

Change 824184 had a related patch set uploaded (by Reedy; author: Addshore):

[mediawiki/extensions/StopForumSpam@REL1_38] Fix DenyListManager singleton

https://gerrit.wikimedia.org/r/824184

Change 824185 had a related patch set uploaded (by Reedy; author: Addshore):

[mediawiki/extensions/StopForumSpam@REL1_37] Fix DenyListManager singleton

https://gerrit.wikimedia.org/r/824185

Change 824426 had a related patch set uploaded (by Reedy; author: Addshore):

[mediawiki/extensions/StopForumSpam@wmf/1.39.0-wmf.25] Fix DenyListManager singleton

https://gerrit.wikimedia.org/r/824426

Change 824185 merged by jenkins-bot:

[mediawiki/extensions/StopForumSpam@REL1_37] Fix DenyListManager singleton

https://gerrit.wikimedia.org/r/824185

Change 824426 merged by jenkins-bot:

[mediawiki/extensions/StopForumSpam@wmf/1.39.0-wmf.25] Fix DenyListManager singleton

https://gerrit.wikimedia.org/r/824426

Mentioned in SAL (#wikimedia-operations) [2022-08-18T10:45:55Z] <reedy@deploy1002> Synchronized php-1.39.0-wmf.25/extensions/StopForumSpam/includes/: T315447 (duration: 03m 36s)

The loop of 10 is about accurate for how many times this was called for wikibase.cloud it seems.
So I tweaked the benchmarker to actually call the singelton each time the benchmark was called.
See P32507

Without the singelton fix:

Running PHP version 7.4.20 (x86_64) on Linux 5.10.16.3-microsoft-standard-WSL2 #1 SMP Fri Apr 2 22:23:49 UTC 2021

MediaWiki\StopForumSpam\Benchmark::singleGetIpDenyListSet('recache')
   count: 10
    rate:      0.4/s
   total: 24243.76ms
    mean:  2424.38ms
     max:  2655.20ms
  stddev:   160.40ms
Current memory usage: 178.00 MiB
   Peak memory usage: 192.01 MiB

MediaWiki\StopForumSpam\Benchmark::singleGetIpDenyListSet()
   count: 10
    rate:      0.4/s
   total: 25930.97ms
    mean:  2593.10ms
     max:  2867.80ms
  stddev:   118.78ms
Current memory usage: 178.00 MiB
   Peak memory usage: 192.01 MiB

MediaWiki\StopForumSpam\Benchmark::singleIsIpDenyListed('127.0.0.1')
   count: 10
    rate:      0.4/s
   total: 25149.81ms
    mean:  2514.98ms
     max:  2940.96ms
  stddev:   173.85ms
Current memory usage: 178.00 MiB
   Peak memory usage: 192.01 MiB

With the singelton fix:

Running PHP version 7.4.20 (x86_64) on Linux 5.10.16.3-microsoft-standard-WSL2 #1 SMP Fri Apr 2 22:23:49 UTC 2021

MediaWiki\StopForumSpam\Benchmark::singleGetIpDenyListSet('recache')
   count: 10
    rate:      0.4/s
   total: 28555.19ms
    mean:  2855.52ms
     max:  3787.02ms
  stddev:   395.53ms
Current memory usage: 322.00 MiB
   Peak memory usage: 336.01 MiB

MediaWiki\StopForumSpam\Benchmark::singleGetIpDenyListSet()
   count: 10
    rate: 395689.1/s
   total:     0.03ms
    mean:     0.00ms
     max:     0.01ms
  stddev:     0.00ms
Current memory usage: 322.00 MiB
   Peak memory usage: 336.01 MiB

MediaWiki\StopForumSpam\Benchmark::singleIsIpDenyListed('127.0.0.1')
   count: 10
    rate:  85077.2/s
   total:     0.12ms
    mean:     0.01ms
     max:     0.04ms
  stddev:     0.01ms
Current memory usage: 322.00 MiB
   Peak memory usage: 336.01 MiB

And then with the other patch (caching IpSet) that I wrote applied on top

Running PHP version 7.4.20 (x86_64) on Linux 5.10.16.3-microsoft-standard-WSL2 #1 SMP Fri Apr 2 22:23:49 UTC 2021

MediaWiki\StopForumSpam\Benchmark::singleGetIpDenyListSet('recache')
   count: 10
    rate:      0.3/s
   total: 34175.41ms
    mean:  3417.54ms
     max:  4698.52ms
  stddev:   788.34ms
Current memory usage: 322.00 MiB
   Peak memory usage: 328.01 MiB

MediaWiki\StopForumSpam\Benchmark::singleGetIpDenyListSet()
   count: 10
    rate: 343795.4/s
   total:     0.03ms
    mean:     0.00ms
     max:     0.01ms
  stddev:     0.00ms
Current memory usage: 322.00 MiB
   Peak memory usage: 328.01 MiB

MediaWiki\StopForumSpam\Benchmark::singleIsIpDenyListed('127.0.0.1')
   count: 10
    rate:  73455.4/s
   total:     0.14ms
    mean:     0.01ms
     max:     0.06ms
  stddev:     0.01ms
Current memory usage: 322.00 MiB
   Peak memory usage: 328.01 MiB

I still potentially occasionally see some performance gain, but am not going to look into it any more.
I'll abandon that other patch and leave it for the future incase we come back to it

Change 824184 merged by jenkins-bot:

[mediawiki/extensions/StopForumSpam@REL1_38] Fix DenyListManager singleton

https://gerrit.wikimedia.org/r/824184

Addshore claimed this task.