
Use IPSet in TorBlock?
Open, Needs Triage, Public

Description

TorBlock is still using in_array() to check if an IP is an exit node.

return in_array( IPUtils::sanitizeIP( $ip ), self::getExitNodes() );

Depending on the size of the dataset (2136 items as of late September 2022), it would probably make sense, performance-wise, to use IPSet instead.
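
For illustration, a minimal sketch of the two lookup styles side by side. It assumes the wikimedia/ip-set library is installed via Composer; the `$exitNodes` list is a hypothetical stand-in using documentation-only addresses, not real exit node data, and this is not the actual TorBlock change.

```php
<?php
// Assumption: `composer require wikimedia/ip-set` has been run in this directory.
require_once __DIR__ . '/vendor/autoload.php';

use Wikimedia\IPSet;

// Hypothetical stand-in for the exit node list (documentation-only addresses).
$exitNodes = [ '192.0.2.10', '198.51.100.7', '203.0.113.42' ];

// Current approach: in_array() scans the list linearly on every check.
$viaInArray = in_array( '198.51.100.7', $exitNodes );

// Proposed approach: build the set once, then reuse it for every lookup.
$set = new IPSet( $exitNodes );
$viaIPSet = $set->match( '198.51.100.7' );

// An address that is not in the list should not match.
$loopback = $set->match( '127.0.0.1' );

var_dump( $viaInArray, $viaIPSet, $loopback );
```

Building the IPSet has its own cost, so the gain only shows up if the set is constructed once and reused (or cached) rather than rebuilt per check.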

Event Timeline

Change 834707 had a related patch set uploaded (by Reedy; author: Reedy):

[mediawiki/extensions/TorBlock@master] Benchmark: Compare in_array() and IPSet

https://gerrit.wikimedia.org/r/834707

Using an already set-up IPSet, it's definitely quicker (roughly 7x the lookup rate in this 10-iteration benchmark):

in_array( '127.0.0.1', array( 2136 ) )
   count: 10
    rate:  17916.7/s
   total:     0.56ms
    mean:     0.06ms
     max:     0.07ms
  stddev:     0.00ms
Current memory usage: 80.00 MiB
   Peak memory usage: 87.91 MiB

Wikimedia\IPSet::match( '127.0.0.1' )
   count: 10
    rate: 123725.8/s
   total:     0.08ms
    mean:     0.01ms
     max:     0.03ms
  stddev:     0.01ms
Current memory usage: 80.00 MiB
   Peak memory usage: 87.91 MiB

Now that T303765 (Create a way to store an IPSet) has been resolved, we can store the JSON-serialized IPSet in the cache instead of the PHP array.
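
A rough sketch of what that round trip might look like. It assumes a wikimedia/ip-set release that includes the JSON support from T303765 (`JsonSerializable` plus `IPSet::newFromJson()`); the `$cache` array and its key are hypothetical stand-ins for whatever cache backend TorBlock actually uses.

```php
<?php
// Assumption: `composer require wikimedia/ip-set` has been run in this directory,
// with a version that provides IPSet::newFromJson().
require_once __DIR__ . '/vendor/autoload.php';

use Wikimedia\IPSet;

// Hypothetical in-memory stand-in for a real cache backend.
$cache = [];

// Build the set once from the (hypothetical) exit node list.
$set = new IPSet( [ '192.0.2.10', '198.51.100.7' ] );

// Store the serialized set instead of the raw PHP array.
$cache['torblock-exit-nodes'] = json_encode( $set );

// Later: restore the set from the cached JSON without rebuilding it from the list.
$restored = IPSet::newFromJson( $cache['torblock-exit-nodes'] );

$hit = $restored->match( '192.0.2.10' );
$miss = $restored->match( '127.0.0.1' );

var_dump( $hit, $miss );
```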

Change 834707 merged by jenkins-bot:

[mediawiki/extensions/TorBlock@master] Benchmark: Compare in_array() and IPSet

https://gerrit.wikimedia.org/r/834707