StopForumSpam maintenance script is not loading all IPs from stopforumspam.org deny list
Closed, Invalid | Public

Description

During today's StopForumSpam production config deploy (T273220), @Reedy and I noticed that the updateDenyList.php maintenance script, when run from mwdebug1002, didn't seem to populate wanCache with every IP value from the configured stopforumspam.com deny list (https://www.stopforumspam.com/downloads/listed_ip_90_ipv46_all.gz). It appears to have populated the cache with 66,454 IPs, while the deny list contains 194,719 IPs. So:

  1. Is there a bug in the maintenance script that's limiting the import in some way? Maybe verify on beta, where the extension has been deployed for several months.
  2. Is there some hard cap in production for cache imports that we're unaware of? Should these be batched in some way then?

Event Timeline

Looks like this is very likely due to the score threshold check during the import: rows whose score falls below the threshold are skipped rather than cached. Will confirm and then likely add some clarifying stats messages to the maint script to avoid confusion and concern.
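
For illustration, a score threshold check of this kind could be sketched as below. This is not the extension's actual PHP code: the function name `load_deny_list`, the assumed row layout (IP, score, last-seen timestamp), and the threshold value of 5 are all hypothetical and only show how rows can be skipped and counted during an import.

```python
import csv


def load_deny_list(lines, threshold):
    """Return (kept_ips, skipped_count) for CSV rows of the form ip,score,...

    Hypothetical helper: rows with a score below `threshold`, and rows
    that cannot be parsed, are counted as skipped instead of being kept.
    """
    kept = []
    skipped = 0
    for row in csv.reader(lines):
        if len(row) < 2:
            # Blank or malformed row: nothing usable to import.
            skipped += 1
            continue
        ip, raw_score = row[0], row[1]
        try:
            score = int(raw_score)
        except ValueError:
            skipped += 1
            continue
        if score < threshold:
            # Below the configured threshold: do not cache this IP.
            skipped += 1
            continue
        kept.append(ip)
    return kept, skipped


# Made-up sample rows using documentation-reserved addresses.
sample = [
    '"192.0.2.1","12","2022-08-01 10:00:00"',
    '"192.0.2.2","1","2022-08-02 11:00:00"',
    '"2001:db8::1","7","2022-08-03 12:00:00"',
    '"192.0.2.3","3","2022-08-04 13:00:00"',
]
kept_ips, skipped_rows = load_deny_list(sample, threshold=5)
print(f"kept {len(kept_ips)} IPs, skipped {skipped_rows} rows")
```

With a filter like this, the gap reported in the description (194,719 listed IPs versus 66,454 cached, a difference of 128,265) would be expected behaviour rather than a failed import, and logging the skipped count makes that visible.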

Change 824253 had a related patch set uploaded (by Reedy; author: Reedy):

[mediawiki/extensions/StopForumSpam@master] DenyListManager: Info log number of skipped rows

https://gerrit.wikimedia.org/r/824253

Change 824253 merged by jenkins-bot:

[mediawiki/extensions/StopForumSpam@master] DenyListManager: Info log number of skipped rows

https://gerrit.wikimedia.org/r/824253

sbassett closed this task as Invalid. (Edited Aug 18 2022, 7:41 PM)
sbassett assigned this task to Reedy.
sbassett moved this task from Backlog to Done on the MediaWiki-extensions-StopForumSpam board.

Setting as invalid, as the initial assumptions about this issue were wrong. The maintenance script was actually working exactly as expected. Thanks to @Reedy, though, for adding some additional logging to remind us why this data discrepancy exists :)