As discussed during the recent stewards call on 2022-06-01, I'm creating this task and inviting stewards (and potentially other interested parties) to help determine the set of initial wikis for which to deploy ext:StopForumSpam in report-only mode. As folks may or may not know, ext:StopForumSpam has been running fairly well in enforce mode on beta since 2022-03-28 (T304111) and in report-only mode for many months prior to that. I'd like to suggest some open discussion on the following topics for two weeks (until 2022-06-17) and then move towards a vote for the initial wikis and deployment date.
- Which production wikis make the most sense to include for this initial deployment? Group 0? Some combination of group 0 and group 1? Some other combination?
- How many wikis should we include within this first group? Somewhere between 5 and 15 sounds like a reasonable, if arbitrary, number: it is likely enough to get good statistics from ext:StopForumSpam's log data without being too onerous to monitor or to deal with emergent issues.
- I assume we would only want to include public wikis.
- Should we consider another round of wikis for a report-only test deployment if this one proves successful? Or just move it to report-only mode on all wikis?
- How long should we plan to keep ext:StopForumSpam in report-only mode? I think the likely constraint here is log retention, which I believe is 90 days on the servers and roughly 30 days in Logstash.
- I had planned to analyze the efficacy of ext:StopForumSpam with a few different metrics:
  - The log data generated by ext:StopForumSpam, e.g. total blocks, total exemptions, and total cases where the IP could not be determined.
  - Simple survey data from stewards, admins, etc. on the initial group of wikis about their experiences before and after ext:StopForumSpam was deployed in enforce mode, e.g. how bad is the spam/vandalism on your wiki?
  - The volume of reports from blocked individuals (false positives) and the time/effort admins and others spend managing these reports.
Is there anything else that should be considered from a data/analysis standpoint? This data would ostensibly be used to guide future deployments or moving to enforce mode for ext:StopForumSpam.
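To make the log-data metric a bit more concrete, here is a minimal sketch of how the three counts mentioned above (blocks, exemptions, undetermined IPs) could be tallied per wiki. Everything about the record shape is an assumption for illustration: the `wiki` and `event` field names, the event labels, and the sample records are hypothetical and do not reflect the actual log format ext:StopForumSpam emits.

```python
from collections import Counter, defaultdict

# Hypothetical event labels; the real log messages may be worded differently.
EVENTS = ("blocked", "exempted", "ip_undetermined")


def tally_events(records):
    """Count the events of interest per wiki, ignoring anything else.

    `records` is an iterable of dicts with assumed keys "wiki" and "event".
    """
    per_wiki = defaultdict(Counter)
    for record in records:
        event = record.get("event")
        wiki = record.get("wiki")
        if event in EVENTS and wiki is not None:
            per_wiki[wiki][event] += 1
    # Convert to plain dicts so the result is easy to print or serialize.
    return {wiki: dict(counts) for wiki, counts in per_wiki.items()}


# Made-up sample records, purely for illustration.
sample = [
    {"wiki": "testwiki", "event": "blocked"},
    {"wiki": "testwiki", "event": "blocked"},
    {"wiki": "testwiki", "event": "exempted"},
    {"wiki": "mediawikiwiki", "event": "ip_undetermined"},
    {"wiki": "mediawikiwiki", "event": "unrelated"},
]

if __name__ == "__main__":
    print(tally_events(sample))
```

Per-wiki totals like these could then be compared across the initial group of wikis, or against the survey and false-positive report data, once we know what the real log entries look like.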
I understand that this is a lot to consider and that the important piece is to pick a set of wikis and determine a deployment date, so thanks in advance for any feedback!