Determine initial wikis for production deployment of ext:StopForumSpam
Closed, Resolved | Public | Security

Description

Hey everybody-

As discussed during the recent stewards call on 2022-06-01, I'm creating this task and inviting stewards (and potentially other interested parties) to help determine the set of initial wikis for which to deploy ext:StopForumSpam in report-only mode. As folks may or may not know, ext:StopForumSpam has been running fairly well in enforce mode on beta since 2022-03-28 (T304111) and in report-only mode for many months prior to that. I'd like to suggest some open discussion on the following topics for two weeks (until 2022-06-17) and then move towards a vote for the initial wikis and deployment date.

Discussion topics

  1. Which production wikis make the most sense to include for this initial deployment? Group 0? Some combination of group 0 and group 1? Some other combination?
  2. How many wikis should we include within this first group? Somewhere between 5 and 15 sounds like a good, arbitrary number in that it is likely enough to get good statistics from ext:StopForumSpam's log data while not being too onerous to monitor and deal with emergent issues.
  3. I assume we would only want to include public wikis.
  4. Should we consider another round of wikis for a report-only test deployment if this one proves successful? Or just move it to report-only mode on all wikis?
  5. How long should we plan to keep ext:StopForumSpam in report-only mode? I think the likely constraints here are log data, which is 90 days on server and 30-ish days in logstash, I believe.
  6. I had planned to analyze the efficacy of ext:StopForumSpam with a few different metrics:
    1. The log data generated by ext:StopForumSpam, e.g. total blocks, total exemptions, total IPs not able to be determined.
    2. Simple survey data from stewards, admins, etc. on the initial group of wikis about their experiences before and after ext:StopForumSpam was deployed in enforce mode, e.g. how bad is the spam/vandalism on your wiki?
    3. The volume of reports from blocked individuals (false positives) and the time/effort to manage these reports by admins, etc.

Is there anything else that should be considered from a data/analysis standpoint? This data would ostensibly be used to guide future deployments or moving to enforce mode for ext:StopForumSpam.

I understand that this is a lot to consider and that the important piece is to pick a set of wikis and determine a deployment date, so thanks in advance for any feedback!

Details

Risk Rating
Informational
Author Affiliation
WMF Technology Dept

Event Timeline

sbassett triaged this task as Medium priority. Jun 3 2022, 9:14 PM
sbassett moved this task from Backlog to In Progress on the user-sbassett board.
sbassett changed Risk Rating from N/A to Informational.
sbassett updated the task description.

My replies:

  1. a combination, with the idea of covering both wikis where the spam is in the wiki's own language and wikis where it is in a foreign one. Also, wikis where spam is a huge % of overall edits (e.g. mediawikiwiki) and those where it is not (e.g. some major wiki). Some wikivoyage should also be included, given that wikivoyages actually welcome "spammish" behaviors.
  2. one for each combination of the above-mentioned "dimensions" (EN/non-EN, spammed/non-spammed, different projects), which would be maybe 8 or 12
  3. yup
  4. another round wouldn't be needed, theoretically, but it probably depends on the first round outcome
  5. 90, but let's reconsider it halfway, within, let's say, 45 days
  6. metrics seem fine

I'd like to understand how it interacts with AbuseFilter, though.

@sbassett I think we could determine an initial set of trial wikis by actually looking at some global filters designed to stop spam, and add the top 10-ish wikis with the most filter hits for a period of e.g. 90 days. If my query at https://quarry.wmcloud.org/query/64424 for AF no. 72 is correct, I'd say we can safely add some of these, excluding Wikidata and ptwiki or other "large" wikis for now.

I like this approach, if you don't think 72 wikis are too much to deal with. Though I suppose there likely wouldn't be too much to deal with in report-only mode, unless the entire extension were having unforeseen issues.

(I think Marco was talking about abuse filter number 72, not 72 wikis)

Ah yes, I clicked on the quarry link, saw the long list of results and the number 72. Turns out there appear to be around 190 wikis, and yes, I suppose we could just go with a subset of these and exclude large projects.

Hey all -

Just a reminder that I'd like to close discussion on this tomorrow, 2022-06-17. It seems like we have a reasonable path forward with @Vituzzu and @MarcoAurelio's suggestions above, so I'd like to put that to a vote (whatever that looks like) on this task early next week. Thanks.

Hello @sbassett:

Seeing there have been no further comments, I re-ran the query for the period of March 1 to June 30 to get some fresh data: https://quarry.wmcloud.org/query/65744

According to the results, I propose that we start the Production trial in the following wikis:

  • dkwikimedia
  • enwikibooks
  • eswikinews
  • ptwikiversity
  • azwikibooks
  • enwikiquote
  • tawiktionary
  • frwikinews
  • svwikiquote
  • enwikisource
  • eswikiversity
  • jawiktionary
  • ptwikibooks
  • plwikiquote

enwiki(books|quote) are probably the most active of the set, so we may want to remove these from the set for this initial test, considering that SFS apparently generates lots of logspam.

Note: It'd be great if someone with access could run the same query in production to see which afl_wiki is hitting the filter 862 times. It's displayed as NULL in the query results. I suspect it's metawiki, but then it should say so, not NULL?

Regards.

I agree with Marco's proposed list for the initial trial.

Here you go (truncated to keep the comment short, but happy to provide the full result if that is useful):

mysql:research@dbstore1003.eqiad.wmnet [metawiki]> SELECT
    ->   afl_wiki,
    ->   COUNT(*) AS "Hits"
    -> FROM
    ->   abuse_filter_log
    -> WHERE
    ->   afl_filter_id = 72
    ->   AND afl_timestamp >= 20220301000000
    ->   AND afl_timestamp <= 20220630235959
    -> GROUP BY
    ->   afl_wiki ASC
    -> LIMIT 10;
+---------------+------+
| afl_wiki      | Hits |
+---------------+------+
| NULL          |  862 |
| afwiki        |    1 |
| amiwiki       |    3 |
| amwiktionary  |    2 |
| apiportalwiki |    4 |
| arwikimedia   |    4 |
| arzwiki       |    3 |
| astwiktionary |    7 |
| aywiki        |    3 |
| aywiktionary  |    6 |
+---------------+------+
10 rows in set (0.780 sec)

mysql:research@dbstore1003.eqiad.wmnet [metawiki]>

FTR, the view in cloud should only filter out rows with afl_deleted=1.


LGTM too. FWIW, my still-in-force T250887 mitigations share several wikis from the list Marco prepared (P30812), so let's go with it.

Hey all - thanks for all of the feedback and additional research on this. I plan to mention this briefly on the stewards call tomorrow, but it sounds like we have some reasonable consensus around @MarcoAurelio's plan from T309900#8043532. If there isn't any strong dissent, I think we can pursue that list as the first batch of candidate wikis to deploy StopForumSpam in report-only mode this quarter.

Thanks @Urbanecm - No luck identifying which afl_wiki = NULL is then?

Not from my side. Tbh, I don't see how that would even happen in AbuseFilter. Maybe @Daimona happens to know?

afl_wiki is NULL for local AbuseFilter hits (i.e., on meta); afl_wiki is always omitted for local hits. You may want to add afl_wiki IS NOT NULL to the WHERE conditions in the query above. (Also, ORDER BY is missing.)
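
For illustration, a sketch of the query above with both changes applied. The filter ID and date range are copied from the earlier query; sorting by hit count in descending order is an assumption, since the intended ordering is not stated here.

```sql
-- Sketch only: same filter (72) and date range as the query above.
SELECT
  afl_wiki,
  COUNT(*) AS Hits
FROM
  abuse_filter_log
WHERE
  afl_filter_id = 72
  -- Exclude local hits on meta, which are stored with afl_wiki = NULL.
  AND afl_wiki IS NOT NULL
  AND afl_timestamp >= 20220301000000
  AND afl_timestamp <= 20220630235959
GROUP BY
  afl_wiki
-- Assumed ordering: wikis with the most filter hits first.
ORDER BY
  Hits DESC
LIMIT 10;
```

With an explicit ORDER BY, the LIMIT returns the ten wikis with the most hits rather than the first ten in alphabetical order.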

Oh, okay. Thanks for the clarification.

Per our meeting from a couple of days ago I understand we have an agreement to do an initial test with the following 10 small wikis:

  • azwikibooks
  • dkwikimedia
  • eswikinews
  • eswikiversity
  • frwikinews
  • jawiktionary
  • plwikiquote
  • ptwikibooks
  • ptwikiversity
  • svwikiquote

I've excluded en.* projects from this proposal as they're high traffic wikis, in comparison to the rest, and would probably cause too much Logstash spam.

@MarcoAurelio - yes, that list looks good! If there are no objections, I can make this task public and add this as part of the steps for T273220, which can then finally be scheduled, likely within the next month or two.

sbassett changed the visibility from "Custom Policy" to "Public (No Login Required)". Aug 17 2022, 2:30 AM
sbassett changed the edit policy from "Custom Policy" to "All Users".

Change 823789 had a related patch set uploaded (by SBassett; author: SBassett):

[operations/mediawiki-config@master] Enable StopForumSpam on initial candidate projects

https://gerrit.wikimedia.org/r/823789

Hey all -

I know this task is resolved, but I wanted to follow up here since everybody's already subbed :) ext:StopForumSpam has been enabled in report-only mode for about 4 months now on the projects mentioned within T309900#8065511. There is a logstash dashboard for it (for those who have access) with the latest stats being:

Total blocked over past 30 days: 1,036,622

Top 5 projects over the past 30 days:

project              "blocked" actions
ja.wiktionary.org    168,348
dk.wikimedia.org     147,806
az.wikibooks.org     140,327
es.wikinews.org      131,705
sv.wikiquote.org     104,705

I know that @Urbanecm had wanted the functionality described in T322263 added, and I still plan to work on T316963, but would there be any serious blockers to putting ext:StopForumSpam into enforcing mode (e.g. non-read action blocking) on this initial set of projects in the near future? If there are, I would like to capture those here so as to analyze the future viability of the extension within Wikimedia production. Thanks!

For me, that’s hard to judge without seeing what would get blocked. I tried
reviewing the StopForumSpam logs, but I wasn’t able to find an edit that
would actually have been blocked (the edits I found were already blocked by
a different mechanism).

Having a revision ID present (as T322263 suggests) would make it easy to
filter the logs, and review only the newly-blocked edits.

Others might have a different opinion, but I find it hard to evaluate the
performance of SFS with the currently available (meta)data.