Better defaults for preventing spam
Open, Medium, Public

Description

A vanilla MediaWiki install allows anonymous users to create pages and talk pages without any restrictions or rate limiting, while registered users are instantly promoted to autoconfirmed status because $wgAutoConfirmAge and $wgAutoConfirmCount both default to 0. There are no restrictions, rate limits, blacklists, or even CAPTCHAs to stop new or anonymous users from inserting external links, creating pages, or completing any other action. I can find a fresh vanilla install right now via any search engine and begin creating thousands of pages containing a backlink to my website or some malicious URL, all without any difficulty.
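For illustration, raising the two autoconfirm thresholds in LocalSettings.php could look roughly like the fragment below. The specific values (4 days, 10 edits) are assumptions chosen for the example, not proposed defaults.

```php
// LocalSettings.php fragment (illustrative values only, not proposed defaults)

// Minimum account age, in seconds, before a user becomes autoconfirmed.
$wgAutoConfirmAge = 4 * 24 * 3600; // 4 days

// Minimum number of edits before a user becomes autoconfirmed.
$wgAutoConfirmCount = 10;
```

With both values left at 0, every account is autoconfirmed as soon as it is registered, which is the behaviour described above.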

The bundled anti-spam extensions include ConfirmEdit, Nuke, SpamBlacklist, and TitleBlacklist. Of those, ConfirmEdit seems to be the only one which can effectively prevent spam without too much hassle. Yet the default CAPTCHA is trivially bypassable. There's no easy built-in way to find a spammer's IP or implement regular expression filters - users have to go and install CheckUser and AbuseFilter themselves.

There are bots for sale which are designed specifically for spamming MediaWiki, and several Fiverr gigs offer thousands of MediaWiki backlinks for $5. All of that spam is possible because sysadmins aren't installing and using anti-spam extensions and because the default settings grant too much trust to new users.

MediaWiki spam is not just a problem for the people running spammed installs, but also for sysadmins of other websites. I've seen a large number of cases where the spammer submits a link to a MediaWiki site as a means of hiding the real target of the spam link. They'd post links to innocentSpammedWebsite.com on my website, while innocentSpammedWebsite.com contained a link to the actual destination they wanted to promote. My domain blacklists didn't contain innocentSpammedWebsite.com, only actualTarget.com, so the links got through.

Can stricter default settings be considered to protect the wider Internet community from shady blackhat SEO spammers abusing MediaWiki? Spam is an increasing problem on the Internet, and MediaWiki's open nature makes it an ideal target for spammers.

Possible ideas:

  • Bundle common anti-spam extensions with the standard MediaWiki release
  • Improve the defaults for core and the already bundled anti-spam extensions (ConfirmEdit, Nuke, SpamBlacklist, and TitleBlacklist)
    • Disallow anonymous users and new users from creating pages and talk pages containing external links.
    • Make QuestyCaptcha the default CAPTCHA and have sysadmins type out a question and answer pair during the initial setup. A CAPTCHA itself can be optional, but guiding sysadmins to QuestyCaptcha should reduce a lot of spam.
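
As a rough sketch of what the last two ideas could mean for a sysadmin today, the LocalSettings.php fragment below approximates them with existing settings. It is not the exact proposal: I'm not aware of a core right that blocks only page creations containing external links, so the fragment blocks anonymous page creation outright and relies on ConfirmEdit's link trigger for the rest. The question and answer are placeholders.

```php
// LocalSettings.php fragment; a sketch under the assumptions stated above.

// Stricter than the idea in the list: anonymous users cannot create
// pages or talk pages at all.
$wgGroupPermissions['*']['createpage'] = false;
$wgGroupPermissions['*']['createtalk'] = false;

// Load ConfirmEdit together with its QuestyCaptcha module.
wfLoadExtensions( [ 'ConfirmEdit', 'ConfirmEdit/QuestyCaptcha' ] );

// Placeholder question and answer pair; each wiki should write its own.
$wgCaptchaQuestions = [
	'What is the name of this wiki?' => 'ExampleWiki',
];

// Ask the question when an edit adds external links, when a page is
// created, and when an account is registered.
$wgCaptchaTriggers['addurl'] = true;
$wgCaptchaTriggers['create'] = true;
$wgCaptchaTriggers['createaccount'] = true;
```

A site-specific question is the point of QuestyCaptcha: a bot written against one wiki's question does not automatically work on another.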

Event Timeline

Thanks for reporting this. The general discussion seems to be more suited for a mailing list like mediawiki-l@ or wikitech-l@: https://lists.wikimedia.org/mailman/listinfo
This task mixes several very different aspects so the task itself might be 'unfixable'...

In general, https://www.mediawiki.org/wiki/Manual:Combating_spam is linked from the main page when installing. If you have recommendations given the current state, feel free to edit that wiki page.

> Yet the default CAPTCHA is trivially bypassable.

There are likely existing tasks about this. Hard to say as "trivially bypassable" could mean anything.

> There's no easy built-in way to find a spammer's IP or implement regular expression filters - users have to go and install CheckUser and AbuseFilter themselves.
> I believe integrating AbsenteeLandlord into core
> Make QuestyCaptcha the default CAPTCHA

Please file separate tasks for separate requests - see https://mediawiki.org/wiki/How_to_report_a_bug

Also see T100706: Revamp anti-spamming strategies and improve UX for the general topic.

> Can stricter default settings be considered

Adding MediaWiki-Installer but as written above this needs broader discussion first...

I am going to close this task as it is currently not actionable and needs to be broken down into specific steps, and as I see this sufficiently covered by T100706. Please feel free to reopen if you disagree and elaborate about specific tasks. Thanks!

Tgr subscribed.

I think it's fine to have planning / "goal" tasks without clear implementation steps as long as they clearly spell out a problem to solve, which this task does (and T100706 doesn't, but it seems to be more about Wikimedia-specific solutions while this task is about MediaWiki defaults).

Tgr updated the task description.

Thanks for the merge @Tgr

The following observations are based on looking at the wikis whose links have been spammed to WMF wikis:

  • admin accounts can have zero edits
  • the wikis run MediaWiki versions ranging from 1.21 through 1.30
  • spam on those wikis comes from both IP addresses and user accounts

I have started labelling the wikis that I have blacklisted, on their COIBot XWiki investigation pages, with the text "spambot infested":
https://meta.wikimedia.org/w/index.php?search=spambot+infested&prefix=User%3ACOIBot%2FXWiki&title=Special%3ASearch&profile=advanced&fulltext=1&advancedSearch-current=%7B%7D&ns2=1

My concerns come from:

  1. the general abuse of the MediaWiki application and the negative consequences for the product
  2. the resulting spambot abuse that is aimed back at us from an active target, and the general encouragement our design weaknesses give to those who write spambot applications

Reedy renamed this task from "Saner defaults for preventing spam" to "Better defaults for preventing spam". Jan 30 2024, 3:57 PM