[Timebox: 4hrs] Investigate spam account creation on a Wiki
Closed, ResolvedPublic
Authored By

	Evelien_WMDE
	Jan 17 2023, 2:36 PM

Description

When we resolved https://phabricator.wikimedia.org/T322665, we realized that a lot of spam accounts were being created on wikis even with a Captcha in place and an up-to-date MW version.
We've seen reports of close to 2k spam accounts created per wiki per month (see comments). This is a big overhead for Wikibase admins, who need to manage their Wikibase and community and handle legitimate account requests.

Helpful context around the Captcha: https://www.google.com/recaptcha/admin/site/577576057
Perhaps adding oil to the flame: T301243

Question we want to answer: Are the people creating these spam accounts able to get around the Captcha somehow (is the Captcha not working?), or is this done manually? In either case, how do we stop it from happening?

AC:

Stress test the Captcha
Check during a random timeframe (occurs any day) how many legitimate vs. spam accounts were created
Define a way to prevent the spam accounts from getting through

Related Objects

Mentioned In: T342143: 🟧 Use questy captcha to prevent spam users creating (or requesting) wiki accounts
T322665: ConfirmAccount enabled but doesn't block createaccount on furry.wikibase.cloud instance
Mentioned Here: T342142: Use hcaptcha to prevent spam users creating (or requesting) wiki accounts
T342143: 🟧 Use questy captcha to prevent spam users creating (or requesting) wiki accounts
T335769: Fix and Stress test the Sign Up Captcha
T301243: Wikibase Bug: Unclear error message "save has failed"
T322665: ConfirmAccount enabled but doesn't block createaccount on furry.wikibase.cloud instance

Event Timeline

Evelien_WMDE created this task. Jan 17 2023, 2:36 PM

Restricted Application added a subscriber: Aklapper. Jan 17 2023, 2:36 PM

Evelien_WMDE mentioned this in T322665: ConfirmAccount enabled but doesn't block createaccount on furry.wikibase.cloud instance. Jan 17 2023, 2:38 PM

GreenReaper subscribed. Jan 17 2023, 3:26 PM

Evelien_WMDE updated the task description. May 4 2023, 2:01 PM

@GreenReaper Hey there, we created this ticket based on a heads up from you that you were receiving a lot of spam account requests (see linked ticket as well). Since there's a Captcha there and now that we're up to date with the latest MW versions again, just wanted to check in if you're still experiencing this issue? Thanks!

It appears so. There are ~50 spam requests in the last day. Many of them are marked as having confirmed email addresses. In total there were 1751 in the last month (showing as open requests), which suggests the rate has not changed recently.

I am not actively running it as a public wiki with expected new users but if I were it would be tricky to review them.

Evelien_WMDE updated the task description. May 8 2023, 7:28 AM

Evelien_WMDE updated the task description. Jun 8 2023, 1:08 PM

Evelien_WMDE renamed this task from "StopForumSpam improvements to cover account creation request spam" to "[Timebox: 4hrs] Investigate spam account creation on a Wiki". Jun 8 2023, 1:13 PM

Evelien_WMDE updated the task description.

Evelien_WMDE moved this task from Product prioritized backlog to Ready to Pick Up on the Wikibase Cloud board.

Evelien_WMDE updated the task description. Jun 9 2023, 7:59 AM

@Evelien_WMDE Should we remove "Stress test the captcha" from the AC now that https://phabricator.wikimedia.org/T335769 exists?

@Fring These are 2 different things: T335769 refers to the sign up on https://www.wikibase.cloud/ as a Wikibase owner, whereas this ticket refers to the sign up on a Wikibase as a contributor. The first one is preventive in the context of the open beta, whereas this ticket is an already known problem

Got it, thanks for explaining @Evelien_WMDE

When looking at the reCAPTCHA console there is a warning (unfortunately I will see the German copy no matter what)

Wir haben festgestellt, dass von deiner Website weniger als 50 % der mit reCAPTCHA übergebenen Lösungen überprüft werden. Dies könnte auf ein Problem bei der Integration von reCAPTCHA hinweisen. Weitere Informationen findest du auf unserer Entwicklerwebsite.

Google Translate:

We found that your site verifies less than 50% of the solutions submitted with reCAPTCHA. This could indicate a problem with the integration of reCAPTCHA. Visit our developer website for more information.

which sounds as if there might be a problem with our integration. Has anyone seen this before and, if so, investigated it? This is the documentation mentioned in the warning: https://developers.google.com/recaptcha/docs/verify#api-request
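For reference, the server-side check that the linked documentation describes can be sketched roughly as below. This is a minimal illustration in Python, not the ConfirmEdit code: the helper names `build_verify_request` and `is_verified` are made up for this example, while the endpoint and the `secret`, `response` and `remoteip` parameters come from the documentation linked above. A token that the widget issues but that is never sent to this endpoint would count as unverified, which is one possible reading of the console warning (an assumption, not something confirmed here).

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Endpoint from the reCAPTCHA verification documentation.
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_verify_request(
    secret: str, token: str, remote_ip: Optional[str] = None
) -> urllib.request.Request:
    """Build the POST request that asks Google whether a CAPTCHA token is valid.

    Hypothetical helper for illustration; nothing is sent until the caller
    passes the request to urllib.request.urlopen().
    """
    fields = {"secret": secret, "response": token}
    if remote_ip is not None:
        fields["remoteip"] = remote_ip  # optional parameter
    data = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(VERIFY_URL, data=data, method="POST")


def is_verified(response_body: str) -> bool:
    """Return True only if the JSON reply explicitly reports success."""
    return json.loads(response_body).get("success") is True
```

A sign-up handler would send the request with `urllib.request.urlopen(...)` and reject the form unless `is_verified(...)` returns `True` for the reply body.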

We discussed this in the technical refinement session and started talking about a solution. A first step would be to figure out whether reCAPTCHA is working at all, i.e. whether it is actually enforced on wiki account creation. However, even if reCAPTCHA is being used correctly, spam bots have already broken it, so finding a mitigation, e.g. limiting IPs, is a secondary objective. Another potential solution is to try a different config, e.g. hcaptcha.

reCAPTCHA Enterprise purports to offer more, but I get a sneaking feeling it might also be a way to track users now Google's been told it can't do it through other means, especially if you add it to other pages like they want.

Fring claimed this task. Jul 17 2023, 8:44 AM

Fring moved this task from Ready to Pick Up to Kanban board Q3 2023 on the Wikibase Cloud board.

Fring edited projects, added Wikibase Cloud (Kanban board Q3 2023); removed Wikibase Cloud.

Fring moved this task from To do to Doing on the Wikibase Cloud (Kanban board Q3 2023) board.

I looked into this, but haven't really found anything obvious:

As of now, the Captcha injected by ConfirmEdit seems to work as intended. I could get myself blocked by giving bad answers, and could pass by giving correct answers.
- I could not find any obvious loopholes, either from testing or from reading the code in the extension
- Contrary to the message shown in the reCAPTCHA console, the default codepath does validate the response
Spam accounts are indeed being requested a lot for furry.wikibase.cloud:
- 562 in June 2023, 201 of them with a confirmed email address
- 952 account requests in total as of now
- Those 952 account requests came from 790 distinct IPs using real-world user agents on Mac and Windows
Up until this January, a lot of spam accounts were created successfully and somehow passed the queue, e.g. furry.wikibase.cloud currently has 4714 user accounts
- why that behavior stopped is unclear to me, maybe it's related to the MediaWiki 1.39 update https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ConfirmAccount/+/879996/
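The kind of summary given above (total requests, confirmed emails, distinct IPs) could be computed from exported request records along the following lines. This is only a sketch: the `AccountRequest` record and its fields are hypothetical and do not reflect the ConfirmAccount schema, and the sample data in the usage note is invented. The extra `max_per_ip` figure gives a rough sense of whether a per-IP limit would catch much.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class AccountRequest:
    # Hypothetical record shape, for illustration only.
    name: str
    ip: str
    email_confirmed: bool


def summarise(requests: Iterable[AccountRequest]) -> Dict[str, int]:
    """Count requests, confirmed emails and distinct source IPs."""
    requests = list(requests)
    per_ip = Counter(r.ip for r in requests)
    return {
        "total": len(requests),
        "confirmed_email": sum(1 for r in requests if r.email_confirmed),
        "distinct_ips": len(per_ip),
        # Largest number of requests seen from any single IP.
        "max_per_ip": max(per_ip.values(), default=0),
    }
```

For example, `summarise([AccountRequest("a", "198.51.100.1", True), AccountRequest("b", "198.51.100.2", False)])` reports two requests from two distinct IPs. When `distinct_ips` is close to `total`, as with 790 IPs for 952 requests, a per-IP limit is unlikely to stop much.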

Unfortunately I don't have any brilliant ideas about how to improve the situation, other than trying a different Captcha and hoping that it confuses the scripted sign-ups for a little while.

Fring removed Fring as the assignee of this task. Jul 17 2023, 1:30 PM

Fring moved this task from Doing to In Review on the Wikibase Cloud (Kanban board Q3 2023) board.

Regarding "why the behaviour stopped": we discovered that Special:CreateAccount was not being disabled; this was fixed in January.

Unfortunately my experience is that much spam is created from botnets and proxies, although some server ranges might be blockable. As you say a different captcha may help. It might also be worth reaching out to WMF to see if they have any stuff related to e.g. SpamBlacklist email regexes or DNSBLs.
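To illustrate what a DNSBL check involves: by the usual convention, the IPv4 octets are reversed, the blocklist zone is appended, and the resulting name is looked up; if it resolves, the address is listed. The sketch below follows that convention in Python. The function names are made up for this example, `dnsbl.example.org` in the usage note is a placeholder zone rather than a real list, and this is not how any particular MediaWiki extension implements it.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Return the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on invalid input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone has an entry for the address.

    Performs a real DNS lookup; a failed resolution is treated as not listed.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True
```

For example, `dnsbl_query_name("192.0.2.1", "dnsbl.example.org")` yields `1.2.0.192.dnsbl.example.org`. Since much of the spam comes from botnets and proxies, a list like this would only help for addresses that are already known to the list operator.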

It might also be worth reaching out to WMF to see if they have any stuff related to e.g. SpamBlacklist email regexes or DNSBLs.

This sounds reasonable to me. From what I understand, ConfirmEdit does work as intended in our setup, it's just not powerful enough.

Tarrow claimed this task. Jul 18 2023, 12:53 PM

I agree there is no super obvious path forward. It looks like we are following the general best practices from https://www.mediawiki.org/wiki/Manual:Combating_spam etc. We could certainly change the captcha to questy (T342143) or hcaptcha (T342142). I've made tickets for these options for us to discuss with @Evelien_WMDE.

Tarrow removed Tarrow as the assignee of this task. Jul 18 2023, 4:09 PM

Tarrow subscribed.

Tarrow closed this task as Resolved. Jul 31 2023, 10:46 AM

Tarrow claimed this task.

Evelien_WMDE mentioned this in T342143: 🟧 Use questy captcha to prevent spam users creating (or requesting) wiki accounts. Aug 9 2023, 11:52 AM
