Decide what the rate limit should be for temporary account creations
Closed, Resolved (Public)

Authored By

	Tchanders
	Jul 27 2023, 1:46 PM

Description

Background

The number of accounts created from the same IP address is rate limited, using the $wgAccountCreationThrottle config.

The number of temporary accounts created is not currently rate limited. This task is to determine whether we should apply a rate limit, and if so what that limit should be.
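For reference, a minimal sketch of how the existing throttle on regular account creations might be written in LocalSettings.php. The figure of 6 creations per IP per day is the one quoted in the discussion on this task; the array form with count and seconds keys is assumed to be supported by the MediaWiki version in use.

```php
// Sketch only: limit regular account creations from a single IP address.
// 6 per day is the value quoted in the discussion on this task.
$wgAccountCreationThrottle = [
	[ 'count' => 6, 'seconds' => 86400 ],
];
```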

Why we might want rate limiting

It can be relatively easy to block cookies for a given website, in which case each edit is assigned to a new temporary account, even within a browser session.

Q: How much do we expect this to happen, and how much would this need to happen in order to create database storage problems or difficulties for patrollers?
A:

KH: we should take the worst-case scenario as the guideline for this, which is scripted temporary account creation; that is currently bounded by the limit of 8 edits per minute per IP. We can work out a more restrictive rate limit in T357771: Analyze how many distinct devices edit per day from a given IP address.
KH: we can also consider a tiered system for the rate limits: a hard limit that denies the edit/temp account creation entirely once it has been tripped, and a soft limit that prompts for CAPTCHA completion. That would limit scripted abuse while still allowing good faith users from a particular IP to make an initial edit.
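The tiered idea can be sketched as a small decision function. This is an illustration only, not MediaWiki code: the function name, the constants, and both threshold values are hypothetical, and the count of earlier creations is assumed to come from whatever counter backs the rate limiter.

```php
<?php
// Hypothetical thresholds for illustration; not values proposed in this task.
const SOFT_LIMIT = 3; // creations per IP per window before a CAPTCHA is required
const HARD_LIMIT = 6; // creations per IP per window before creation is denied

/**
 * Decide how to treat the next temp account creation attempt from an IP.
 *
 * @param int $creationsInWindow Creations already counted for this IP in the current window
 * @return string One of 'allow', 'captcha', 'deny'
 */
function tieredDecision( int $creationsInWindow ): string {
	if ( $creationsInWindow >= HARD_LIMIT ) {
		// Hard limit tripped: refuse the edit/temp account creation.
		return 'deny';
	}
	if ( $creationsInWindow >= SOFT_LIMIT ) {
		// Soft limit tripped: allow only after a CAPTCHA is completed.
		return 'captcha';
	}
	return 'allow';
}
```

With these placeholder values, the first three creations from an IP in a window pass unchallenged, the next three require a CAPTCHA, and later attempts are refused.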

Why we might not want rate limiting

Some IPs are shared by a large number of people, e.g. covering a large geographical area. Rate limiting could significantly harm the ability of people using these IPs to edit.

Q: If we decide to implement rate limiting, how can we monitor to what extent this is happening?
A: T357763: [Epic] Create a temporary accounts initiative Grafana dashboard would provide insight into this

Other considerations

Q: Is the rate limit on anon edits still applied correctly with temp account creation enabled?
A: Yes, $wgRateLimits['edit']['ip'] applies to edits that result in temp account creation.

Related Objects

Status	Assigned	Task
		Restricted Task
Resolved	kostajh	T294511 2021 Security Team wikireplicas audit
Declined	None	T284948 Raw IPs of logged-out users disclosed in wiki-replicas
In Progress	Niharika	T324492 Temporary accounts - MVP
Open	kostajh	T357776 [Epic] Mitigate abilities to abuse temporary accounts
Resolved	kostajh	T357777 Implement more restrictive rate limit for temporary account creation
Resolved	kostajh	T342880 Decide what the rate limit should be for temporary account creations
Resolved	jwang	T357771 Analyze how many distinct devices edit per day from a given IP address
Declined	kostajh	T359194 Instrument potential temporary account creations
Resolved	kostajh	T357778 Provide ability to require logged-out users to complete a CAPTCHA on temporary account creations in certain circumstances
Resolved	kostajh	T357779 Provide ability to require temporary account users to complete a CAPTCHA in certain circumstances
Declined	None	T357782 Place temporary blocks on temporary account creation for IPs with high rate of reverts
Open	None	T357802 Prompt user to create a regular account after temp account creation rate limit trip
Open	mszabo	T364705 Provide AbuseFilter condition for revertrisk threshold
Resolved	achou	T356102 Allow calling revertrisk language agnostic and revert risk multilingual APIs in a pre-save context
Open	kostajh	T354599 [EPIC] Provide IP reputation variables in AbuseFilter
Open	kostajh	T360067 Deploy Extension:IPReputation
Resolved	sbassett	T360070 Application Security Review Request : Extension:IPReputation
Open	None	T380454 Provide ip_reputation_tunnel_operator AbuseFilter
Open	None	T364710 Create AbuseFilter condition for "likely a bot"
Resolved	MunizaA	T352839 RevertRisk model readiness for temporary accounts
Open	None	T374920 Coordinate AutoModerator rollout with pilot wiki deployments

Event Timeline

Tchanders created this task. Jul 27 2023, 1:46 PM

Restricted Application added a subscriber: Aklapper. Jul 27 2023, 1:46 PM

Tgr subscribed. Jul 29 2023, 3:42 AM

matmarex mentioned this in T343101: Decide whether acquiring temporary account usernames should be rate limited. Jul 30 2023, 7:44 PM

kostajh subscribed. Aug 17 2023, 7:23 AM

Tchanders updated the task description. Aug 17 2023, 12:05 PM

Niharika claimed this task. Sep 6 2023, 6:18 PM

Tchanders added a subscriber: Ladsgroup. Sep 11 2023, 4:01 PM

@Madalina In our roadmap, this is assigned to Product for Nov/Dec 2023. Did @Niharika share any updates with you?

It can be relatively easy to block cookies for a given website, in which case each edit is assigned to a new temporary account, even within a browser session.

If we don't rate limit temporary account creations per IP address, then we run the risk of a malicious actor using a script to rapidly edit pages. Each edit would create a new temporary account.

I would propose that we start out with some limits per IP address, and we could scale this up or down as needed.

It's true that users who are behind an IP address that serves thousands of people could be negatively impacted. This is somewhat mitigated in that those users would be able to create an account and edit.

Alternatively, we could think about showing a CAPTCHA to gate temporary account creation, but that brings other issues.

In T342880#9444547, @kostajh wrote:

Alternatively, we could think about showing a CAPTCHA to gate temporary account creation, but that brings other issues.

That should be fairly easy to do since we already show a captcha sometimes during editing. (Not sure if there are other actions which should create a temp account but don't support captcha? Flow?) It would make temp account creation almost as hard as normal account creation (where the captcha is the most disruptive step, given how easy password management is in today's browsers), but then the only alternative would be to send the user to normal account creation, so probably still a win?

That said,

If we don't rate limit temporary account creations per IP address, then we run the risk of a malicious actor using a script to rapidly edit pages. Each edit would create a new temporary account.

is that actually a problem? I mean, a bigger problem than someone making lots of scripted edits (which is not harder today than it would be with temp accounts)? The DB cost of temp accounts isn't huge.

In T342880#9448591, @Tgr wrote:

If we don't rate limit temporary account creations per IP address, then we run the risk of a malicious actor using a script to rapidly edit pages. Each edit would create a new temporary account.

is that actually a problem? I mean, a bigger problem than someone making lots of scripted edits (which is not harder today than it would be with temp accounts)? The DB cost of temp accounts isn't huge.

I don't know. It's not just DB cost, but also all the hooks that are invoked and other knock-on effects of creating an account.

For scripted edits, I assume (maybe wrongly) at some point we start showing a CAPTCHA? At some point, scripted edits would hit a rate limit for an IP address. But if an attacker wants to create a few thousand temp accounts, and store the cookies for later use, they'll be able to use those thousands of temp accounts multiplied by whatever rate limit is set for temp user edits per hour to generate a lot of vandalism. If those scripted edits are then not all made from the same IP address, they'll be difficult to clean up.

In T342880#9448815, @kostajh wrote:

For scripted edits, I assume (maybe wrongly) at some point we start showing a CAPTCHA? At some point, scripted edits would hit a rate limit for an IP address.

I don't think we throttle edits via captcha; we show it when someone tries to add a URL. We do use per-IP throttling. But any defenses we have against mass edits apply just as well to mass edits which would create temp users. And from the other direction, preventing temp account creation means preventing the edit, since there is not much else we could do with it. So I don't think there's a meaningful difference between an anonymous edit throttle (doable via $wgRateLimits['edit']) and a temp account creation throttle, unless there is some different, less conspicuous way of creating temp accounts.

If those scripted edits are then not all made from the same IP address, they'll be difficult to clean up.

Not very different from making the original temp-account-creating edits from different IP addresses, I think?

Tchanders added subscribers: STran, Dreamy_Jazz. Jan 11 2024, 6:44 AM

So I don't think there's a meaningful difference between an anonymous edit throttle (doable via $wgRateLimits['edit']) and a temp account creation throttle, unless there is some different, less conspicuous way of creating temp accounts.

In theory temp accounts could be created from any action listed in $wgAutoCreateTempUser['actions'], though we only currently support edit. We're investigating whether temp accounts may need to be created on actions other than edit (i.e. which workflows try to create an IP actor) in T349219.

I suppose one would hope that any action that an anon user can do is already rate limited. Though the actual rate limit on temp account creations would be roughly the sum of the rate limits for all the different $wgAutoCreateTempUser['actions'], so if the list of actions is huge or one of the actions has a very high rate limit then it would be worth rate limiting temp account creations separately.
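The "sum of the rate limits" point can be made concrete with a little arithmetic. The sketch below is not MediaWiki code: the function name and the second action with its limit are hypothetical, invented only to show how per-action limits would add up if every action could create a fresh temp account.

```php
<?php
// Per-IP limits as [ count, seconds ] for each action that could trigger
// temp account creation. Only 'edit' is supported today; the second entry
// is hypothetical and exists purely to illustrate the arithmetic.
$limitsByAction = [
	'edit' => [ 8, 60 ],
	'hypothetical-action' => [ 2, 60 ],
];

/**
 * Upper bound on temp account creations per minute from one IP, assuming
 * every rate-limited action creates a new temp account.
 */
function maxCreationsPerMinute( array $limitsByAction ): float {
	$total = 0.0;
	foreach ( $limitsByAction as [ $count, $seconds ] ) {
		// Normalise each limit to a per-minute rate before summing.
		$total += $count * 60 / $seconds;
	}
	return $total;
}
```

With these figures the bound is 8 + 2 = 10 creations per minute per IP, so each additional action raises the effective ceiling unless temp account creation has its own limit.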

That's probably an unlikely scenario for production, but perhaps there's enough uncertainty here that it's worth supporting temp account creation rate limits.

My concern here is that if we don't apply a rate limit it would be very easy for someone to get a new temporary account username per edit.

This is an issue because it makes it hard to see a pattern of abuse for users without the ability to view temporary account IPs. One temporary account that does lots of vandalism even after warnings is easy to block, but multiple temporary accounts performing this vandalism could be seen as separate people and then not appropriately blocked.

For example:

With rate limiting, the user uses one (or a few) temporary account usernames to make one vandalism edit to pages A, B, C, D, E, F, G, H, .... The small number of usernames that were used makes it easier to link them together and block them.
With a rate limit equal to the rate limit for editing, then a new temporary account could be created per edit (via a clear of the session). Therefore none of the accounts have more than one edit, meaning on their own they are not blockable as no pattern of abuse is seen. Because there is no overlap, it also makes it difficult to link each temporary user together and treat them as one user for the purposes of blocking. I would argue that the community may not access the IP for a one edit temporary account that could just be warned, and may only access the IP (and find other temporary account usernames) once one of the temporary accounts shows a pattern of abuse. As such, this is not a way around this. Furthermore, only a subset of all users who report vandalism are going to have the ability to see temporary account IP addresses.

FWIW, currently we have an 8/min throttle on anonymous edits (a bit higher on Commons) and 6/day on regular account creations. What would be the new settings? You can create a temp account without solving a captcha, so presumably we'd still want to limit all temp accounts using the same IP to max 8 edits per min (even if technically they aren't "anonymous" anymore), right?

Note: there are multiple 8/min throttles; see https://www.mediawiki.org/wiki/Manual:$wgRateLimits. If the user does not have the noratelimit user right and is not using an IP listed in $wgRateLimitsExcludedIPs:

all users (registered or not) without the autoconfirmed user right on a single IP are limited to 8 edits/minute across all WMF wikis (the ip part)
each registered user without the autoconfirmed user right (with any IP) is limited to 8 edits/minute on one site (the newbie part)

On Commons:

all users (registered or not) without the autoconfirmed user right cannot edit if there have been 120 non-autoconfirmed edits across all WMF wikis from this IP in the last 5 minutes
a user without the autoconfirmed user right cannot edit if there have been 120 edits by this user (with any IP) on Commons in the last 5 minutes
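As a rough sketch, the throttles described above would correspond to settings along these lines, using the [ count, seconds ] form of $wgRateLimits. The values are taken from this comment rather than from the production configuration, and the fragment does not capture the "across all WMF wikis" scoping, which depends on how the counters are shared between wikis.

```php
// Sketch only: the two 8 edits/minute throttles described above.
$wgRateLimits['edit']['ip'] = [ 8, 60 ];     // all non-autoconfirmed edits from one IP
$wgRateLimits['edit']['newbie'] = [ 8, 60 ]; // each non-autoconfirmed registered user

// On Commons, per the comment above: 120 edits per 5 minutes for each key.
// $wgRateLimits['edit']['ip'] = [ 120, 300 ];
// $wgRateLimits['edit']['newbie'] = [ 120, 300 ];
```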

In T342880#9457756, @Tgr wrote:

You can create a temp account without solving a captcha

At the moment, yes, but I'd like to consider requiring a CAPTCHA at least for the first edit that would result in temp account creation.

One issue with the implementation in the patch for T342770 (Can't edit any page via visual editor while not logged into an account or a temporary account) is that getUserForPermissions returns a temp user placeholder name (*Unregistered *) for rate limit checks:

ApiEditPage.php

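private function getUserForPermissions() {
	$user = $this->getUser();
	// If this edit would auto-create a temp account, run permission checks
	// against an unsaved placeholder temp user rather than the current user.
	if ( $this->tempUserCreator->shouldAutoCreate( $user, 'edit' ) ) {
		return $this->userFactory->newUnsavedTempUser(
			// Name stashed in the session for the temp account to be created
			$this->tempUserCreator->getStashedName( $this->getRequest()->getSession() )
		);
	}
	return $user;
}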

Which means that multiple attempts for account creation from a single IP will be checked against the rate limits for *Unregistered *, but we probably want to first check the rate limit for editing by IP address.

Summarizing a discussion @kostajh and I just had about this:

Example log seen when the rate limit is tripped in this way:

ratelimit.INFO: User::pingLimiter: User tripped rate limit {"action":"edit","limit":8,"period":60,"count":8,"key":"ip","name":"*Unregistered *","ip":"127.0.0.1"}

The rate limit that's being tripped is $wgRateLimits['edit']['ip'], which is the limit for all IP/newbie/temp users editing from the same IP address - so that's working as it should be
Seeing *Unregistered * in the logs is a little confusing
Since T336187#9341749, we expect permissions checks to be performed against * rather than a placeholder temp account, so we may not need to do this any more. Filed as T355210

kostajh claimed this task. Feb 16 2024, 12:52 PM

kostajh added a parent task: T324492: Temporary accounts - MVP. Feb 16 2024, 1:15 PM

kostajh updated the task description. Feb 16 2024, 1:39 PM

kostajh updated the task description.

kostajh mentioned this in T357763: [Epic] Create a temporary accounts initiative Grafana dashboard. Feb 16 2024, 1:41 PM

kostajh updated the task description. Feb 16 2024, 1:45 PM

kostajh renamed this task from Decide whether temporary account creations should be rate limited to Decide whether temporary account creations should have more restrictive rate limit than default IP edit rate limit. Feb 16 2024, 2:17 PM

kostajh renamed this task from Decide whether temporary account creations should have more restrictive rate limit than default IP edit rate limit to Decide what the rate limit should be for temporary account creations.

kostajh mentioned this in T357777: Implement more restrictive rate limit for temporary account creation.

kostajh added a parent task: T357777: Implement more restrictive rate limit for temporary account creation.

My proposal is:

implement a more restrictive rate limit for temp account creation. That should be informed by T357771: Analyze how many distinct devices edit per day from a given IP address
follow through with subtasks of T357776: [Epic] Mitigate abilities to abuse temporary accounts which propose some other mitigations for temp account abuse, that rate limits solve on their own

My 2c that you can easily ignore (I think you mentioned it): I really like this idea: First edit that would trigger a new temp account should require a captcha, the subsequent edits by the same temp account shouldn't (unless the usual case of adding external links which is automatically enforced by ConfirmEdit extension). It would put up a decent-ish barrier against large-scale abuse.

In T342880#9551469, @Ladsgroup wrote:

First edit that would trigger a new temp account should require a captcha,

Thanks. That is covered in T357778: Provide ability to require logged-out users to complete a CAPTCHA on temporary account creations in certain circumstances

the subsequent edits by the same temp account shouldn't (unless the usual case of adding external links which is automatically enforced by ConfirmEdit extension)

There's a proposal in T357779: Provide ability to require temporary account users to complete a CAPTCHA in certain circumstances to sometimes require temp account users to fill out a CAPTCHA on subsequent edits after creating an account.

Marking as stalled on T357771: Analyze how many distinct devices edit per day from a given IP address

kostajh mentioned this in T359194: Instrument potential temporary account creations. Mar 5 2024, 4:56 PM

In T342880#9594563, @kostajh wrote:

Marking as stalled on T357771: Analyze how many distinct devices edit per day from a given IP address

https://phabricator.wikimedia.org/T357771#9648033 found that the p99 value is 3 distinct devices per IP per day and the p75 value is 2 per IP per day. The current value of 6 account creations per IP per day (controlled via $wgAccountCreationThrottle) is probably fine, so we can resolve this task. This is something we could monitor as part of rollout. (cc @Madalina @Niharika @jwang to think about in health metrics for rollout.)

kostajh closed subtask T357771: Analyze how many distinct devices edit per day from a given IP address as Resolved. Jun 28 2024, 2:15 PM

kostajh mentioned this in T375500: Temp accounts Grafana Dashboard: Temp account creation rate limit trips. Sep 24 2024, 12:38 PM
