Instrument potential temporary account creations
Open, Needs Triage, Public

Description

In the parent task, we are trying to find out what a reasonable rate limit is for temp account creations per day per IP. To solve T334623 and T358806, we are discussing creating temp accounts at the beginning of the edit process (https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1008530).

One thing we can do is instrument potential temp account creations to see what real world usage would be like (and to see if usage would be too high, per T334623#9587982).

EditPage::maybeActivateTempUserCreate is currently called in production even when the temp account feature is switched off. In that method, we have access to the web request, and from there we can access the session ID of the request.

If we track how many unique session IDs are seen in ::maybeActivateTempUserCreate, we would have a good estimate of how many temporary user accounts we could expect. [0]

We could also bucket unique session ID visits to ::maybeActivateTempUserCreate by IP address to find out how many temp account creations we'd expect to see per IP address. That would give us an answer to T342880: Decide what the rate limit should be for temporary account creations.

To keep track of the session IDs, we'd want a cache using a hashed form of the session ID as the key. If we don't find the session ID in the cache, we add an entry for it and increment a statsd counter that we can graph in Grafana. (One challenge is that the session ID is sensitive data, so we'd need a secret + salt + hashing for the session ID cache get/set code.)
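The cache-and-count idea above could be sketched roughly as follows. This is a minimal illustration in Python, not MediaWiki code: `UniqueSessionCounter`, the in-memory set standing in for the shared cache, and the integer standing in for the statsd counter are all hypothetical. It also assumes a keyed HMAC-SHA256 in place of the "secret + salt + hashing" scheme, which has not been decided.

```python
import hashlib
import hmac


class UniqueSessionCounter:
    """Counts session IDs that have not been seen before (hypothetical sketch)."""

    def __init__(self, secret: bytes):
        self._secret = secret
        self._seen = set()  # stand-in for a shared cache
        self.count = 0      # stand-in for a statsd counter

    def _key(self, session_id: str) -> str:
        # Keyed hash, so the raw session ID is never stored in the cache.
        digest = hmac.new(self._secret, session_id.encode("utf-8"), hashlib.sha256)
        return digest.hexdigest()

    def record(self, session_id: str) -> bool:
        """Return True and increment the counter only on first sight of a session."""
        key = self._key(session_id)
        if key in self._seen:
            return False
        self._seen.add(key)
        self.count += 1
        return True
```

In a real deployment, the cache entries would presumably also need an expiry so that counts can be read per day.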

To track temp accounts per IP, whenever we see a session ID we haven't seen before, we'd increment a statsd counter using the IP address as the key. (That might be problematic; I am not sure how many unique keys one can add to statsd.)
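As a rough sketch of the per-IP bucketing, the following Python counts new sessions per IP and then summarizes them as "how many IPs saw N new sessions". The function names and the in-memory `Counter` are hypothetical; the summary step is one possible way to sidestep the unique-key concern, since it emits one value per bucket instead of one key per IP. It is not a tested statsd approach.

```python
from collections import Counter


def record_new_session(per_ip: Counter, ip: str) -> int:
    """Increment the new-session count for an IP and return the updated count."""
    per_ip[ip] += 1
    return per_ip[ip]


def creations_histogram(per_ip: Counter) -> Counter:
    """Map 'number of new sessions' to 'number of IPs with that many'."""
    return Counter(per_ip.values())
```

For example, if one IP produced three new sessions and another produced one, the histogram would have one IP in the "3" bucket and one in the "1" bucket, which is the shape of data needed to pick a per-IP rate limit.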

As we are currently discussing when in the edit cycle to create a temp account, we can also instrument early and late in the request, to get a better handle on how many temp account creations might be denied by filters.

[0] Unlike the temp account editing paradigm, the session ID will change if the user visits a different wiki, or if they visit the mobile version of the site from the desktop (or vice versa), but that probably doesn't skew the numbers too much.

Event Timeline

kostajh renamed this task from Instrument temp account creations to Instrument potential temporary account creations. Mar 5 2024, 4:59 PM
kostajh updated the task description.
kostajh updated the task description.