
Account auto-creations caused by import of page caused actor table rows and CheckUser result rows but no associated user table row
Closed, ResolvedPublicSecurity

Description

Summary

There are several actor table rows on testwiki with a defined user ID that does not exist in the user table. These entries appear to have been created by an import of a page to testwiki. The same action also created two cu_private_event rows per account recording the auto-creation, with duplicated details.

Background

  • At 21:39, about 1,000 warnings were logged about users having no central ID, as shown in this logstash query
    • Checking a random selection of the users mentioned in these log entries shows that they appear to have no account on testwiki
  • Running the query select * from actor where actor_user not in (select user_id from user); on testwiki returns about 400 actor table rows whose actor_id values are almost entirely sequential, whose actor_user value does not correspond to any ID in the user table, and whose actor_name values match the usernames in the logstash query
  • Running the query select * from cu_private_event where cupe_actor = X; (where X is an actor ID from the above query) returned two cu_private_event rows, both indicating that an auto-creation of the account had occurred
  • Searching logstash for other log entries around the no-central-ID warnings shows that this appears to be related to the import of a page to testwiki
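The orphan-detection check from the bullets above can be reproduced against a toy schema. This is an illustrative sketch only: the table definitions below are simplified stand-ins, not the real MediaWiki schema, and the query is rewritten as an equivalent LEFT JOIN of the NOT IN form used in the investigation.

```python
import sqlite3

# Simplified stand-ins for MediaWiki's actor and user tables
# (illustrative schema only, not the production definitions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user  (user_id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE actor (actor_id INTEGER PRIMARY KEY,
                        actor_user INTEGER, actor_name TEXT);
""")
conn.execute("INSERT INTO user VALUES (1, 'Alice')")
# A healthy actor row, plus an orphan whose actor_user points at a
# user_id that was never committed to the user table.
conn.execute("INSERT INTO actor VALUES (10, 1, 'Alice')")
conn.execute("INSERT INTO actor VALUES (11, 2, 'Ghost')")

# Equivalent to the task's actor_user NOT IN (SELECT user_id FROM user),
# written as a LEFT JOIN that keeps only rows with no matching user.
orphans = conn.execute("""
    SELECT a.actor_id, a.actor_name
    FROM actor a
    LEFT JOIN user u ON u.user_id = a.actor_user
    WHERE a.actor_user IS NOT NULL AND u.user_id IS NULL
""").fetchall()
print(orphans)  # → [(11, 'Ghost')]
```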

Acceptance criteria

  • Ensure that orphaned actor and cu_private_event table rows are not created if the attempt to create the corresponding user row fails

Related Objects

Event Timeline

Dreamy_Jazz added a subscriber: Tgr.

@Tgr, do you know if this could have anything to do with SUL3? The path that this took doesn't seem very clear based on looking at logstash, but it seems that this probably has something to do with MediaWiki-extensions-CentralAuth given that it's auto-creations on a call to api.php with the referrer being a different wiki.

Dreamy_Jazz renamed this task from 400 accounts already existing accounts auto-created on test.wikipedia.org using the same non-WMF IP address within the span of a minute to 400 already existing accounts auto-created on test.wikipedia.org using the same non-WMF IP address within the span of a minute.Feb 5 2025, 6:10 PM
Dreamy_Jazz updated the task description. (Show Details)
Dreamy_Jazz updated the task description. (Show Details)
Dreamy_Jazz renamed this task from 400 already existing accounts auto-created on test.wikipedia.org using the same non-WMF IP address within the span of a minute to 400 auto-creations on test.wikipedia.org using the same non-WMF IP address within the span of a minute which caused actor table rows but no user table rows.Feb 5 2025, 6:14 PM
Dreamy_Jazz updated the task description. (Show Details)
Dreamy_Jazz renamed this task from 400 auto-creations on test.wikipedia.org using the same non-WMF IP address within the span of a minute which caused actor table rows but no user table rows to 400 auto-creations on testwiki using the same non-WMF IP address within the span of a minute which caused actor table rows pointing to missing user table rows.Feb 5 2025, 6:22 PM
Dreamy_Jazz renamed this task from 400 auto-creations on testwiki using the same non-WMF IP address within the span of a minute which caused actor table rows pointing to missing user table rows to 400 auto-creations on testwiki using the same non-WMF IP address within the span of a minute causing actor table rows pointing to missing user table rows.
Dreamy_Jazz added a project: DBA.

Adding DBA as these rows will probably need to be fixed at some point.

Logs from one of the requests, with repeating entries filtered out: https://logstash.wikimedia.org/app/discover#/?_g=(filters:!(),refreshInterval:(pause:!t,value:[…]val:auto,query:(language:lucene,query:''),sort:!())

Note the reference to "Trondheim" being slow to parse. I think this is Jon Harald Soby importing articles from nowiki. The timing seems to match with https://test.wikipedia.org/wiki/Special:Log/import (although this import appears to have unsurprisingly failed).

That script uses AuthManager::autoCreateUser() so if it results in incomplete autocreations, there is something broken at a deeper point (T380500: CentralAuthUser returning outdated data after user creation, maybe?), but it would explain the burst of auto-creations, and the lack of client hint data (that's T381128: backfillLocalAccounts.php should fill Client Hints, on the back burner right now). All accounts having the same IP address / browser UA is probably a separate bug (no idea how it could happen though; that part of the code looks straightforward).

Thanks @mszabo for finding that. Then this isn't related to SUL3, so untagging that.

On second thought, that doesn't make sense - that script only runs on meta & loginwiki.

I think this is Jon Harald Soby importing articles from nowiki.

Secondary errors caused by T383962: Special:Import revision duplication and unprefixed usernames then?


Yeah, that seems to make sense.

Given that, I think this task can be unprotected when the security team has a chance to do that. I had protected this initially as I couldn't see why one IP address was causing all of these auto-creations in such a short space of time (and was concerned that maybe someone found out how to auto-create accounts on behalf of other users en-masse using a method not specifically intended).

Dreamy_Jazz renamed this task from 400 auto-creations on testwiki using the same non-WMF IP address within the span of a minute causing actor table rows pointing to missing user table rows to Account auto-creations caused by import of page caused actor table rows with no associated user table row.Feb 5 2025, 7:09 PM
Dreamy_Jazz renamed this task from Account auto-creations caused by import of page caused actor table rows with no associated user table row to Account auto-creations caused by import of page caused actor table rows and CheckUser result rows but no associated user table row.

Change #1117605 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@master] Avoid writing to CU tables if the main transaction round failed

https://gerrit.wikimedia.org/r/1117605
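The summary line of the merged change suggests the fix is to skip the CheckUser writes when the main transaction round fails. The general pattern can be sketched as follows; this is purely illustrative (sqlite3 stands in for the database layer, and the table/column names are borrowed for flavor), not the actual CheckUser or MediaWiki code.

```python
import sqlite3

# Illustrative sketch: defer the CheckUser-style secondary insert until
# the primary transaction has committed, so a rollback of the user
# creation leaves no orphaned rows behind.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (user_id INTEGER PRIMARY KEY, user_name TEXT)")
conn.execute("CREATE TABLE cu_private_event "
             "(cupe_id INTEGER PRIMARY KEY, cupe_actor INTEGER)")

def auto_create(name, fail=False):
    pending = []  # secondary writes deferred until the commit succeeds
    try:
        with conn:  # transaction: commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO user (user_name) VALUES (?)", (name,))
            pending.append(cur.lastrowid)
            if fail:
                raise RuntimeError("simulated mid-creation failure")
    except RuntimeError:
        pending.clear()  # primary round failed: drop the deferred writes
    for uid in pending:
        with conn:  # separate transaction for the secondary write
            conn.execute(
                "INSERT INTO cu_private_event (cupe_actor) VALUES (?)", (uid,))

auto_create("Alice")
auto_create("Ghost", fail=True)
print(conn.execute(
    "SELECT COUNT(*) FROM cu_private_event").fetchone()[0])  # → 1
```

With the failed creation rolled back, only Alice's event row exists; the buggy behavior this task describes corresponds to writing the cu_private_event row unconditionally.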

sbassett changed the task status from Open to In Progress.Feb 5 2025, 8:03 PM
sbassett edited projects, added SecTeam-Processed; removed Security, Security-Team.
sbassett changed the visibility from "Custom Policy" to "Public (No Login Required)".
sbassett changed the edit policy from "Custom Policy" to "All Users".
sbassett changed Risk Rating from N/A to Low.

Adding DBA as these rows will probably need to be fixed at some point.

If this task is about CU tables, those rows would be automatically purged after 90 days, right? Am I missing something super obvious? (apologies if I am)


The rows needing to be fixed are the associated actor table rows whose actor_user column is set to a user_id that does not exist in the user table. Maybe this would be fixed automatically if the user visits test.wikipedia.org and is auto-created there, but I'm not sure whether that would happen.

The CheckUser tables will fix themselves after 90 days.


That's more related to T383962: Special:Import revision duplication and unprefixed usernames than to this task. I don't think it will work - the actor table entry will block the creation of another actor record with the same username, but the existing record is linked to a user ID that will never be used (new users just get the next autoincrement value). ActorStore::acquireActorId() doesn't even have reasonable error handling for the case where there is an actor row for the given username but with a mismatched user ID.
And if we are extra unlucky, whatever prevented the creation of the user record did not advance the autoincrement counter, so that ID just gets assigned to the next user who signs up.
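The username collision described above can be demonstrated with a toy schema. This is an illustrative sketch only: the unique constraints below mimic the uniqueness of actor_name and actor_user, and the IDs are made up.

```python
import sqlite3

# Illustrative sketch: a stale actor row pinned to a dead user ID
# blocks the actor row that a fresh auto-creation of the same
# username would need to create.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE actor (
    actor_id   INTEGER PRIMARY KEY,
    actor_user INTEGER UNIQUE,
    actor_name TEXT UNIQUE)""")

# Orphan left behind: user_id 42 was never committed to the user table.
conn.execute("INSERT INTO actor VALUES (11, 42, 'Ghost')")

# The user later gets created for real with the next autoincrement
# user_id (say 57), but the actor row for that name is already taken.
try:
    conn.execute(
        "INSERT INTO actor (actor_user, actor_name) VALUES (57, 'Ghost')")
except sqlite3.IntegrityError as e:
    print(e)  # → UNIQUE constraint failed: actor.actor_name
```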

Change #1118117 had a related patch set uploaded (by Máté Szabó; author: Máté Szabó):

[mediawiki/extensions/CheckUser@master] phpunit: Test CheckUser behavior on transaction rollback

https://gerrit.wikimedia.org/r/1118117

Change #1118117 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@master] phpunit: Test CheckUser behavior on transaction rollback

https://gerrit.wikimedia.org/r/1118117

@Dreamy_Jazz Do you want me to delete those rows in production? It's testwiki, anything goes.


Might be worth doing that. AFAICS it would cause issues if any of these users open testwiki?


If the users exist globally, we could also force-create them locally via createLocalAccount.php in CentralAuth? I don't know if it would work.


It should be fine to delete the actor rows, I think the issues would only occur if we left the rows in testwiki.

I wrote this horrible thing to get rid of the rows:

for i in $(seq 1 410);
do
   # Pick one remaining orphaned actor row (actor_user with no matching user_id)
   id=$(db-mysql db1223 testwiki -ss -e "select actor_id from actor where actor_user not in (select user_id from user) limit 1;")
   # Crude sanity check that a plausible ID came back before deleting
   if [ "${id:0:1}" == "1" ]; then
       db-mysql db1223 testwiki -ss -e "delete from actor where actor_id = $id limit 1;"
       echo "$id"
   fi
   sleep 1  # throttle to one delete per second
done

It's empty now:

cumin2024@db1223.eqiad.wmnet[testwiki]> select * from actor where actor_user not in (select user_id from user);
Empty set (0.144 sec)

Anything else needed from your friendly neighborhood DBAs?


I don't think so. Thanks for that!

Given that, I think we can resolve this task as:

  1. The code causing these extra actor rows has been fixed in e209ae5a07cade91fb7a1d5320e9e6c6a8168bc9
  2. No more of these rows appear to exist elsewhere