backfillLocalAccounts.php does not (always?) copy checkuser data
Open, Needs Triage · Public · BUG REPORT

Description

As I said on Discord: accounts created by the backfill script show the wrong IP in loginwiki CU. The backfill script has little value to stewards as long as that's the case.

Other checkusers have reported this as well.

Event Timeline

Restricted Application added a subscriber: Aklapper.

No, the script doesn't try to copy client hints. It does try to copy IPs, it just (reportedly) doesn't always work.

I scrolled through loginwiki:Special:Log/MediaWikiAccountBackfiller to find a current LTA account and test whether the issue still exists. Loginwiki CU on GeorgezReevezPersonz (-> en:WP:LTA/GRP) shows the localhost IP; according to the CU tool, there are ~2,680 other users on that IP.
CU on metawiki (where this sockpuppet registered) shows their real IP, which is different (and consistent with previous IPs from this LTA).

Just encountered the issue again with https://login.wikimedia.org/wiki/Special:Contributions/Hide_on_Rosé_loses,_Timelash_winning

So apparently, when a global block is disabled on one wiki but remains active on Loginwiki, it could lead to this exact problem.

Just tested and confirmed this with https://meta.wikimedia.org/wiki/Special:CentralAuth/XXB_test_gjvcdg

Change #1225528 had a related patch set uploaded (by Arendpieter; author: Arendpieter):

[mediawiki/extensions/CheckUser@master] AccountCreationDetailsLookup: Include 'autocreate' log action in lookup

https://gerrit.wikimedia.org/r/1225528

Change #1225528 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@master] AccountCreationDetailsLookup: Include 'autocreate' log action in lookup

https://gerrit.wikimedia.org/r/1225528

@Arendpieter Thanks for the patch. Your change looks correct to me, but just to clarify, it doesn't explain any of the recent examples given in the comments here, right? Or am I missing something?

As far as I understand, the home wiki is primarily determined by where the account was originally registered (attachedMethod = primary or new). However, if that information is missing, it falls back to the wiki with the most edits, which could be a wiki where the user was only autocreated.
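To make the fallback described above concrete, here is a minimal Python sketch of that selection rule as I understand it. It is not the CentralAuth code: `Attachment`, `pick_home_wiki`, and the field names are hypothetical, and `"login"` is only used as an example of an attachment method other than `primary` or `new`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Attachment:
    """Hypothetical model of one local account attached to a global account."""
    wiki: str
    attached_method: str  # "primary"/"new" per the comment above; others are examples
    edit_count: int


def pick_home_wiki(attachments: List[Attachment]) -> Optional[str]:
    """Prefer the wiki where the account was registered; else fall back to most edits."""
    for attachment in attachments:
        if attachment.attached_method in ("primary", "new"):
            return attachment.wiki
    if not attachments:
        return None
    # Fallback: registration info is missing, so pick the wiki with the most edits.
    # This can be a wiki where the user was only autocreated.
    return max(attachments, key=lambda a: a.edit_count).wiki
```

With this model, an account attached as `new` on metawiki resolves to metawiki even if it has far more edits elsewhere; only when no attachment is `primary` or `new` does the edit count decide.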

backfillLocalAccounts.php runs on loginwiki to create missing local accounts. It determines the user’s home wiki and then calls AccountCreationDetailsLookup->getAccountCreationIPAndUserAgent() against the home wiki’s database. getAccountCreationIPAndUserAgent() in turn calls getIPAndUserAgentFromDB(), which I patched. If backfillLocalAccounts.php cannot retrieve data via AccountCreationDetailsLookup->getAccountCreationIPAndUserAgent(), it falls back to an incorrect IP.
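A rough Python model of that lookup-then-fallback flow, to show why a lookup that ignores 'autocreate' rows could end in the wrong IP. Everything here is an assumption for illustration: the row layout, the function names, the pre-patch action set `{"create"}`, and `127.0.0.1` standing in for the localhost IP reported in CU. The real logic lives in the PHP classes named above.

```python
from typing import Dict, Iterable, Optional, Set

FALLBACK_IP = "127.0.0.1"  # assumption: stands in for the localhost IP seen in loginwiki CU


def lookup_creation_ip(
    log_rows: Iterable[Dict[str, str]], user: str, actions: Set[str]
) -> Optional[str]:
    """Return the IP of the first account-creation log row matching one of `actions`."""
    for row in log_rows:
        if row["user"] == user and row["action"] in actions:
            return row["ip"]
    return None


def ip_for_backfill(
    log_rows: Iterable[Dict[str, str]], user: str, actions: Set[str]
) -> str:
    """Use the looked-up IP if there is one; otherwise fall back to the incorrect IP."""
    ip = lookup_creation_ip(log_rows, user, actions)
    return ip if ip is not None else FALLBACK_IP
```

If the home wiki only has an 'autocreate' row for the user, calling `ip_for_backfill(rows, user, {"create"})` returns the fallback, while adding `"autocreate"` to the action set returns the logged IP.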

But wait, you’re right. This cannot be the root cause, because even new accounts created by the backfill script show the wrong IP in loginwiki.

Change #1226883 had a related patch set uploaded (by Arendpieter; author: Eugene Gvozdetsky):

[mediawiki/extensions/CentralAuth@master] Fix wrong IP logged in CheckUser for backfilled accounts

https://gerrit.wikimedia.org/r/1226883

Change #1226883 merged by jenkins-bot:

[mediawiki/extensions/CentralAuth@master] Fix wrong IP logged in CheckUser for backfilled accounts

https://gerrit.wikimedia.org/r/1226883

Interesting, I hope that helps. I don't know for sure if that will resolve the problem, but it seems plausible, and it shouldn't hurt. I guess we'll see once the patches are deployed (which should happen next week, see schedule: https://wikitech.wikimedia.org/wiki/Deployments/Train#Schedule).