Page MenuHomePhabricator

Special:Import revision duplication and unprefixed usernames
Open, Needs TriagePublicBUG REPORT

Description

If Special:Import is executed with the checkbox "Assign edits to local users where the named user exists locally" checked, the importer is meant to assign edits to existing users instead of giving them a prefix.

When importing from an SUL wiki like Incubator, all users exist globally, so no imported revisions should have a prefixed name. However, some recent imports from Incubator show many prefixed names.

This page history on tigwiki shows an example of duplicated revisions where one of each pair has a local name, and the other has a prefixed name. @Amire80 confirmed that he followed the documented import procedure in that case.

Ordering the 8 revisions on that page by rev_id shows that the 3 revisions with prefixed usernames were created first, with an import null edit at 2024-12-13T01:15:52, and then the 3 revisions with unprefixed usernames were created, with an import null edit 47 seconds later at 2024-12-13T01:16:39.

It looks like we need to

  • Deduplicate requests at the application level, say by putting a unique token on the Special:Import form
  • Make ExternalUserNames and CentralAuth's ImportHandleUnknownUser hook be more robust to races and/or replication lag
  • Clean up a lot of damaged wiki histories

Searching for possibly affected wikis:

$ for w in tigwiki idwikivoyage tcywikisource tcywiktionary rskwiki nrwiki tddwiki annwiki ibawiki bclwikisource shnwikinews gorwikiquote moswiki kgewiki madwiktionary bdrwiki cswikivoyage btmwiki dtpwiki bewwiki kuswiki mywikisource iglwiki kaawiktionary mswikisource kawikisource; do echo -n "| $w | ";sql $w -- -B --skip-column-names -e 'select count(*) from actor where actor_name like '\''incubator>%'\'; done
tigwiki74
idwikivoyage0
tcywikisource0
tcywiktionary0
rskwiki20
nrwiki60
tddwiki8
annwiki24
ibawiki43
bclwikisource0
shnwikinews0
gorwikiquote0
moswiki0
kgewiki0
madwiktionary1
bdrwiki1
cswikivoyage0
btmwiki0
dtpwiki0
bewwiki0
kuswiki0
mywikisource0
iglwiki13
kaawiktionary0
mswikisource0
kawikisource0

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

xff.log on tigwiki shows 28 POST requests to Special:Import between 01:00 and 01:31, all with different request IDs.

Regarding the username autocreation race condition:

  • ExternalUserNames::applyPrefix() loads the user from the replica. If the user is not there, it calls the hook. If the hook returns false (indicating success) it reloads the user from the primary. If the user is there in the primary, it returns the unprefixed username.
  • CentralAuth's ImportHookHandler calls CentralAuthUtilityService::autoCreateUser() which calls AuthManager::autoCreateUser(). If AuthManager returns a "good" (warning-free) status, ImportHookHandler will return false.
  • Race 1: If the user is created during the replication lag period, AuthManager will find it on line 1901 and return a userexists warning. ImportHandleHookHandler will return null and ExternalUserNames will apply the prefix.
  • Race 2: If another thread is in the region of AuthManager line 2003-2140 for the same username, a usernameinprogress error will be returned and ExternalUserNames will apply the prefix.
  • Race 3: If another thread has returned from AuthManager, releasing the app-level lock, but the transaction has not yet been committed, User::addToDatabase() will wait for the row lock and then return a userexists error. AuthManager returns a userexists warning and ExternalUserNames will apply the prefix.
  • Various errors in AuthManager cause it to write to AUTOCREATE_BLOCKLIST in the importing user's session, breaking the current import and all future imports by that user until the session expires.

Proposal:

  • If AuthManager returns userexists, return false from the hook so that the importer will reload from the primary DB.
  • If AuthManager returns usernameinprogress, enter a WaitConditionLoop until the lock is released.
  • The hook should pass $performer to AuthManager to suppress the use of the session and to use elevated rights.

Slightly related, mostly unrelated. Special:Import can use a lot of validation improvements to avoid issues like T123313#2232274

  • If AuthManager returns userexists, return false from the hook so that the importer will reload from the primary DB.

On an aside, the return status of AuthManager::autoCreateUser() is documented as "Good if user was created, Ok if user already existed, otherwise Fatal" (which would be nicer than relying on a specific message key) but looking at the code that doesn't seem reliably true, so either we should fix that or document the error key as something that can be relied on.

  • If AuthManager returns usernameinprogress, enter a WaitConditionLoop until the lock is released.

Inside AuthManager, that's straightforward (just set a larger timeout for getScopedLock()) but would affect all autocreations so might have unwanted side effects. Outside AuthManager, I guess that would mean repeatedly retrying the autocreate until it returns something other than usernameinprogress? That seems fine; the only side effect I can think of is getting throttled more easily if there is a rate limit on the autocreateaccount action (since throttle counters are increased before trying to get the lock), but rate-limiting that doesn't seem very sensible.

  • The hook should pass $performer to AuthManager to suppress the use of the session and to use elevated rights.

Or maybe an UltimateAuthority? The typical permission-related reason for AuthManager to fail is that the username is disallowed on that wiki by some sort of anti-abuse mechanism (like English Wikipedia not wanting usernames with all-Chinese characters). Should the outcome of an import depend on the extent to which the importing user personally is allowed to circumvent anti-abuse filters?

In general some permission checks during import are probably important for security reasons, like when importing site scripts; but specifically for user autocreation, ignoring checks seems fine.

Related: T69936: Provide a useable way to bypass abuse filters

  • Race 3: If another thread has returned from AuthManager, releasing the app-level lock, but the transaction has not yet been committed, User::addToDatabase() will wait for the row lock and then return a userexists error. AuthManager returns a userexists warning and ExternalUserNames will apply the prefix.

There is a transaction around the whole Special:Import request, so this race window is open for minutes.

The incubator import procedure says that if a request times out, you can retry if no pages were created. But a CDN timeout often happens while PHP is still running. You can't see the pages on the wiki until the transaction is committed, so it looks as if the request has fully failed.

Anecdotally I've seem the behavior of the import appearing to time out but still eventually succeeding several times when importing files that are several megabytes in size during my project to export Flow boards to wikitext with history.

That project isn't affected by this task because the source and destination wikis are the same so the accounts already exist, and I only try to import one page at a time so there's no potential for double imports.

When Amir imported kncwiki, I had him enable debug logs. Looking at these logs now, one interesting thing is that it sleeps for 5 seconds during the insertion of each revision. The delay is not explained by a log entry, but occurs every time ExternalStoreAccess::insert() is called, after LoadBalancer::reuseOrOpenConnectionForNewRef() logs "reusing connection for 0/kncwiki" but before any other log entry (the next log entry varies).

We don't have logs from the Database object that is being returned. We have log entries from MysqlReplicationReporter but they are not associated with this issue.

The time delta between log entries is usually a few milliseconds greater than 5 seconds. Occasionally there is a delay of 10 seconds. I don't know what the cause is.

When Amir imported kncwiki, I had him enable debug logs. Looking at these logs now, one interesting thing is that it sleeps for 5 seconds during the insertion of each revision. The delay is not explained by a log entry, but occurs every time ExternalStoreAccess::insert() is called, after LoadBalancer::reuseOrOpenConnectionForNewRef() logs "reusing connection for 0/kncwiki" but before any other log entry (the next log entry varies).

I think this is a logging artifact. Logs are broken by ExternalStore::insert() and start working again after 5 seconds.