Page MenuHomePhabricator

Define where to add code that needs to run after a new central user has been created
Closed, ResolvedPublic

Description

Context

In T376266: Allow to assign an experiment variant to existing account holders, the Growth-Team switched to assigning A/B testing variants on the fly (via conditional defaults) based on the user ID, rather than by assigning variants. The relevant code is this:

public function shouldAssignBucket( UserIdentity $userIdentity, string $experimentName, array $args ): bool {
	$centralIdLookupCallback = $this->centralIdLookupCallback;
	/** @var CentralIdLookup */
	$centralIdLookup = $centralIdLookupCallback();
	$userCentralId = $centralIdLookup->centralIdFromLocalUser( $userIdentity );
	$sample = $this->getSample( $userCentralId, $experimentName );
	if ( $this->isInSample( $args[0], $sample ) ) {
		return true;
	}
	return false;
}

This worked fine in practice. However, we also have code in onLocalUserCreated that reports number of users in each variant to Grafana. Those counts were not matching the reality, as discovered in T379682: Growth KPI Grafana dashboard claims no A/B testing happens at pilot wikis when using global assignment.

This was happening because the shouldAssignBucket() method was failing silently when executed from within onLocalUserCreated(): in there, CentralIdLookup is not guaranteed to work, because the central user is also created in onLocalUserCreated (by MediaWiki-extensions-CentralAuth), and it is not guaranteed that CentralAuth's handler runs before Growth's (or the other way around).

Problem

As far as I can tell, the only place where we can run code on account creation is onLocalUserCreated. However, the CentralAuth part is not guaranteed to be available at that point. Together with the "failing silently" part, this caused a lot of confusion.

I think there are aspects CentralAuth might consider doing in a more friendly way, such as:

  • CentralIdLookup could fail if called on an user that does not have a central ID, rather than silently returning zero
  • A hook to finalise account creation could be added, central account creation moved there

For now, Growth workarounded the problem by wrapping the monitoring call in a deferred update (so it runs at the end of the request, when the central user exists), but this is not really a nice solution. Curious to hear what CentralAuth maintainers think about this.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

...the central user is also created in onLocalUserCreated

No, it's supposed to be created in CentralAuthPrimaryAuthenticationProvider::beginPrimaryAccountCreation(); and for existing central accounts, the local user is supposed to be attached in CentralAuthPrimaryAuthenticationProvider::autoCreatedAccount() which runs just before the hook. There have been changes to the latter recently (e.g. rECAUffd23bf4c620: Check session provider when autocreating) so maybe we broke something; but more likely it's just DB lag. You can pass a recency flag to CentralIdLookup methods.

Huh, interesting! Thanks for the info @Tgr. I might be missing something about how CentralAuth is working, but I was referring to UserCreationHookHandler::onLocalUserCreated. That handler calls CentralAuthUser::register, which is documented to "Register a new, not previously existing, central user account". If the central account is supposed to be created in the authentication provider, then isn't the creation code in UserCreationHookHandler redundant?

That being said, it is perfectly possible passing a recency flag to CentralIdLookup would solve the problem as well. It would be more problematic, as READ_LATEST would be useless in the vast majority of calls, but could be done. I'd still say CentralIdLookup::centralIdFromLocalUser probably shouldn't just return 0 if it can't find the ID in question, but I admit that's not CentralAuth's fault, as CA is following what MW prescribes (and documents)...

Yeah the control flow is baroque (and I'm not sure I understand it well enough to change it) - I think UserCreationHookHandler was created to catch (theoretical) situations where a non-CentralAuth authentication provider registers or autocreates users. And now temp users land there as well.

It would be more problematic, as READ_LATEST would be useless in the vast majority of calls, but could be done.

Why? It's run right after a DB change, seems like the right thing to do to me.

It would be more problematic, as READ_LATEST would be useless in the vast majority of calls, but could be done.

Why? It's run right after a DB change, seems like the right thing to do to me.

Not all the time though. shouldAssignBucket is called on every logged-in pageview, and it is also called from onLocalUserCreated. READ_LATEST is only reasonable if it is called from onLocalUserCreated, but passing info about the caller through all the layers is fairly tricky (the full route is onLocalUserCreated -> UserOptionsLookup -> ConditionalDefaultsLookup -> onConditionalDefaultOptionsAddCondition -> shouldAssignBucket).

You could check $loadBalancer->hasOrMadeRecentPrimaryChanges(). Not sure how much difference that would make. CentralAuth has code like that in a few places.

Change #1121784 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/extensions/CentralAuth@master] [WIP] CentralAuthIdLookup: Reuse cached object on single-value lookup

https://gerrit.wikimedia.org/r/1121784

Change #1121789 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/extensions/CentralAuth@master] [WIP] CentralAuthIdLookup: Use primary DB after writes

https://gerrit.wikimedia.org/r/1121789

Change #1121784 merged by jenkins-bot:

[mediawiki/extensions/CentralAuth@master] CentralAuthIdLookup: Reuse cached object on single-value lookup

https://gerrit.wikimedia.org/r/1121784

Change #1121789 merged by jenkins-bot:

[mediawiki/extensions/CentralAuth@master] CentralAuthIdLookup: Use primary DB after writes

https://gerrit.wikimedia.org/r/1121789

Assuming these patches work as expected, is there anything left to do here?

IMO if the patches do fix the GrowthExperiments cohort handling logic, this can be closed as invalid or declined.

Change #1124781 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/extensions/CentralAuth@wmf/1.44.0-wmf.18] CentralAuthIdLookup: Reuse cached object on single-value lookup

https://gerrit.wikimedia.org/r/1124781

Change #1124782 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/extensions/CentralAuth@wmf/1.44.0-wmf.18] CentralAuthIdLookup: Use primary DB after writes

https://gerrit.wikimedia.org/r/1124782

Change #1124781 merged by jenkins-bot:

[mediawiki/extensions/CentralAuth@wmf/1.44.0-wmf.18] CentralAuthIdLookup: Reuse cached object on single-value lookup

https://gerrit.wikimedia.org/r/1124781

Change #1124782 merged by jenkins-bot:

[mediawiki/extensions/CentralAuth@wmf/1.44.0-wmf.18] CentralAuthIdLookup: Use primary DB after writes

https://gerrit.wikimedia.org/r/1124782

Mentioned in SAL (#wikimedia-operations) [2025-03-05T16:19:50Z] <tgr@deploy2002> Started scap sync-world: Backport for [[gerrit:1124781|CentralAuthIdLookup: Reuse cached object on single-value lookup (T379909 T380500 T387106)]], [[gerrit:1124782|CentralAuthIdLookup: Use primary DB after writes (T379909 T380500)]], [[gerrit:1124783|Use UserOptionsManager for SUL3 rollout flag (T384549)]], [[gerrit:1124784|Make SUL3 global preference optional and simplify logic]], [[gerrit:1124785|Ad

Mentioned in SAL (#wikimedia-operations) [2025-03-05T16:22:50Z] <tgr@deploy2002> tgr: Backport for [[gerrit:1124781|CentralAuthIdLookup: Reuse cached object on single-value lookup (T379909 T380500 T387106)]], [[gerrit:1124782|CentralAuthIdLookup: Use primary DB after writes (T379909 T380500)]], [[gerrit:1124783|Use UserOptionsManager for SUL3 rollout flag (T384549)]], [[gerrit:1124784|Make SUL3 global preference optional and simplify logic]], [[gerrit:1124785|Add passive central do

Mentioned in SAL (#wikimedia-operations) [2025-03-05T16:54:47Z] <tgr@deploy2002> Finished scap sync-world: Backport for [[gerrit:1124781|CentralAuthIdLookup: Reuse cached object on single-value lookup (T379909 T380500 T387106)]], [[gerrit:1124782|CentralAuthIdLookup: Use primary DB after writes (T379909 T380500)]], [[gerrit:1124783|Use UserOptionsManager for SUL3 rollout flag (T384549)]], [[gerrit:1124784|Make SUL3 global preference optional and simplify logic]], [[gerrit:1124785|A

I looked at this again.

First, let me recap and expand what was already discussed above:

  • CentralAuth's onLocalUserCreated() hook is not relevant. It may create a central account and attach the local user to it, but this should only happen if a non-CentralAuth AuthenticationProvider authenticated the local user; on Wikimedia wikis this should be impossible ($wgCentralAuthStrict is true).
  • CentralAuthPrimaryAuthenticationProvider creates the central account in beginPrimaryAccountCreation() (ref), but it only attaches the local user to it in autoCreatedAccount() (ref) (during autocreation) or in finishAccountCreation() (ref) (during normal creation).
  • During autocreation, AuthManager calls autoCreatedAccount() on the provider before calling the onLocalUserCreated() hook. (ref) During normal creation, it's different, AuthManager calls the onLocalUserCreated() hook before calling finishAccountCreation() on the provider. (ref)
  • CentralAuthIdLookup::centralIdFromLocalUser() will only return an ID if local user is attached to the central account. (ref1) (ref2)

Now, what does this mean for us?

If I'm reading it right, it means that in a onLocalUserCreated() hook handler, when called with $autocreated==false, CentralAuthIdLookup::centralIdFromLocalUser() will always return 0, despite the fact that the global account already exists. This is annoying, but correct.

Instead, you may use CentralAuthIdLookup::centralIdFromName(), which will return the central account's ID even when the local user is not attached. (ref) On Wikimedia wikis, you can assume that the central account belongs to the local user regardless ($wgCentralAuthStrict is true), and that it will be attached shortly; or you can call CentralAuthIdLookup::isOwned() to double-check that (ref1) (ref2).

If all that sounds awful, and you need to know for sure whether a local account is attached or not regardless of config settings, then you may define your own AuthenticationProvider and implement the postAccountCreation() method and do your work there, instead of the onLocalUserCreated() hook. This doesn't seem too unpleasant, I even found an existing one in our codebase (ref) – added in T346327, so you may already be familiar with this approach.

@Urbanecm_WMF Sorry about the wall of text, but I couldn't make it any shorter. Does it answer your question?

(We should probably document some of these things better in LocalUserCreatedHook and in CentralIdLookup.)

matmarex claimed this task.

Let's continue on T379682.