Account creation attempt on mobile Wikipedia domain leads user to desktop Special:CentralLogin/complete, often in logged-out state
Closed, Resolved; Public; BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

  • Go to https://en.m.wikipedia.org/wiki/Special:CreateAccount
  • Fill in the form and submit

What happens?:

  • I am on https://en.wikipedia.org/wiki/Special:CentralLogin/complete?token={redacted} (note: desktop site, not mobile)
  • I see an error message: "No active login attempt is in progress for your session." (centralauth-error-nologinattempt)
  • I am not logged in, but the account has been created

What should have happened instead?:

  • I should have been on https://en.m.wikipedia.org/wiki/Special:WelcomeSurvey and I should be logged in

Other information (browser name/version, screenshots, etc.):

HTTP request workflow on eswiki:

I don't know Special:CentralLogin at all, but I assumed the tokens for the /start and /complete steps should be the same, and here they are not. However, on desktop the tokens for /start and /complete also differ, so that is not related to the problem.

Looking at the code, the place where the error is thrown is here in extensions/CentralAuth/includes/Special/SpecialCentralLogin.php:

// Get the user's current login attempt information
$attempt = $request->getSessionData( $skey );
if ( !isset( $attempt['secret'] ) ) {
	$this->showError( 'centralauth-error-nologinattempt' );
	return;
}
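
To illustrate the check above in isolation, here is a minimal sketch (PHP 8, plain arrays instead of MediaWiki's session classes). checkLoginAttempt() is a hypothetical helper, not part of CentralAuth; it only mirrors the isset() test from the excerpt, and it assumes that missing session data is represented as null. If the /complete request arrives without the session that /start wrote to (for example because the session cookie was not sent), the error branch is taken.

```php
<?php
declare( strict_types=1 );

/**
 * Hypothetical stand-in for the check in SpecialCentralLogin.php.
 * Returns the error message key, or null when a login attempt is found.
 */
function checkLoginAttempt( ?array $attempt ): ?string {
	// isset() is false both for a null $attempt and for a missing 'secret' key
	if ( !isset( $attempt['secret'] ) ) {
		return 'centralauth-error-nologinattempt';
	}
	return null;
}

// Session data written by /start is visible to /complete: no error
$withSession = checkLoginAttempt( [ 'secret' => 'example-secret' ] );

// Session data is not visible (assumed to be null here): error key returned
$withoutSession = checkLoginAttempt( null );
```

The sketch says nothing about why the session data is missing; it only shows that any request reaching this point without it produces the reported message.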

Event Timeline

kostajh triaged this task as Unbreak Now! priority. Apr 20 2023, 11:46 AM
kostajh updated the task description.
kostajh added a project: Mobile.
kostajh updated the task description.

Wild guess, might this have something to do with the datacenter switchover (T327920)? It doesn't seem that likely given that the creation workflow is functional on desktop, but bringing it up as a possibility.

Otherwise, I think the next step would be to start looking through changes to CentralAuth and MediaWiki core hooks / session code.

kostajh renamed this task from Account creation workflow is broken on mobile domain to Account creation attempt on mobile Wikipedia domain leads user to logged-out state on desktop Special:CentralLogin/complete. Apr 20 2023, 12:51 PM

This is probably T257852: CentralAuth edge login and autologin for some Wikimedia domains broken on mobile. Local login succeeds but central login fails, presumably because a mobile -> desktop redirect happens somewhere in the middle of the process and cookies get lost as a result (the way mobile domain support is hacked onto MediaWiki isn't very reliable). This would affect account creations as well (they are followed by central login just the same).

At the end, you are logged in on en.m.wikipedia.org but not on en.wikipedia.org. (If you aren't logged in on either, that would be an error in AuthManager, not in CentralAuth. That codebase doesn't change much, is heavily tested, and isn't affected much by browser behavior, so I wouldn't expect that to be the case.) Which of those domains your browser ends up on depends, I think, on how you arrived on the mobile domain (device autodetection vs. going to that domain directly vs. interacting with the mobile/desktop toggle).

T318138: Cannot manually log in on mobile Wikidata (real or test) is also the same issue. Not sure about T318138: Cannot manually log in on mobile Wikidata (real or test) which sounds the same but results in a different error (and it's reported on Beta so all bets are off anyway).
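
The cookie-loss hypothesis above can be sketched with the cookie matching rules from RFC 6265. This is an illustration only: cookieIsSent() is a hypothetical helper (PHP 8), and whether the cookies involved in central login are host-only or domain-wide in production is an assumption that this task does not confirm.

```php
<?php
declare( strict_types=1 );

/**
 * Hypothetical helper: would a browser send a cookie to $requestHost?
 * $domainAttr is the cookie's Domain attribute, or null if it had none.
 * Simplified from RFC 6265 (ignores Path, Secure, expiry and IP hosts).
 */
function cookieIsSent( string $setByHost, ?string $domainAttr, string $requestHost ): bool {
	$requestHost = strtolower( $requestHost );
	if ( $domainAttr === null ) {
		// Host-only cookie: sent only to the exact host that set it
		return $requestHost === strtolower( $setByHost );
	}
	// Domain cookie: sent to the domain itself and to its subdomains
	$domain = strtolower( ltrim( $domainAttr, '.' ) );
	return $requestHost === $domain || str_ends_with( $requestHost, '.' . $domain );
}

// A host-only cookie set on the mobile domain does not follow a redirect to desktop
$hostOnlyAfterRedirect = cookieIsSent( 'en.m.wikipedia.org', null, 'en.wikipedia.org' );

// A cookie scoped to the shared parent domain would be sent to both
$domainWideAfterRedirect = cookieIsSent( 'en.m.wikipedia.org', '.wikipedia.org', 'en.wikipedia.org' );
```

Under these assumptions, the first call returns false and the second returns true, which matches the described symptom of a session that exists on one domain but not on the other.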

Local login succeeds when I use WikimediaDebug to route traffic through a mwdebug server, but otherwise, I am logged out on both mobile and desktop domains.

Which also makes me wonder if the datacenter switchover (and multi-DC plumbing more generally) is somehow implicated, because the mwdebug servers are using codfw, and my browser is showing that the GET requests are being served by eqiad. I don't know why this would be failing to work on mobile domain requests, though, and succeeding on desktop.

kostajh renamed this task from Account creation attempt on mobile Wikipedia domain leads user to logged-out state on desktop Special:CentralLogin/complete to Account creation attempt on mobile Wikipedia domain leads user to desktop Special:CentralLogin/complete, often in logged-out state. Apr 20 2023, 1:38 PM

Special:CentralAutoLogin is forced to go to the primary DC but Special:CentralLogin isn't. I wonder if that's an intentional omission.

FWIW, mwdebug servers are different in a number of ways: they are slower, which might matter for race conditions; Varnish/ATS logic does not run; and your request always goes to the same machine, which might be relevant for things that use two layers of cache.

Tagging SRE for awareness and also to consider the above comment. Should Special:CentralLogin be forced to the primary DC?

That file is mostly written by @tstarling so I recommend getting his opinion on this.

Tim is back tomorrow. Is this still UBN, or can it be reduced to High?

kostajh lowered the priority of this task from Unbreak Now! to High. Apr 24 2023, 11:34 AM

One could argue that this task meets the standards for "Unbreak Now!" per https://wikitech.wikimedia.org/wiki/Deployments/Holding_the_train#Issues_that_hold_the_train

Major feature regressions
Inability to login/logout/create account for a large portion of users

But since affected users are sometimes logged in (though never on mobile) and the account is created, allowing the user to log in again, I suppose we could justify dropping this to "High" priority.

@Tgr is going to take a time-boxed look into this issue (~2 days).

Problem is still happening, even after the datacenter switchover was completed, so I think we can rule that out. (Assuming that the SessionStorage service is also back to eqiad; I have not verified that.)

matmarex subscribed.

Looks resolved by @Tgr's changes in T257852.

I tested by registering https://en.m.wikipedia.org/wiki/User:Matma_Rex_test_2023-11-20, and I was redirected to https://en.m.wikipedia.org/w/index.php?title=Special:WelcomeSurvey&returnto=&returntoquery=&group=control&_welcomesurveytoken=<snip> afterwards.