CentralAuth login attempt gives "No active login attempt is in progress for your session"
Closed, Resolved (Public)

Description

I'm using MediaWiki 1.26 and CentralAuth on a wiki farm and want to upgrade to 1.27. When logging in on the new 1.27 wiki (all extensions are on 1.27 too), CentralAuth (Special:CentralLogin/complete?token=<token>) gives this error:

Central user log in
No active login attempt is in progress for your session.

The error seems to come from $attempt = $request->getSessionData( $skey ); followed by if ( !isset( $attempt['secret'] ) ). $attempt is NULL at that point.

Why can $attempt be null?

Event Timeline

	public function getSessionData( $key ) {
		return $this->getSession()->get( $key );
	}

and

	public function get( $key, $default = null ) {
		$data = &$this->backend->getData();
		return array_key_exists( $key, $data ) ? $data[$key] : $default;
	}

getSessionData() does a get() with a default of null. The key is not found, so get() returns null, and $attempt becomes null.
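
A minimal standalone sketch of that lookup, using a plain array in place of the session backend. The function name sessionGet and the key 'example-attempt-key' are hypothetical and not part of MediaWiki; the point is only to show why a missing key ends up as null and then fails the isset() check.

```php
<?php
// Hypothetical stand-in for the session get(): look up $key in the session
// data array and fall back to $default when the key is absent.
function sessionGet( array $data, $key, $default = null ) {
	return array_key_exists( $key, $data ) ? $data[$key] : $default;
}

// A session that carries no login-attempt entry (for example a fresh one).
$data = array();
$attempt = sessionGet( $data, 'example-attempt-key' );

var_dump( $attempt );                    // NULL
var_dump( isset( $attempt['secret'] ) ); // bool(false)
```

So the error message does not by itself say why the data is missing, only that the session being read has no entry under that key.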

Possibly your farm is not sharing sessions, which is a requirement for CentralAuth. Session storage configuration has changed for 1.27, you might want to review that.

My configuration may well be out of date. Like Wikimedia, we use Redis for sessions (production 1.26 config):

$wgObjectCaches['redis'] = array(
	'class' => 'RedisBagOStuff',
	'servers' => array( '<IP>' ),
	'password' => $wmgRedisPassword,
);
$wgMainCacheType = 'redis';
$wgSessionCacheType = 'redis';
$wgSessionsInObjectCache = true;
$wgMessageCacheType = CACHE_NONE;
$wgParserCacheType = CACHE_DB;
$wgLanguageConverterCacheType = CACHE_DB;

I am aware $wgSessionsInObjectCache is now true by default, but that should not introduce an issue, right? I compared our configuration with Wikimedia's, and was not able to find anything that I missed.

$wgSessionsInObjectCache is ignored; session data is always stored in the object cache. The config seems fine. (Also, my previous comment was wrong: the line of code you indicated checks the normal session, not the shared CentralAuth session. It would be strange if normal login worked but CA did not.) You'll probably have to debug in more detail what happens with the session: is the session cookie set correctly, and does the record with that key persist in Redis?
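
One possible starting point for that debugging is to route the session log channel to its own file. This is a minimal LocalSettings.php sketch, assuming the log directory exists and is writable by the web server. Both paths are placeholders, and the 'CentralAuth' channel name is an assumption about what the extension logs to.

```php
// Send the core 'session' channel to a dedicated file.
$wgDebugLogGroups['session'] = '/var/log/mediawiki/session.log';
// Assumed channel name for the extension's own messages.
$wgDebugLogGroups['CentralAuth'] = '/var/log/mediawiki/centralauth.log';
```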

Note to everyone: this issue was resolved (although 'invalid' seems more appropriate) on Miraheze simply by upgrading our CentralAuth submodule in our REL1_27 MediaWiki branch. I'm not sure exactly what the issue was, but all our wikis are now running REL1_27 with updated extensions.

https://github.com/miraheze/mediawiki/commit/3f6cdce98e3b6fc83949762b7162aef3d6d4236f

The same issue is occurring again at a high rate, and is causing problems. Also, there are users reporting they are silently being logged out of the site. I'm not sure how much they are related to each other, but I spotted this in the debug logs:

[session] Session "[50]CentralAuthSessionProvider<-:2:Southparkfan>[REDACTED]": Metadata merge failed: [Exception MediaWiki\Session\MetadataMergeException( /srv/mediawiki/w/includes/session/SessionProvider.php:195) Key "CentralAuthSource" changed]
#0 /srv/mediawiki/w/includes/session/SessionManager.php(629): MediaWiki\Session\SessionProvider->mergeMetadata(array, array)
#1 /srv/mediawiki/w/includes/session/SessionManager.php(498): MediaWiki\Session\SessionManager->loadSessionInfoFromStore(MediaWiki\Session\SessionInfo, WebRequest)
#2 /srv/mediawiki/w/includes/session/SessionManager.php(182): MediaWiki\Session\SessionManager->getSessionInfoForRequest(WebRequest)
#3 /srv/mediawiki/w/includes/WebRequest.php(700): MediaWiki\Session\SessionManager->getSessionForRequest(WebRequest)
#4 /srv/mediawiki/w/includes/session/SessionManager.php(121): WebRequest->getSession()
#5 /srv/mediawiki/w/includes/Setup.php(747): MediaWiki\Session\SessionManager::getGlobalSession()
#6 /srv/mediawiki/w/includes/WebStart.php(137): require_once(string)
#7 /srv/mediawiki/w/index.php(40): require(string)
#8 {main}
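
For readers unfamiliar with this log message, here is a simplified sketch of what a metadata merge check of this kind does. It assumes the check is essentially a strict per-key comparison between the metadata stored with the session and the metadata the provider supplies for the current request. The function name, the exception class and the example values are hypothetical; this is not the MediaWiki source.

```php
<?php
// Simplified, hypothetical metadata merge check: any key present in both
// arrays must hold an identical value, otherwise the merge is rejected.
function mergeMetadataSketch( array $saved, array $provided ) {
	foreach ( $provided as $key => $value ) {
		if ( array_key_exists( $key, $saved ) && $saved[$key] !== $value ) {
			throw new UnexpectedValueException( "Key \"$key\" changed" );
		}
	}
	return $provided;
}

// Example values are invented for illustration only.
$saved = array( 'CentralAuthSource' => 'CentralAuth' );
$provided = array( 'CentralAuthSource' => 'Local' );

try {
	mergeMetadataSketch( $saved, $provided );
	$message = null;
} catch ( UnexpectedValueException $e ) {
	$message = $e->getMessage();
}
var_dump( $message );
```

Under that assumption, the log line would mean that the stored session and the incoming request disagree about the "CentralAuthSource" value, so the stored session is not reused.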

We're running CentralAuth REL1_27 (with the latest commits), MediaWiki 1.27.1 and PHP 5.6.27. Redis configuration:

$wgObjectCaches['redis'] = array(
	'class' => 'RedisBagOStuff',
	'servers' => array( '<ip>:6379' ),
	'password' => $wmgRedisPassword,
);
$wgMainCacheType = 'redis';
$wgSessionCacheType = 'redis';
$wgSessionsInObjectCache = true;
$wgMessageCacheType = CACHE_NONE;
$wgParserCacheType = CACHE_DB;
$wgLanguageConverterCacheType = CACHE_DB;

Hopefully I was able to provide you more information this time.

@Aklapper The problem I reported could also be related to this. Any ideas on what the error could be and how it could be fixed?

I should also point out that we are now running REL1_28, 1.28 and PHP 5.6.29.

Just dropping by to note that one of the users of Polish Wikipedia recently reported a similar problem when logging in. No idea what it means though; I will try to investigate.

Changing priority to "High", as this issue is affecting a lot of users outside of upstream and, as seen above, some users on WMF wikis.

If you disagree with this change, please revert it, as I'm not really sure whether it is appropriate.

This has been the case for nearly a year so I don't see a sudden urgency (which translates to "high priority")?
Links to reports by non-upstream users very welcome.

Note that the bug description basically amounts to "central login is not working". (Which is fair; CentralAuth is super complicated, and finding out exactly what is failing is probably the bigger part of the work. But saying "wiki X is also affected by this" as if there was a single issue affecting all WMF and non-WMF wikis, and everything could be fixed with the same debugging effort, is potentially misleading.) I don't think this task will move forward unless there is a wiki owner who can reliably reproduce the issue and is willing to do a serious amount of debugging.

Logging that User:jfsamper seems to be experiencing this issue so that I don't forget. There do not appear to be any global locks or blocks affecting them. The user will hopefully be around tomorrow on #wikimedia-tech for further debugging.

For what it's worth, I can confirm it is still happening to users on the farm.

T169261 now seems a more plausible explanation for my particular report.

I just got this error on a WMF wiki, MediaWiki.org. This is a serious issue that is obviously happening on WMF wikis as well as others and should be addressed.

@Reception123: Please see and read T141482#3284142. In general, "bug XYZ should get fixed!" / "me too" / "+1" comments do not bring a task closer to resolution.

I've done quite a bit of debugging downstream and found that the issue may be T169261, but that wouldn't make sense, because we are using a version of CentralAuth that includes the fix that resolved that task. Anyway, here is what the debugging turned up:

Intended behavior (as seen on successful login to account Void):
Sets a minimum of ten cookies.

  • Three under login.miraheze.org
    • loginwikiUserID
    • loginwikiUserName
    • loginwiki_session
  • Three under meta.miraheze.org
    • metawikiUserID
    • metawikiUserName
    • metawiki_session
    • (There are others, but I believe they are irrelevant)
  • Four under .miraheze.org
    • centralauth_Session
    • centralauth_Token
    • centralauth_User
    • forceHTTPS (irrelevant)

Failed behavior (as seen on broken login for Voidwalker):

  • login.miraheze.org
    • loginwikiUserID
    • loginwikiUserName
    • loginwiki_session
  • meta.miraheze.org
    • loginnotify_prevlogins
    • metawikiUserName

A temporary login also generates:

  • meta.miraheze.org
    • UseCDNCache
    • UseDC

But those temporary cookies last less than a minute, and when they expire, so does the session. (I am unsure of the accuracy of this statement.)
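
The difference between the two cookie lists above can be computed mechanically. This is a minimal sketch in which the cookie names are copied from the lists; the helper missingCookies is hypothetical and only illustrates the comparison.

```php
<?php
// Cookie names copied from the two lists above, grouped by cookie domain.
$expected = array(
	'login.miraheze.org' => array( 'loginwikiUserID', 'loginwikiUserName', 'loginwiki_session' ),
	'meta.miraheze.org' => array( 'metawikiUserID', 'metawikiUserName', 'metawiki_session' ),
	'.miraheze.org' => array( 'centralauth_Session', 'centralauth_Token', 'centralauth_User', 'forceHTTPS' ),
);
$observed = array(
	'login.miraheze.org' => array( 'loginwikiUserID', 'loginwikiUserName', 'loginwiki_session' ),
	'meta.miraheze.org' => array( 'loginnotify_prevlogins', 'metawikiUserName' ),
);

// Hypothetical helper: list the expected cookies that are absent, per domain.
function missingCookies( array $expected, array $observed ) {
	$missing = array();
	foreach ( $expected as $domain => $names ) {
		$seen = isset( $observed[$domain] ) ? $observed[$domain] : array();
		$diff = array_values( array_diff( $names, $seen ) );
		if ( $diff ) {
			$missing[$domain] = $diff;
		}
	}
	return $missing;
}

print_r( missingCookies( $expected, $observed ) );
```

For these lists it reports metawikiUserID and metawiki_session as missing under meta.miraheze.org, plus all four cookies under .miraheze.org.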

Following that, I took a look at my network headers and found this:

set-cookie: centralauth_Token=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; domain=.miraheze.org; secure; HttpOnly
set-cookie: centralauth_Session=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; domain=.miraheze.org; secure; HttpOnly
set-cookie: metawikiUserID=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; secure; HttpOnly
set-cookie: centralauth_User=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; domain=.miraheze.org; secure; HttpOnly
set-cookie: metawiki_session=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; secure; HttpOnly
set-cookie: forceHTTPS=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; HttpOnly
set-cookie: forceHTTPS=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; domain=.miraheze.org; HttpOnly

This mimics T168858, but again, we should have the fix for that.
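
When comparing captures like the one above across several requests, it can help to extract just the names of the cookies a response deletes. This is a minimal sketch, assuming the header lines have been saved as plain strings. The helper name is hypothetical, and it treats a cookie as deleted when its value is "deleted" or the line carries Max-Age=0. The sample input is a subset of the headers quoted above.

```php
<?php
// Hypothetical helper: return the names of cookies that a response deletes.
function deletedCookieNames( array $headerLines ) {
	$names = array();
	foreach ( $headerLines as $line ) {
		// Capture the cookie name and value from a Set-Cookie header line.
		if ( !preg_match( '/^set-cookie:\s*([^=;\s]+)=([^;]*)/i', $line, $m ) ) {
			continue;
		}
		if ( $m[2] === 'deleted' || preg_match( '/;\s*Max-Age=0\b/i', $line ) ) {
			$names[] = $m[1];
		}
	}
	// The same cookie may be deleted for several domains; list it once.
	return array_values( array_unique( $names ) );
}

$headers = array(
	'set-cookie: centralauth_Token=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; domain=.miraheze.org; secure; HttpOnly',
	'set-cookie: metawiki_session=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; secure; HttpOnly',
	'set-cookie: forceHTTPS=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; HttpOnly',
	'set-cookie: forceHTTPS=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; domain=.miraheze.org; HttpOnly',
);
print_r( deletedCookieNames( $headers ) );
```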

Attempting to invalidate the session by resetting the password (via email) failed: it neither logged me in nor changed the password.

Also, I just want to remark that I am concerned that my own debugging is not for the same issue as the original that we encountered, as that one could be fixed or bypassed by the user. So far, the account I triggered this on has remained inaccessible for over 24h.

TheVoidwalker assigned this task to Paladox.

Handled with downstream config fixes. Besides, if there is another issue, it appears to be next to impossible to debug in the current situation (too intermittent/unreliable).