Page MenuHomePhabricator

MassMessage not working for delivery to Suwikiquote or Mlwikiquote: "CAS update failed on user_touched"
Open, Needs TriagePublic

Description

An expected delivery to their village pumps was not made. I tried to reproduce at a sandbox, and confirmed.

To reproduce:

  1. {{target | user = Quiddity (WMF)/sandbox | site = su.wikiquote.org }}
  2. {{target | user = Quiddity (WMF)/sandbox | site = ml.wikiquote.org }}
  • Nothing arrives.

Impact

Message delivery is failing due to a system user conflict.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJul 18 2017, 11:02 PM

Hmm.

2017-07-18 22:55:57 [WW6R2wpAME4AAGJWDeEAAABN] mw1299 mlwikiquote 1.30.0-wmf.9 exception ERROR: [WW6R2wpAME4AAGJWDeEAAABN] [no req]   MWException from line 4057 of /srv/mediawiki/php-1.30.0-wmf.9/includes/user/User.php: CAS update failed on user_touched for user ID '2110' (read from replica); the version of the user to be saved is older than the current version. {"exception_id":"WW6R2wpAME4AAGJWDeEAAABN","caught_by":"mwe_handler"} 
[Exception MWException] (/srv/mediawiki/php-1.30.0-wmf.9/includes/user/User.php:4057) CAS update failed on user_touched for user ID '2110' (read from replica); the version of the user to be saved is older than the current version.
  #0 /srv/mediawiki/php-1.30.0-wmf.9/includes/user/User.php(737): User->saveSettings()
  #1 /srv/mediawiki/php-1.30.0-wmf.9/extensions/MassMessage/includes/MassMessage.php(45): User::newSystemUser(string, array)
  #2 /srv/mediawiki/php-1.30.0-wmf.9/extensions/MassMessage/includes/job/MassMessageJob.php(64): MassMessage::getMessengerUser()
  #3 /srv/mediawiki/php-1.30.0-wmf.9/extensions/MassMessage/includes/job/MassMessageJob.php(289): MassMessageJob->getUser()
  #4 /srv/mediawiki/php-1.30.0-wmf.9/extensions/MassMessage/includes/job/MassMessageJob.php(208): MassMessageJob->makeAPIRequest(array)
  #5 /srv/mediawiki/php-1.30.0-wmf.9/extensions/MassMessage/includes/job/MassMessageJob.php(186): MassMessageJob->editPage()
  #6 /srv/mediawiki/php-1.30.0-wmf.9/extensions/MassMessage/includes/job/MassMessageJob.php(41): MassMessageJob->sendMessage()
  #7 /srv/mediawiki/php-1.30.0-wmf.9/includes/jobqueue/JobRunner.php(293): MassMessageJob->run()
  #8 /srv/mediawiki/php-1.30.0-wmf.9/includes/jobqueue/JobRunner.php(193): JobRunner->executeJob(MassMessageJob, Wikimedia\Rdbms\LBFactoryMulti, BufferingStatsdDataFactory, integer)
  #9 /srv/mediawiki/rpc/RunJobs.php(47): JobRunner->run(array)
  #10 {main}
Aklapper renamed this task from MassMessage not working for delivery to Suwikiquote or Mlwikiquote to MassMessage not working for delivery to Suwikiquote or Mlwikiquote: "CAS update failed on user_touched".Jul 19 2017, 9:45 AM

Also see T124414

Elitre added a subscriber: Elitre.Jul 19 2017, 11:55 AM
Nikerabbit added a subscriber: Nikerabbit.

Also happening elsewhere, example from aawiki:

 exception.message	   	CAS update failed on user_touched. The version of the user to be saved is older than the current version.
t exception.trace	   	#0 /srv/mediawiki/php-1.35.0-wmf.37/includes/libs/rdbms/database/Database.php(4264): User->{closure}(Wikimedia\Rdbms\DatabaseMysqli, string)
#1 /srv/mediawiki/php-1.35.0-wmf.37/includes/libs/rdbms/database/DBConnRef.php(68): Wikimedia\Rdbms\Database->doAtomicSection(string, Closure)
#2 /srv/mediawiki/php-1.35.0-wmf.37/includes/libs/rdbms/database/DBConnRef.php(641): Wikimedia\Rdbms\DBConnRef->__call(string, array)
#3 /srv/mediawiki/php-1.35.0-wmf.37/includes/user/User.php(3523): Wikimedia\Rdbms\DBConnRef->doAtomicSection(string, Closure)
#4 /srv/mediawiki/php-1.35.0-wmf.37/includes/user/User.php(844): User->saveSettings()
#5 /srv/mediawiki/php-1.35.0-wmf.37/extensions/MassMessage/includes/MassMessage.php(44): User::newSystemUser(string, array)
#6 /srv/mediawiki/php-1.35.0-wmf.37/extensions/MassMessage/includes/Job/MassMessageJob.php(93): MediaWiki\MassMessage\MassMessage::getMessengerUser()
#7 /srv/mediawiki/php-1.35.0-wmf.37/extensions/MassMessage/includes/Job/MassMessageJob.php(368): MediaWiki\MassMessage\Job\MassMessageJob->getUser()
#8 /srv/mediawiki/php-1.35.0-wmf.37/extensions/MassMessage/includes/Job/MassMessageJob.php(260): MediaWiki\MassMessage\Job\MassMessageJob->makeAPIRequest(array)
#9 [internal function]: MediaWiki\MassMessage\Job\MassMessageJob->editPage(string)
#10 /srv/mediawiki/php-1.35.0-wmf.37/extensions/MassMessage/includes/Job/MassMessageJob.php(240): call_user_func(array, string)
#11 /srv/mediawiki/php-1.35.0-wmf.37/extensions/MassMessage/includes/Job/MassMessageJob.php(69): MediaWiki\MassMessage\Job\MassMessageJob->sendMessage()
#12 /srv/mediawiki/php-1.35.0-wmf.37/extensions/EventBus/includes/JobExecutor.php(79): MediaWiki\MassMessage\Job\MassMessageJob->run()
#13 /srv/mediawiki/rpc/RunSingleJob.php(76): MediaWiki\Extension\EventBus\JobExecutor->execute(array)
#14 {main}

The trace is similar to T124414: MassMessage Fatal error: CAS update failed on user_touched but that has been closed. This is still open, so hijacking it as both traces have MassMessage::getMessengerUser in them.

mmodell added a project: Editing-team.
mmodell edited projects, added Contributors-Team; removed Editing-team.
Krinkle updated the task description. (Show Details)Wed, Jul 22, 7:35 PM
JJMC89 changed the subtype of this task from "Task" to "Production Error".Thu, Jul 23, 3:45 AM
Tgr added a subscriber: Tgr.Fri, Jul 24, 9:38 AM

At least on aawiki, MediaWiki message delivery is not a system user: it has a real user token. (But also an empty email and password, which suggests it was a system user at some point... not sure how that came to be.)
That triggers the "steal" branch of User::newSystemUser, which tries to change the token and save the user, which results in a CAS error. Either the job is conflicting with itself, like in T124414 (but that seems unlikely: why would it run multiple instances for aawiki? and more importantly, at least one of the user updates would succeed, and we'd see a valid token) or maybe the logic in User::newSystemUser conflicts with something done in one of the revokeAccessForUser handlers (but that also doesn't seem to be the case: the only things that would run are providerChangeAuthenticationData in LocalPasswordPrimaryAuthenticationProvider and TemporaryPasswordPrimaryAuthenticationProvider and those don't change user_touched).

Tgr added a comment.Fri, Jul 24, 9:52 AM

Running User::newSystemUser manually worked fine on aawiki.