Page MenuHomePhabricator

edit.php and rollbackEdits.php scripts failing with MWException "CAS update failed on user_touched"
Closed, ResolvedPublic

Description

Happens always:

2018-06-15 14:55:22 web1.translatewiki.net translatewiki_net-bw_: [8df56189ced613d5ea87879e] [no req]   MWException from line 4203 of /srv/mediawiki/workdir/includes/user/User.php: CAS update failed on user_touched for user ID '294' (read from replica); the version of the user to be saved is older than the current version.
#0 /srv/mediawiki/workdir/includes/libs/rdbms/database/Database.php(3726): User->{closure}(Wikimedia\Rdbms\DatabaseMysqli, string)
#1 /srv/mediawiki/workdir/includes/user/User.php(4217): Wikimedia\Rdbms\Database->doAtomicSection(string, Closure)
#2 /srv/mediawiki/workdir/includes/user/User.php(864): User->saveSettings()
#3 /srv/mediawiki/workdir/maintenance/rollbackEdits.php(79): User::newSystemUser(string, array)
#4 /srv/mediawiki/workdir/maintenance/doMaintenance.php(102): RollbackEdits->execute()
#5 /srv/mediawiki/workdir/maintenance/rollbackEdits.php(121): require_once(string)
#6 {main}

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJun 15 2018, 2:56 PM
Nemo_bis triaged this task as Medium priority.Jun 15 2018, 3:58 PM
Peachey88 updated the task description. (Show Details)Jun 15 2018, 10:19 PM
Vvjjkkii renamed this task from rollbackEdits.php broken to ztaaaaaaaa.Jul 1 2018, 1:03 AM
Vvjjkkii raised the priority of this task from Medium to High.
Vvjjkkii updated the task description. (Show Details)
Vvjjkkii removed a subscriber: Aklapper.
JJMC89 renamed this task from ztaaaaaaaa to rollbackEdits.php broken.Jul 1 2018, 1:36 AM
JJMC89 lowered the priority of this task from High to Medium.
JJMC89 updated the task description. (Show Details)
JJMC89 added a subscriber: Aklapper.

This affects all maintenance scripts (including deleteBatch.php and edit.php) which try to create system users.

Krinkle renamed this task from rollbackEdits.php broken to rollbackEdits.php fails with MWException "CAS update failed on user_touched".Jul 14 2018, 4:03 AM
Krinkle renamed this task from rollbackEdits.php fails with MWException "CAS update failed on user_touched" to edit.php and rollbackEdits.php scripts failing with MWException "CAS update failed on user_touched".Jul 14 2018, 4:07 AM
Krinkle added subscribers: Jidanni, Reedy.

@Nikerabbit I assume that with all the other CAS tasks being filed and closed, this is probably solved as well. Could you confirm that?

Anomie added a subscriber: Anomie.Jul 17 2018, 1:42 AM

I poked at this and couldn't manage to reproduce it locally, both with an existing maintenance user and with needing the creation of the maintenance user.

Still happens in 22dd581f1:

php maintenance/edit.php Ploppa
[60c1dfcf693146bf2853e8bc] [no req]   MWException from line 4207 of /www/dev.translatewiki.net/docroot/w/includes/user/User.php: CAS update failed on user_touched for user ID '294' (read from replica); the version of the user to be saved is older than the current version.
Backtrace:
#0 /www/dev.translatewiki.net/docroot/w/includes/libs/rdbms/database/Database.php(3731): User->{closure}(Wikimedia\Rdbms\DatabaseMysqli, string)
#1 /www/dev.translatewiki.net/docroot/w/includes/user/User.php(4221): Wikimedia\Rdbms\Database->doAtomicSection(string, Closure)
#2 /www/dev.translatewiki.net/docroot/w/includes/user/User.php(868): User->saveSettings()
#3 /www/dev.translatewiki.net/docroot/w/maintenance/edit.php(57): User::newSystemUser(string, array)
#4 /www/dev.translatewiki.net/docroot/w/maintenance/doMaintenance.php(94): EditCLI->execute()
#5 /www/dev.translatewiki.net/docroot/w/maintenance/edit.php(107): require_once(string)
#6 {main}
Anomie added a comment.Aug 9 2018, 1:15 PM

It looks like you may have to do some debugging yourself to determine what exactly is making this happen. My guess at this point that it's an interaction with some extension you're running (or some locally-configured hook function) that is somehow changing user_touched in between where User::newSystemUser() loads the row and where it tries to save the changed settings.

Any suggestions where or what to look for? Anything updating user_touched?

Anomie added a comment.Aug 9 2018, 1:39 PM

Basically, although it's probably going to turn out to be something creating another User object and calling ->saveSettings() or the like on it.

Personally, I'd start by inserting echo "XXX1: " . $dbw->selectField( 'user', 'user_touched', 'user_id = 294' ) . "\n"; lines in User::newSystemUser() (incrementing the "XXX1" for each one) between each call to other methods to find which is causing it to change. And then repeating as necessary inside the culprit method. But there might be better ways if you're using a real IDE.

It was happening inside User::saveSettings. Curiously, just adding few debugging lines in that function seemed to make the problem go away temporarily (it's back after removing those debugging lines)

Ah, the problem did seem to go away, because I performed a query before if ( !$dbw->affectedRows() ) . Also made ir print queries:

UPDATE  `bw_user` SET user_name = 'Maintenance script',user_real_name = '',user_email = '',user_email_authenticated = NULL,user_touched = '20180814133050',user_token = '*** INVALID ***',user_email_token = NULL,user_email_token_expires = NULL WHERE user_id = '294' AND user_touched = '20180814133045'

This fails because:

MariaDB [dev_translatewiki_net]> select user_id, user_touched from bw_user where user_id=294\G
*************************** 1. row ***************************
     user_id: 294
user_touched: 
1 row in set (0.00 sec)

My theory would be that somewhere in the code missing user_touched is being interpreted as current timestamp, which causes the mismatch and the update command to fail.

MariaDB [dev_translatewiki_net]> select user_id, user_touched from bw_user where user_id=294\G
*************************** 1. row ***************************
     user_id: 294
user_touched: 
1 row in set (0.00 sec)

Hmm. The User class expects that user_touched is always a timestamp. See https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/c1d7412ce2b7536a8a0e9ad1c7409fdf0a98502a/includes/user/User.php#1538

Did some bug result in an empty user_touched for that user, or did it get "manually"[1] added to your database at some point?

[1]: Either actually manually, or hacked up by some old maintenance script

Did some bug result in an empty user_touched for that user, or did it get "manually"[1] added to your database at some point?

[1]: Either actually manually, or hacked up by some old maintenance script

That "user" #294 is indeed https://translatewiki.net/wiki/Special:Contributions/Maintenance_script

I have three such accounts at translatewiki.net where user_touched is not set. Two of them are maintenance scripts. I think this should be handled in core, as other installations might encounter the same issue.

Updated trace:

MWException from line 3965 of /srv/mediawiki/workdir/includes/user/User.php: CAS update failed on user_touched. The version of the user to be saved is older than the current version.
#0 /srv/mediawiki/workdir/includes/libs/rdbms/database/Database.php(3992): User->{closure}(Object(Wikimedia\Rdbms\DatabaseMysqli), 'User::saveSetti...')
#1 /srv/mediawiki/workdir/includes/libs/rdbms/database/DBConnRef.php(68): Wikimedia\Rdbms\Database->doAtomicSection('User::saveSetti...', Object(Closure))
#2 /srv/mediawiki/workdir/includes/libs/rdbms/database/DBConnRef.php(638): Wikimedia\Rdbms\DBConnRef->__call('doAtomicSection', Array)
#3 /srv/mediawiki/workdir/includes/user/User.php(3976): Wikimedia\Rdbms\DBConnRef->doAtomicSection('User::saveSetti...', Object(Closure))
#4 /srv/mediawiki/workdir/includes/user/User.php(795): User->saveSettings()
#5 /srv/mediawiki/workdir/maintenance/edit.php(63): User::newSystemUser('Maintenance scr...', Array)
#6 /srv/mediawiki/workdir/maintenance/doMaintenance.php(107): EditCLI->execute()
#7 /srv/mediawiki/workdir/maintenance/edit.php(132): require_once('/srv/mediawiki/...')
#8 {main}
[10e43c3b08575a3ee48f6310] [no req]   Wikimedia\Rdbms\DBTransactionStateError from line 1433 of /srv/mediawiki/workdir/includes/libs/rdbms/database/Database.php: Cannot execute query from Wikimedia\Rdbms\Database::ping while transaction status is ERROR
Backtrace:
#0 /srv/mediawiki/workdir/includes/libs/rdbms/database/Database.php(1159): Wikimedia\Rdbms\Database->assertQueryIsCurrentlyAllowed(string, string)
#1 /srv/mediawiki/workdir/includes/libs/rdbms/database/Database.php(4322): Wikimedia\Rdbms\Database->query(string, string, integer)
#2 /srv/mediawiki/workdir/includes/libs/rdbms/database/Database.php(703): Wikimedia\Rdbms\Database->ping(NULL)
#3 /srv/mediawiki/workdir/includes/libs/rdbms/database/Database.php(693): Wikimedia\Rdbms\Database->pingAndCalculateLastTrxApplyTime()
#4 /srv/mediawiki/workdir/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1640): Wikimedia\Rdbms\Database->pendingWriteQueryDuration(string)
#5 /srv/mediawiki/workdir/includes/libs/rdbms/loadbalancer/LoadBalancer.php(2144): Wikimedia\Rdbms\LoadBalancer->Wikimedia\Rdbms\{closure}(Wikimedia\Rdbms\DatabaseMysqli)
#6 /srv/mediawiki/workdir/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1656): Wikimedia\Rdbms\LoadBalancer->forEachOpenMasterConnection(Closure)
#7 /srv/mediawiki/workdir/includes/libs/rdbms/lbfactory/LBFactory.php(208): Wikimedia\Rdbms\LoadBalancer->approveMasterChanges(array, string, integer)
#8 /srv/mediawiki/workdir/includes/libs/rdbms/lbfactory/LBFactorySimple.php(125): Wikimedia\Rdbms\LBFactory->Wikimedia\Rdbms\{closure}(Wikimedia\Rdbms\LoadBalancer, string, array)
#9 /srv/mediawiki/workdir/includes/libs/rdbms/lbfactory/LBFactory.php(210): Wikimedia\Rdbms\LBFactorySimple->forEachLB(Closure, array)
#10 /srv/mediawiki/workdir/includes/libs/rdbms/lbfactory/LBFactory.php(269): Wikimedia\Rdbms\LBFactory->forEachLBCallMethod(string, array)
#11 /srv/mediawiki/workdir/maintenance/doMaintenance.php(132): Wikimedia\Rdbms\LBFactory->commitMasterChanges(string)
#12 /srv/mediawiki/workdir/maintenance/edit.php(132): require_once(string)
#13 {main}
Anomie added a comment.Nov 6 2019, 7:30 PM

The quick fix here would be to set user_touched to a timestamp for those three users, on the assumption that it's some old DB corruption that has been fixed at some point, then see if the problem reoccurs.

A more involved fix would be to have User somehow accept user_touched being empty. So far there's no evidence for this being a problem outside of these three users on translatewiki though, so that may be overkill.

Anomie closed this task as Resolved.Nov 7 2019, 5:14 PM

Ok, let's mark this as "Resolved" for now. If the problem shows up again, please reopen.