
Global rename of Erik_Fastman to Glorious_Engine stuck "in progress" since 28th February on wikidatawiki
Closed, Resolved · Public

Description

This global rename has been stuck "in progress" on wikidatawiki since 28 February. The user has relatively few edits there, but more than 100,000 edits globally.

Triaged as Unbreak Now because the user is unable to use their account until the rename fully finishes, in accordance with the Wikimedia-Site-requests project triaging rules. Feel free to lower it to High if you feel that is more appropriate, though.

Addendum: the issue has been raised at idwiki.

Event Timeline

Restricted Application added a subscriber: Aklapper. Mar 2 2018, 11:53 AM
MarcoAurelio triaged this task as Unbreak Now! priority. Mar 2 2018, 11:53 AM

Per above.

Restricted Application added subscribers: Liuxinyu970226, TerraCodes, Dereckson. Mar 2 2018, 11:53 AM

I think global renamers must take more care, and not rename any user with more than 100,000 edits without a sysadmin's supervision!

MarcoAurelio added a comment. Edited · Mar 2 2018, 11:58 AM

@alanajjar I already told the renamer that very day to be more careful next time. No reply, though.

"in accordance to the Wikimedia-Site-requests project triaging rules."

Where can those rules be found?

MarcoAurelio updated the task description. Mar 2 2018, 12:44 PM
Samtar added a subscriber: Samtar. Mar 3 2018, 1:00 PM

Hello there. Could anybody please take a look at this? There's documentation at https://wikitech.wikimedia.org/wiki/Stuck_global_renames. Thanks.

Samtar added a comment. Mar 4 2018, 4:56 PM

Following the advice of the document linked above, I have checked logstash.

There have been no exceptions in logstash since the rename began. The last channel:Renameuser entry was from 2018-02-28T15:02:01 and read "GlobalRename: Starting rename of Erik Fastman to Glorious Engine on wawiki". I can't see any mention of wikidata.

I'm going to give this a try...

Mentioned in SAL (#wikimedia-operations) [2018-03-04T18:05:29Z] <musikanimal> T188721 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=wikidatawiki --logwiki=metawiki 'Erik Fastman' 'Glorious Engine'

I followed the instructions, and the script output "Starting to run job... Done!". I see an entry under channel:CentralAuthRename in logstash for wikidatawiki (where it got stuck) that says "skipping duplicate rename from {user}". Meanwhile, showJobs.php --wiki=wikidatawiki --type RenameUserJob (and LocalRenameUserJob) still reports 0, and the same goes for wikimania2014wiki, which is the next wiki in the list.

Deferring to someone who has a better idea of what they're doing... Sorry I couldn't figure it out!

Thanks for trying @MusikAnimal. Maybe @Legoktm or @Tgr could take a look here? Thanks.

Tgr added a subscriber: aaron. Mar 4 2018, 8:13 PM

Seems like a query timeout:

Expectation (readQueryTime <= 30) by JobRunner::run not met (actual: 83.462260007858):
query-m: SELECT rev_timestamp FROM `revision` WHERE rev_user_text = 'X' ORDER BY rev_timestamp ASC  [TRX#a8aa24]
#0 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/TransactionProfiler.php(223): Wikimedia\Rdbms\TransactionProfiler->reportExpectationViolated()
#1 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/Database.php(1055): Wikimedia\Rdbms\TransactionProfiler->recordQueryCompletion()
#2 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/Database.php(953): Wikimedia\Rdbms\Database->doProfiledQuery()
#3 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/Database.php(1380): Wikimedia\Rdbms\Database->query()
#4 /srv/mediawiki/php-1.31.0-wmf.22/extensions/Renameuser/RenameuserSQL.php(262): Wikimedia\Rdbms\Database->select()
#5 /srv/mediawiki/php-1.31.0-wmf.22/extensions/CentralAuth/includes/LocalRenameJob/LocalRenameUserJob.php(86): RenameuserSQL->rename()
#6 /srv/mediawiki/php-1.31.0-wmf.22/extensions/CentralAuth/includes/LocalRenameJob/LocalRenameJob.php(63): LocalRenameUserJob->doRun()
#7 /srv/mediawiki/php-1.31.0-wmf.22/includes/jobqueue/JobRunner.php(294): LocalRenameJob->run()
#8 /srv/mediawiki/php-1.31.0-wmf.22/includes/jobqueue/JobRunner.php(193): JobRunner->executeJob()
#9 /srv/mediawiki/rpc/RunJobs.php(47): JobRunner->run()
#10 {main}

(log entry)
This fails the job, which calls rollbackMasterChanges, which dies with

RuntimeException from line 776 of /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/Database.php: Transaction callbacks still pending.
#0 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/DatabaseMysqlBase.php(129): Wikimedia\Rdbms\Database->close()
#1 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/Database.php(3257): Wikimedia\Rdbms\DatabaseMysqlBase->open(string, string, string, string)
#2 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/Database.php(963): Wikimedia\Rdbms\Database->reconnect()
#3 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/Database.php(3139): Wikimedia\Rdbms\Database->query(string, string, boolean)
#4 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/Database.php(3105): Wikimedia\Rdbms\Database->doRollback(string)
#5 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1376): Wikimedia\Rdbms\Database->rollback(string, string)
#6 [internal function]: Closure$Wikimedia\Rdbms\LoadBalancer::rollbackMasterChanges(Wikimedia\Rdbms\DatabaseMysqli)
#7 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1591): call_user_func_array(Closure$Wikimedia\Rdbms\LoadBalancer::rollbackMasterChanges;4899, array)
#8 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1382): Wikimedia\Rdbms\LoadBalancer->forEachOpenMasterConnection(Closure$Wikimedia\Rdbms\LoadBalancer::rollbackMasterChanges;4899)
#9 [internal function]: Wikimedia\Rdbms\LoadBalancer->rollbackMasterChanges(string)
#10 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/lbfactory/LBFactory.php(184): call_user_func_array(array, array)
#11 [internal function]: Closure$Wikimedia\Rdbms\LBFactory::forEachLBCallMethod(Wikimedia\Rdbms\LoadBalancer, string, array)
#12 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/lbfactory/LBFactoryMulti.php(425): call_user_func_array(Closure$Wikimedia\Rdbms\LBFactory::forEachLBCallMethod;4824, array)
#13 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/lbfactory/LBFactory.php(187): Wikimedia\Rdbms\LBFactoryMulti->forEachLB(Closure$Wikimedia\Rdbms\LBFactory::forEachLBCallMethod;4824, array)
#14 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/lbfactory/LBFactory.php(247): Wikimedia\Rdbms\LBFactory->forEachLBCallMethod(string, array)
#15 /srv/mediawiki/php-1.31.0-wmf.22/extensions/CentralAuth/includes/LocalRenameJob/LocalRenameJob.php(67): Wikimedia\Rdbms\LBFactory->rollbackMasterChanges(string)
#16 /srv/mediawiki/php-1.31.0-wmf.22/includes/jobqueue/JobRunner.php(294): LocalRenameJob->run()
#17 /srv/mediawiki/php-1.31.0-wmf.22/includes/jobqueue/JobRunner.php(193): JobRunner->executeJob(LocalRenameUserJob, Wikimedia\Rdbms\LBFactoryMulti, BufferingStatsdDataFactory, integer)
#18 /srv/mediawiki/rpc/RunJobs.php(47): JobRunner->run(array)
#19 {main}

(log event)
so an exception is thrown from the exception handler and the catch block cannot set the rename status to failed (which would allow for retries).

This is pretty broken on the core DB handling level. @aaron any idea how to fix it?
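The failure mode described above can be illustrated with a small, language-neutral sketch. This is not the CentralAuth or MediaWiki code (which is PHP); the function names `run_rename_job`, `run_rename_job_guarded`, and `demo`, and the `RollbackError` class, are hypothetical stand-ins. It only shows why a rollback that throws inside the error handler keeps the status from being set to failed, and one possible way to guard against that.

```python
class RollbackError(RuntimeError):
    """Hypothetical stand-in for the 'Transaction callbacks still pending' failure."""


def run_rename_job(do_rename, rollback, set_status):
    """Fragile handler: if rollback() raises, the status update never runs."""
    try:
        do_rename()
        set_status("done")
    except Exception:
        rollback()            # may itself raise
        set_status("failed")  # skipped whenever rollback() raises
        raise


def run_rename_job_guarded(do_rename, rollback, set_status):
    """Guarded handler: the status is recorded even if rollback() raises."""
    try:
        do_rename()
        set_status("done")
    except Exception:
        try:
            rollback()
        finally:
            set_status("failed")
        raise


def demo(runner):
    """Run a job whose rename times out and whose rollback also fails."""
    statuses = []

    def timed_out():
        raise TimeoutError("query exceeded the expected read time")

    def broken_rollback():
        raise RollbackError("transaction callbacks still pending")

    try:
        runner(timed_out, broken_rollback, statuses.append)
    except Exception:
        pass  # the job still fails; only the recorded status matters here
    return statuses
```

With the fragile handler, `demo(run_rename_job)` records no status at all, which corresponds to a rename left "in progress"; with the guarded handler, `demo(run_rename_job_guarded)` records "failed", which would allow a retry.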

Tgr added a comment. Mar 4 2018, 8:13 PM

As for this specific rename, the user table still has the old name, so it looks like the rename didn't start (or at least the first transaction could not be committed), and it can just be force-retried.

Mentioned in SAL (#wikimedia-operations) [2018-03-04T20:16:38Z] <tgr> T188721 ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=wikidatawiki --ignorestatus --logwiki=metawiki 'Erik Fastman' 'Glorious Engine'

Tgr added a comment. Mar 4 2018, 8:39 PM
tgr@terbium:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=wikidatawiki --ignorestatus --logwiki=metawiki 'Erik Fastman' 'Glorious Engine'
Using Maxim as the renamer.
from: Erik Fastman
to: Glorious Engine
renamer: Maxim
movepages: 1
suppressredirects: 
reason: per [[m:Special:GlobalRenameQueue/request/40591|request]]
ignorestatus: 1

Starting to run job...
Done!

aaron added a comment. Mar 4 2018, 8:49 PM

Reconnecting in the case of a rollback is a corner case, since normally just closing like that should error out. If ROLLBACK fails due to connection loss, there really isn't a need to reconnect, since everything should have rolled back on connection loss in the first place. Some sort of flag to disable reconnection during rollback would be needed.
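A minimal sketch of what such a flag might look like, written in Python purely for illustration. The `Connection` class, the `allow_reconnect` parameter, and the `ConnectionLost` exception are all hypothetical and do not correspond to the actual Wikimedia\Rdbms API; the assumption, taken from the comment above, is that the server has already discarded the open transaction once the connection is gone.

```python
class ConnectionLost(Exception):
    """Hypothetical error raised when the DB server has gone away."""


class Connection:
    """Toy connection wrapper; every name here is an assumption."""

    def __init__(self, send):
        # send: callable that executes SQL, or raises ConnectionLost
        self._send = send
        self.reconnects = 0

    def _reconnect(self):
        # A real implementation would reopen the socket here.
        self.reconnects += 1

    def query(self, sql, allow_reconnect=True):
        """Run a query, retrying once after a reconnect unless disabled."""
        try:
            return self._send(sql)
        except ConnectionLost:
            if not allow_reconnect:
                raise
            self._reconnect()
            return self._send(sql)

    def rollback(self):
        """Roll back without ever reconnecting."""
        try:
            self.query("ROLLBACK", allow_reconnect=False)
        except ConnectionLost:
            # Assumed: the lost connection already rolled the transaction
            # back on the server side, so there is nothing left to undo.
            pass
```

In this sketch an ordinary query still reconnects and retries once, while `rollback()` treats a lost connection as an already-completed rollback instead of reopening the connection.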

Tgr closed this task as Resolved. Mar 4 2018, 8:51 PM
Tgr claimed this task.

The rename is done. I updated the help page to hopefully be more helpful in the future, and filed T188875 (Unexpected errors when ROLLBACK fails due to the DB server having "gone away") about the DB-level issue.

MarcoAurelio moved this task from Backlog to Closed on the GlobalRename board. Sep 1 2018, 12:16 PM