https://meta.wikimedia.org/wiki/Special:GlobalRenameProgress/Neurax%C4%B1s is currently stuck on Commons, listed as "In Progress" for more than 24 hours now. All other renames have completed successfully. https://meta.wikimedia.org/wiki/Special:CentralAuth/Neurax%C4%B1s doesn't display any message about a rename being in progress, while https://meta.wikimedia.org/wiki/Special:CentralAuth/Gautehuus shows the unattached account on Commons.
Related Objects
- Mentioned In
  - T147850: GlobalRenameProgress: display status "failed" if the rename failed, instead of "in progress".
  - T147825: Global rename: Zhyar Merlin to Zhiar Merlin is stuck
- Mentioned Here
  - rECAU02bc47808976: use DB_REPLICA instead of deprecated DB_SLAVE
  - T145596: Renames getting stuck on mediawiki.org (Sept 13, 2016)
Event Timeline
Given that past MW trains have negatively affected the resolution of stuck global renames, I think it's best to resolve this before installing new MW versions.
Some doc to fix it up is on T145596#2640418
[centralauth]> select * from renameuser_status group by ru_oldname, ru_newname;
+------------+------------+-------------+------------+
| ru_oldname | ru_newname | ru_wiki     | ru_status  |
+------------+------------+-------------+------------+
| Gautehuus  | Neuraxıs   | commonswiki | inprogress |
+------------+------------+-------------+------------+
1 row in set (0.00 sec)
I could not find any exception/error in the logs so it is a mystery.
I have manually triggered the job:
$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=commonswiki --logwiki=metawiki Gautehuus Neuraxıs
Using Céréales Killer as the renamer.
from: Gautehuus
to: Neuraxıs
renamer: Céréales Killer
movepages: 1
suppressredirects:
reason: per [[m:Special:GlobalRenameQueue/request/27249|request]]
Starting to run job...
Done!
But the rename is still stuck and I still can't find any log/exception :(
https://quarry.wmflabs.org/query/12886 indeed shows the rename is still stuck. Maybe run the script again on terbium and see what happens?
The script was recently modified: rECAU02bc47808976: use DB_REPLICA instead of deprecated DB_SLAVE
Moving to UBN since this task is blocking the MW train that's supposed to happen shortly.
I added the blocker since past MediaWiki trains worsened the situation of stuck global renames (see previous tickets). If you think that this should not be a blocker, then feel free to remove the parent task. Thank you for your understanding.
It's not related to any new code and it affected a single user on a single wiki so far, so IMO not UBN and no reason to halt the train.
The two obvious problems are that LocalRenameJob skips users with an inprogress state and that it tries to log this in the non-existent rename channel. Will fix manually when I'm in the office.
For the rename jobs on the subsequent wikis to have run, the rename on Commons must have completed successfully, with the database changes somehow rolled back later on.
I think these are probably the relevant log entries:
2016-09-28 17:46:54 [V@wBjApAIDgAAGhMK6EAAACG] mw1301 commonswiki 1.28.0-wmf.20 runJobs DEBUG: LocalRenameUserJob Global_rename_job from=Gautehuus to=Neuraxıs renamer=Céréales Killer reattach=array(107) movepages=1 suppressredirects= promotetoglobal= reason=per [[m:Special:GlobalRenameQueue/request/27249|request]] session={"ip":"REDACTED","headers":"array(...)","sessionId":"","userId":0} force= requestId=V@wBjApAIDgAAGhMK6EAAACG (uuid=202a64657a9640158650458bd21fcf50,timestamp=1475084804,QueuePartition=rdb1-6379) STARTING
2016-09-28 17:46:54 [V@wBjApAIDgAAGhMK6EAAACG] mw1301 commonswiki 1.28.0-wmf.20 runJobs INFO: LocalRenameUserJob Global_rename_job from=Gautehuus to=Neuraxıs renamer=Céréales Killer reattach=array(107) movepages=1 suppressredirects= promotetoglobal= reason=per [[m:Special:GlobalRenameQueue/request/27249|request]] session={"ip":"REDACTED","headers":"array(...)","sessionId":"","userId":0} force= requestId=V@wBjApAIDgAAGhMK6EAAACG (uuid=202a64657a9640158650458bd21fcf50,timestamp=1475084804,QueuePartition=rdb1-6379) COMMIT ENQUEUED [426ms of writes]
2016-09-28 17:47:24 [V@wBjApAIDgAAGhMK6EAAACG] mw1301 commonswiki 1.28.0-wmf.20 runJobs ERROR: LocalRenameUserJob Global_rename_job from=Gautehuus to=Neuraxıs renamer=Céréales Killer reattach=array(107) movepages=1 suppressredirects= promotetoglobal= reason=per [[m:Special:GlobalRenameQueue/request/27249|request]] session={"ip":"REDACTED","headers":"array(...)","sessionId":"","userId":0} force= requestId=V@wBjApAIDgAAGhMK6EAAACG (uuid=202a64657a9640158650458bd21fcf50,timestamp=1475084804,QueuePartition=rdb1-6379) t=30607 error=DBError: Timed out waiting on commit queue.
There's code in there that tries to serialize the commits of all jobs that took longer than 0.1s, and apparently just rolls the job back without rescheduling if it can't grab the serialization lock within 30 seconds.
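To illustrate the behaviour described above, here is a rough Python sketch of a commit queue that serializes slow commits behind a lock and rolls back on timeout. It is not the MediaWiki job runner code: the names commit_serialized and CommitQueueTimeout are made up for this example, and the 0.1 s threshold and 30 s timeout are simply the figures quoted in this comment.

```python
import threading


class CommitQueueTimeout(Exception):
    """Hypothetical error type standing in for the DBError seen in the log."""


def commit_serialized(lock, write_seconds, commit, rollback,
                      threshold=0.1, timeout=30.0):
    """Commit a job's writes, serializing slow commits behind `lock`.

    Assumed behaviour (a sketch, not the real implementation):
    - writes at or under `threshold` seconds commit immediately;
    - slower writes wait up to `timeout` seconds for the lock;
    - on timeout the writes are rolled back and NOT rescheduled.
    """
    if write_seconds <= threshold:
        commit()
        return "committed"
    if not lock.acquire(timeout=timeout):
        rollback()  # the job's changes are lost at this point
        raise CommitQueueTimeout("Timed out waiting on commit queue.")
    try:
        commit()
    finally:
        lock.release()
    return "committed-serialized"
```

With 426ms of writes the job falls into the serialized path, and if another job holds the lock for the full 30 seconds the rename is rolled back while its status row still says inprogress, which would match the t=30607 error in the log.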
@aaron might be able to tell us more about the "COMMIT ENQUEUED" / "Timed out waiting on commit queue" situation.
After poring over the rename code, I didn't see any way in which re-running a halfway aborted rename could cause problems, so I just changed the status to failed and re-ran the script. The user account should be fixed now.
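For anyone wondering why flipping the status to failed made the re-run work, here is a minimal Python sketch of the kind of status gate involved. It is an assumption-laden illustration, not the CentralAuth code: should_run_rename and RUNNABLE_STATUSES are hypothetical names, the set of runnable statuses is my guess based on the job skipping inprogress rows, and the ignorestatus flag is modelled only on the option named in the patches on this task.

```python
# Assumption: only these statuses let the local rename job proceed.
RUNNABLE_STATUSES = {"queued", "failed"}


def should_run_rename(status, ignorestatus=False):
    """Return True if a local rename job would run for this status row.

    `status` is the ru_status value for the wiki (e.g. "inprogress").
    `ignorestatus` is a hypothetical override mirroring the option
    added for fixing stuck renames.
    """
    return ignorestatus or status in RUNNABLE_STATUSES


# A row stuck at "inprogress" is skipped, so re-running the script does nothing.
print(should_run_rename("inprogress"))
# After manually changing the status to "failed", the job runs again.
print(should_run_rename("failed"))
# With the override, the manual status change would not be needed.
print(should_run_rename("inprogress", ignorestatus=True))
```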
Possible follow-ups:
- fix the underlying error
- fixStuckGlobalRename.php should treat inprogress status as failed
- the rename log channel should go somewhere
- "https://meta.wikimedia.org/wiki/Special:CentralAuth/Neurax%C4%B1s doesn't display any message about a rename being in progress" - should that be changed? (there is some exception-handling code in Special:CentralAuth which did that, but that's not called anymore since CentralAuthUser stopped throwing exceptions on unattached accounts)
Change 314218 had a related patch set uploaded (by Gergő Tisza):
Add ignorestatus option for fixing stuck renames
Change 314219 had a related patch set uploaded (by Gergő Tisza):
Set failed rename queue status on late errors
Change 314218 merged by jenkins-bot:
Add ignorestatus option for fixing stuck renames
First three are done, I'll call this fixed. If someone feels strongly about the last, feel free to file a task and assign it to me.
Change 315364 had a related patch set uploaded (by Gergő Tisza):
Add ignorestatus option for fixing stuck renames
Change 315364 merged by jenkins-bot:
Add ignorestatus option for fixing stuck renames
Mentioned in SAL (#wikimedia-operations) [2016-10-11T23:41:45Z] <ebernhardson@mira> Synchronized php-1.28.0-wmf.21/extensions/CentralAuth/: SWAT T147029 Add ignorestatus option for fixing stuck renames (duration: 00m 53s)