Special:GlobalRenameProgress/MB is stuck
Closed, Resolved, Public

Description

https://meta.wikimedia.org/wiki/Special:GlobalRenameProgress/MB has been stuck for nearly four days now. The rename on enwiki is stuck at "In progress" but doesn't seem to actually be progressing. All other wikis have been renamed successfully.

Event Timeline

MarcoAurelio triaged this task as Unbreak Now! priority. Aug 2 2016, 7:06 AM

All three job retry attempts failed with a lock wait timeout: twice when changing usernames in page revisions, and once on SELECT wl_user, wl_notificationtimestamp FROM watchlist WHERE wl_namespace = '2' AND wl_title = 'Mb66w' FOR UPDATE. How on Earth can you get lock contention on that?

No idea. However, it'd be good if in these cases the GlobalRenameProgress table showed a "failed" status instead of "in progress", which ain't true when the job has stopped being retried. Thanks.

Failed jobs are retried by the job queue; I'm not sure if there is a way for a job to tell that it has failed for the last time. Maybe renameuser_status could get a failure count column?
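To make the failure count idea concrete, here is a minimal sketch in Python using an in-memory SQLite table. Everything in it is an assumption for illustration: the table is a simplified stand-in for renameuser_status, the ru_failures column and the record_failure helper are hypothetical, and MAX_ATTEMPTS = 3 just mirrors the three attempts mentioned above. It is not how CentralAuth or the job queue actually works.

```python
import sqlite3

# Assumed retry limit, mirroring the three attempts mentioned in this task.
MAX_ATTEMPTS = 3


def make_db():
    """Create a simplified, hypothetical stand-in for renameuser_status."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE renameuser_status ("
        " ru_oldname TEXT NOT NULL,"
        " ru_wiki TEXT NOT NULL,"
        " ru_status TEXT NOT NULL,"
        " ru_failures INTEGER NOT NULL DEFAULT 0,"  # hypothetical new column
        " PRIMARY KEY (ru_oldname, ru_wiki))"
    )
    return conn


def record_failure(conn, oldname, wiki):
    """Count one failed attempt; mark the row 'failed' once the limit is hit."""
    conn.execute(
        "UPDATE renameuser_status SET ru_failures = ru_failures + 1"
        " WHERE ru_oldname = ? AND ru_wiki = ?",
        (oldname, wiki),
    )
    conn.execute(
        "UPDATE renameuser_status SET ru_status = 'failed'"
        " WHERE ru_oldname = ? AND ru_wiki = ? AND ru_failures >= ?",
        (oldname, wiki, MAX_ATTEMPTS),
    )
    conn.commit()
    return conn.execute(
        "SELECT ru_status, ru_failures FROM renameuser_status"
        " WHERE ru_oldname = ? AND ru_wiki = ?",
        (oldname, wiki),
    ).fetchone()
```

With something like this, the progress page could read ru_status directly and show "failed" after the last attempt, without the job itself needing to know whether it will be retried again.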

@aaron, any thoughts? Global rename cannot be made atomic (as it consists of a bunch of jobs running on a bunch of different wikis). Can we increase the lock timeout somehow?

Maybe we can change the puppet config to bump the retry count for rename-related jobs? E.g. in modules/mediawiki/templates/jobrunner/jobrunner.conf.erb.
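As a rough illustration of what a higher retry count for one job type buys you, here is a small Python sketch. It is not the jobrunner's config format or code: the RETRY_LIMITS values, the "rename" job type name, the LockWaitTimeout exception and the backoff delay are all hypothetical.

```python
import time


class LockWaitTimeout(Exception):
    """Hypothetical stand-in for a database lock wait timeout error."""


# Hypothetical per-job-type retry limits; names and numbers are illustrative.
RETRY_LIMITS = {"default": 3, "rename": 6}


def run_with_retries(job, job_type, sleep=time.sleep, base_delay=1.0):
    """Run job(), retrying on lock wait timeouts up to the job type's limit."""
    limit = RETRY_LIMITS.get(job_type, RETRY_LIMITS["default"])
    for attempt in range(1, limit + 1):
        try:
            return job()
        except LockWaitTimeout:
            if attempt == limit:
                # Out of attempts: let the caller see the failure.
                raise
            # Back off exponentially so the competing lock can clear.
            sleep(base_delay * 2 ** (attempt - 1))
```

A job that hits lock contention four times in a row would give up under the default limit of 3 but complete under the higher rename limit, which is the whole point of the proposed change.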

Change 302650 had a related patch set uploaded (by Gergő Tisza):
Increase retries for rename jobs

https://gerrit.wikimedia.org/r/302650

This task is set as UBN! Is that correct @Tgr / @aaron ? If so, will the above patch from @Tgr fix the issue and what is the timeline for deploying?

Tgr lowered the priority of this task from Unbreak Now! to Medium. Aug 3 2016, 11:26 PM

Nope, a single case of failed rename in two weeks is not an emergency. This is a puppet patch so I can get it deployed in tomorrow's SWAT (or someone can just go ahead and merge it, I guess). I'm working on fixing the affected account.

The global rename seemed to be finished so I reattached the enwiki account and deleted the status row. MB should be OK now.

MarcoAurelio assigned this task to Tgr.

I'm closing this since the issue with this rename is resolved and the task is specific to it. Should you disagree, feel free to reopen :)

Change 302650 merged by Filippo Giunchedi:
Increase retries for rename jobs

https://gerrit.wikimedia.org/r/302650