Fully populate local_user_id and global_user_id fields in production
Closed, Resolved | Public | 3 Story Points

Description

Run the maintenance script from T142503 (Create a maintenance script for populating the local_user_id and global_user_id fields in the centralauth localuser table) in production.

Make sure to list this on the deployment calendar and do a !log command in IRC when it's run.

We'll need to tell the stewards and global renamers to not rename users while this script is running. Let's take a 24 hour window for this and if it takes longer than 24 hours, we'll stop it and resume later.

@Johan: Have we alerted the stewards and global renamers about this yet?

I'm going to assume we haven't. @Niharika: Can you pick a 24 hour block some time this week and tell Johan to warn the stewards and global renamers not to rename anyone during that time? I'm going to leave this up to you and Johan to coordinate.

@kaldari I am fairly sure this script will run at least 3-4 days. I'm thinking of asking for a week long period for stewards and renamers to not rename to be on the safe side.

@Niharika: Sounds reasonable.

Added to deployments calendar: https://wikitech.wikimedia.org/wiki/Deployments#Week_of_November_14th

@Johan, I'm thinking of blocking the entire next week (midnight of 20th to midnight of 27th November, UST timings) for no renaming. What do you think?

Johan added a comment. Nov 16 2016, 9:39 AM

I'll talk to them.

This comment was removed by Johan.

@Johan, yes, UTC, sorry (typo).

And just to be very clear: with "midnight of 20th to midnight of 27th November", do you mean 00:00 both these days (start of November 20 to start of November 27) or something else?

Yep, I mean that. 7 whole days.
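
For reference, the agreed window can be checked with a few lines of Python. This is only a sketch of the arithmetic: the names `WINDOW_START`, `WINDOW_END`, `window_days` and `in_window` are made up for illustration and are not part of any tool mentioned in this task. It treats the window as half-open, so 00:00 on November 27 is already outside it.

```python
from datetime import datetime, timezone

# Window as described above: 00:00 UTC on November 20 to 00:00 UTC on November 27.
WINDOW_START = datetime(2016, 11, 20, 0, 0, tzinfo=timezone.utc)
WINDOW_END = datetime(2016, 11, 27, 0, 0, tzinfo=timezone.utc)


def window_days():
    """Length of the no-rename window in whole days."""
    return (WINDOW_END - WINDOW_START).days


def in_window(moment):
    """True if `moment` falls inside the half-open interval [start, end)."""
    return WINDOW_START <= moment < WINDOW_END


print(window_days())  # 7
```

A rename logged at, say, 12:00 UTC on November 23 would return True from `in_window`, while anything at or after 00:00 UTC on November 27 would return False.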

Johan added a comment. Edited Nov 16 2016, 12:52 PM

Can we temporarily inactivate the global renaming feature?

I am not sure. There's a config flag for turning off requests but I couldn't find anything that disables renaming itself. @Legoktm @bd808 - would you know if there's a flag for disabling account renaming?

@Johan: It seems just informing them about it is good enough. Disabling the feature is not worth the hassle.

Savh added a subscriber: Savh. Nov 17 2016, 1:41 PM

Depending on the risk of having a rename in this timespan, an easier form of disabling it is probably by removing the "centralauth-rename" right from the global-renamers and stewards user groups on meta temporarily.

Like Savh said, just revoke the centralauth-rename right from the meta group, and the stewards global group.

Also, could someone explain why we need to turn off global rename during this time period? (Or a link in case I missed it)

K6ka added a subscriber: K6ka. Nov 17 2016, 10:05 PM

Couldn't we just edit https://meta.wikimedia.org/wiki/Template:Globalrenamequeue with a bright red notice to warn renamers not to rename? Someone is going to be seriously trouted if they somehow miss it.

@K6ka: That sounds like a reasonable idea. I don't think there's actually a lot of danger from people being renamed during the script run. I *think* the worst-case scenario is someone being renamed between the select query and the update query in the maintenance script, in which case they just wouldn't get the local id and global id fields populated. But we could also run the maintenance script twice to patch any holes (it only selects users that are missing the data). The hold on renames is mostly just to be extra cautious and avoid any problems.
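
To illustrate why a second run can patch holes, here is a minimal Python sketch of a backfill that only selects rows still missing data. It is not the actual maintenance script (which is PHP and lives in the CentralAuth extension). The schema, the table and column names, and the functions `make_demo_db`, `lookup_ids` and `backfill` are all assumptions invented for this example, with SQLite standing in for the production databases.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory stand-in for the tables involved (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE localuser (name TEXT, wiki TEXT,
                                local_user_id INTEGER, global_user_id INTEGER);
        CREATE TABLE wiki_user (wiki TEXT, name TEXT, user_id INTEGER);
        CREATE TABLE globaluser (name TEXT, id INTEGER);
    """)
    return conn


def lookup_ids(conn, name, wiki):
    """Return (local_id, global_id) for a user, or None if the name no longer matches."""
    return conn.execute(
        "SELECT w.user_id, g.id FROM wiki_user w "
        "JOIN globaluser g ON g.name = w.name "
        "WHERE w.wiki = ? AND w.name = ?",
        (wiki, name),
    ).fetchone()


def backfill(conn, batch_size=100):
    """Populate missing ids in batches; return how many rows were updated."""
    updated = 0
    last_rowid = 0
    while True:
        # Only rows that are still missing data are selected, so reruns are safe.
        rows = conn.execute(
            "SELECT rowid, name, wiki FROM localuser "
            "WHERE rowid > ? "
            "AND (local_user_id IS NULL OR global_user_id IS NULL) "
            "ORDER BY rowid LIMIT ?",
            (last_rowid, batch_size),
        ).fetchall()
        if not rows:
            return updated
        for rowid, name, wiki in rows:
            last_rowid = rowid
            ids = lookup_ids(conn, name, wiki)
            if ids is None:
                # E.g. the user was renamed mid-run: leave the hole for a later run.
                continue
            cur = conn.execute(
                "UPDATE localuser SET local_user_id = ?, global_user_id = ? "
                "WHERE rowid = ? AND name = ?",
                (ids[0], ids[1], rowid, name),
            )
            updated += cur.rowcount
        conn.commit()
```

In this sketch, a row whose ids cannot be resolved is skipped and stays NULL, so calling `backfill(conn)` again once the data is consistent fills it in, while rows that are already populated are never touched.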

Okay, it's running on prod now. Latest update:

Updated 100 records. Last user: 15491729; Wiki: angwiki
Niharika claimed this task.
K6ka added a comment. Nov 20 2016, 12:44 PM

A few users renamed past the 00:00 November 20 cutoff date.

11:26, 20 November 2016 Steinsplitter (talk | contribs) globally renamed Enviroman~frwiki to Jonathan duchesne (per request)
09:05, 20 November 2016 Steinsplitter (talk | contribs) globally renamed 千野聖広 to Pkdhefy (per request)
06:54, 20 November 2016 Mohamed Ouda (talk | contribs) globally renamed عراف الجبل to يوسف عاشق الجبل (Per :ar:وب:تام)
06:53, 20 November 2016 Mohamed Ouda (talk | contribs) globally renamed Bnikhald to ــــا (Per :ar:وب:تام)
06:45, 20 November 2016 Mohamed Ouda (talk | contribs) globally renamed AsadGhali1110 to AsadGhali0001 (Per :ar:وب:تام)
06:42, 20 November 2016 Mohamed Ouda (talk | contribs) globally renamed Zahraa.k.hameed to AsadGhali2200 (Per :ar:وب:تام)
00:26, 20 November 2016 Jmvkrecords (talk | contribs) globally renamed Manuel Duro Lopes to Manuel Paulino Duro Lopes (per request)

Not sure if this broke any accounts, but it would probably seem that having a big red box wasn't enough to stop people from pressing the button.

Steinsplitter added a subscriber: Steinsplitter. Edited Nov 20 2016, 12:45 PM

There was no warning box at all at Global rename queue, etc.

Edit: there was no warning because it hadn't been marked for translation. Thus it was only on the main template and not on the subpages, such as /en.

Okay, can we mark it for translation now? I don't know how to do that. @Steinsplitter, @K6ka - anyone?

I did it.

Curiously, the script flew through the first few wikis but when populating records for arwiki, it was noticeably slow (maybe because arwiki is pretty big?). The script terminated abruptly at the end of it. This was the error:

Updated 46 records. Last user: 48415876; Wiki: arwiki
Set $wgShowExceptionDetails = true; in LocalSettings.php to show detailed debugging information.

I've restarted the script now and it picked up from where it left off.

Also, I'm surprised $wgShowExceptionDetails isn't true by default on prod.

$wgShowExceptionDetails is not a production setting; it is a local debugging setting. It adds backtrace information to the fatal error page. We should probably trace through the maintenance script and figure out what actually should be logged and how to give information about that on the console.

Change 322606 had a related patch set uploaded (by Dereckson):
Disable centralauth-rename right for maintenance

https://gerrit.wikimedia.org/r/322606

Reedy added a subscriber: Reedy. Nov 21 2016, 1:35 AM

Also, I'm surprised $wgShowExceptionDetails isn't true by default on prod.

It's broken -- T148957

We should probably trace through the maintenance script and figure out what actually should be logged and how to give information about that on the console.

Hmm, the script is pretty straightforward. I'm guessing the error could be caused by the select or update or getConnection calls failing. Not sure we can do something about those either way.
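
One generic way to get more information onto the console is to wrap each risky call so that a failure names the step before the exception propagates. The Python sketch below shows the idea only. The helper `run_step` and the step labels are hypothetical, and this is not how the PHP maintenance script is or was written.

```python
import sys
import traceback


def run_step(label, func, *args):
    """Run one step; on failure, print which step failed plus the traceback, then re-raise."""
    try:
        return func(*args)
    except Exception:
        print(f"Step failed: {label}", file=sys.stderr)
        traceback.print_exc()
        raise
```

Under these assumptions, each connection, select and update call would go through something like `run_step("select", do_select, wiki)`, so the console would show which of the three failed, even when detailed exception output is otherwise suppressed.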

It encountered the same error once more after completing bxrwiki. It's running fine again now.

Change 322606 merged by jenkins-bot:
Disable centralauth-rename right for maintenance

https://gerrit.wikimedia.org/r/322606

Mentioned in SAL (#wikimedia-operations) [2016-11-21T06:11:10Z] <legoktm@tin> Synchronized wmf-config/: Disable centralauth-rename right for maintenance (T148242, T151155) (duration: 00m 52s)

Like Savh said, just revoke the centralauth-rename right from the meta group, and the stewards global group.

I'm disappointed this wasn't done ahead of time.

FWIW I don't think it's safe to continue running maint scripts in production if they are throwing exceptions and we aren't able to debug them.

@Legoktm, I'm not sure the errors were caused by the script itself. It's running fine now and I'm keeping a close watch on it in any case. If it errors out again, I'll stop it.

K6ka added a comment. Nov 27 2016, 12:49 PM

Any updates on this? It was mentioned at T151155 that renames will be disabled for longer than the 27th.

Niharika added a comment. Edited Nov 27 2016, 12:52 PM

Hi @K6ka - the script is expected to run about 2-3 more days. It's covered a lot of the bigger wikis but still needs to populate values for the smaller ones. Sorry for the inconvenience caused.

The script ran up until labswiki and errored out. It won't restart. Here are the last few lines:

All users migrated; Wiki: kywikiquote 
All users migrated; Wiki: kywiktionary 
All users migrated; Wiki: labswiki 
Set $wgShowExceptionDetails = true; in LocalSettings.php to show detailed debugging information.

According to P5436 , Labswiki doesn't have any attached users, which might be a problem? That's the only anomaly with labswiki I could find.

bd808 added a comment. Nov 30 2016, 8:13 PM

labswiki is the horrible database name for wikitech which is not a SUL wiki. It also has its database on a server that can't be reached from terbium. It shouldn't be in your list of wikis to migrate.

Reedy added a comment. Dec 1 2016, 12:39 AM

Why is it even in $wgLocalDatabases ? This seems.. wrong

Reedy added a comment. Dec 1 2016, 12:39 AM

I guess, due to all.dblist? Bleugh

Careful about labtestwiki too
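
A simple way to keep such wikis out of a run is to filter the list of databases against an explicit exclusion set before iterating. The Python sketch below is illustrative only. The names `NON_SUL_WIKIS` and `wikis_to_migrate` are invented here, and the set contains just the two wikis named in this thread. How the real script builds its wiki list is not shown in this task.

```python
# Wikis named in this thread as ones to skip (assumed list, not exhaustive).
NON_SUL_WIKIS = frozenset({"labswiki", "labtestwiki"})


def wikis_to_migrate(all_wikis, excluded=NON_SUL_WIKIS):
    """Return the wikis to process, preserving order and dropping excluded ones."""
    return [wiki for wiki in all_wikis if wiki not in excluded]
```

For example, `wikis_to_migrate(["kywiktionary", "labswiki", "lawiki"])` would skip labswiki and continue with the next wiki in alphabetical order instead of erroring out on it.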

hashar added a subscriber: hashar. Dec 1 2016, 1:43 PM

https://gerrit.wikimedia.org/r/#/c/322667/ is meant to Re-enable 'centralauth-rename' rights for when maintenance is done. Has been added to European SWAT E381#3944. However it looks like the script hasn't completed, blocked on labswiki.

Masti added a subscriber: Masti. Dec 1 2016, 2:32 PM
APerson added a subscriber: APerson. Dec 5 2016, 4:49 PM
K6ka added a comment. Dec 6 2016, 3:55 AM

Any updates on this? The requests are starting to pile up, and it looks like we have quite the Christmas present this year! (Also, a number of users are querying me asking about the status of their requests)

@K6ka, @hashar: Since the script is suspended right now, we should re-enable the right in the meantime and let people clear the backlog. I'll put the config change on the SWAT schedule.

Change 322667 merged by jenkins-bot:
Re-enable 'centralauth-rename' rights for when maintenance is done

https://gerrit.wikimedia.org/r/322667

Mentioned in SAL (#wikimedia-operations) [2016-12-06T15:06:31Z] <zfilipin@tin> Synchronized wmf-config/CommonSettings.php: SWAT: [[gerrit:322667|Re-enable centralauth-rename rights for when maintenance is done (T148242 T151155)]] (duration: 00m 43s)

Mentioned in SAL (#wikimedia-operations) [2016-12-06T15:07:25Z] <zfilipin@tin> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:322667|Re-enable centralauth-rename rights for when maintenance is done (T148242 T151155)]] (duration: 00m 43s)

@K6ka: The rename right has been re-activated. Please let me know when the backlog is cleared as we may want to re-disable it for the 2nd script run.

K6ka added a comment. Dec 7 2016, 12:52 AM

@kaldari The backlog is mostly cleared now, still have some usurpation requests pending at m:SRUC that are scheduled to be processed within the next few days.

hashar removed a subscriber: hashar. Dec 7 2016, 1:31 PM

Mentioned in SAL (#wikimedia-operations) [2016-12-08T00:38:12Z] <dereckson@tin> Synchronized php-1.29.0-wmf.5/extensions/CentralAuth/maintenance/populateLocalAndGlobalIds.php: [[Gerrit:325733|Improve populateLocalAndGlobalIds maintenance script]] (T148242) (duration: 00m 46s)

Johan added a comment. Dec 8 2016, 6:13 PM

We'll start running the script again at 00:00 UTC (in about six hours), and will re-disable the renaming rights.

Change 325997 had a related patch set uploaded (by Kaldari):
Temporarily disable centralauth-rename right

https://gerrit.wikimedia.org/r/325997

K6ka added a comment. Dec 8 2016, 9:36 PM

Hopefully things go smoothly this time. Best of luck!

Change 325997 merged by jenkins-bot:
Temporarily disable centralauth-rename right

https://gerrit.wikimedia.org/r/325997

Johan moved this task from To Triage to In current Tech/News draft on the User-notice board.
Wasell added a subscriber: Wasell. Dec 9 2016, 8:41 AM

Script is still running. It's up to specieswiki now (going in alphabetical order).

kaldari closed this task as Resolved. Dec 15 2016, 6:09 PM

Done.

Mentioned in SAL (#wikimedia-operations) [2016-12-16T00:23:39Z] <dereckson@tin> Synchronized wmf-config/InitialiseSettings.php: Reenable centralauth-rename right (T148242) (duration: 00m 49s)

DannyH moved this task from Backlog to Archive on the Community-Tech board. Dec 20 2016, 9:38 PM