Cannot log into Gerrit as of recent upgrade
Closed, Resolved (Public)

Description

IMPORTANT: If you're having trouble logging in and are receiving error messages similar to 'Cannot assign user name "erik" to account 4136; name already in use.' then please read the following workarounds.
If you are entering your username in the login form in lowercase (e.g. user), please try it with an upper-case first letter (User). If you are entering upper-case User, please try lowercase user. This won't work for all users, but it is our current workaround.

Update (after 27 July 2017)

IMPORTANT: If a user is getting 'Cannot assign user name "erik" to account 4136; name already in use.' then the gerrit: prefix is missing from their entry in the external id table. It has to be re-added and an online reindex performed.

I successfully used Gerrit yesterday, but as of the upgrade today, I can no longer log in (username 'eloquence'). I get the following error message:

'Cannot assign user name "erik" to account 4136; name already in use.'


Upstream resources:


This is currently affecting about 11 users:

gerrit> select account_id, full_name from accounts where account_id in (2964,3555,20,2394,4111,3327,790,2239,1984,278);
 account_id | full_name
 -----------+---------------
 20         | Eloquence
 278        | Parent5446
 790        | Rasel160
 1984       | Papaul
 2239       | Kaldari2
 2394       | Xujing1
 2964       | StudiesWorld
 3327       | SamanthaNguyen
 3555       | NULL
 4111       | Pppery
(10 rows; 3 ms)

What's happening is that these users are lacking the second row in account_external_ids that they should have to map to their LDAP username. Re-adding the missing second row does not stick: it is deleted when the user tries to log in (taking them back to one row). This is being tracked upstream, as there seems to have been a busted migration script (although we're past that) as well as an underlying issue with existing username detection (it's trying to recreate them).

If you are hitting this, perhaps trying a different capitalization works (for some people, at least). For example, if your username on wikitech is JUser, try logging in with JUser exactly and not juser. This is not working for everyone, but it is for some people.

Event Timeline

Change 326150 merged by Dzahn:
[operations/puppet] Gerrit: Enable config localUsernameToLowerCase

https://gerrit.wikimedia.org/r/326150

Ok, not only is this issue fixed, but we've finally converted to case-insensitive logins so the post-outage confusion won't happen again.
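As a rough illustration of why the capitalization workaround stops being necessary once usernames are lowercased, here is a minimal sketch. It is not Gerrit's code: the function name `local_external_id` is hypothetical, and the only things taken from this task are the `gerrit:` prefix and the idea behind the `localUsernameToLowerCase` setting.

```python
def local_external_id(login_name: str, to_lower_case: bool) -> str:
    """Build the 'gerrit:' external ID key for a login name (hypothetical helper).

    With to_lower_case=True the name is lowercased first, so every spelling
    of the same username maps to a single key.
    """
    name = login_name.lower() if to_lower_case else login_name
    return "gerrit:" + name


if __name__ == "__main__":
    # Without lowercasing, the two spellings produce different keys.
    print(local_external_id("JUser", False), local_external_id("juser", False))
    # With lowercasing, both spellings produce the same key.
    print(local_external_id("JUser", True), local_external_id("juser", True))
```

Under this assumption, "JUser" and "juser" only collide or mismatch when the lowercasing step is skipped, which matches the behaviour people reported before the config change.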

Mentioned in SAL (#wikimedia-operations) [2017-06-07T00:21:11Z] <RainbowSprinkles> gerrit: rolled back to 2.13.4-13-gc0c5cc4742 from 2.13.8. T152640 rearing its ugly head again (login issues)

demon added a subscriber: tstarling.

Ugh. This came back with 2.13.8. @tstarling was unable to log in. After logging out, I could not log in again either. Casing didn't help, nor did reindexing accounts.

Change 357524 had a related patch set uploaded (by Chad; owner: Chad):
[operations/debs/gerrit@master] gerrit (2.13.8 git1-wmf.5) jessie-wikimedia; urgency=medium

https://gerrit.wikimedia.org/r/357524

Change 357524 merged by Dzahn:
[operations/debs/gerrit@master] gerrit (2.13.8 git1-wmf.5) jessie-wikimedia; urgency=medium

https://gerrit.wikimedia.org/r/357524

This is a blocker for gerrit 2.14 as well. It is apparently also happening on gerrit 2.14, per Luca's comments at https://gerrit-review.googlesource.com/#/c/92830/ (where I wrote that the patch needs to be reopened and merged).

This task should stay open until they have fixed the bug for all the releases, otherwise we will hit this again.

Actually this doesn't block 2.14. In 2.14 you can reindex, which fixes it for them. Our problem is different from Luca's. @demon could you dig into why this is happening, please?

Mentioned in SAL (#wikimedia-cloud) [2017-06-07T18:06:27Z] <paladox> gerrit-test3 saving backup of gerrit 2.14 and downgrading to 2.13.8 to try and reproduce T152640

Mentioned in SAL (#wikimedia-cloud) [2017-06-09T16:48:39Z] <paladox> moving gerrit-test3 back to gerrit 2.14.1 for testing after finding a possible fix for T152640

Luca found the leading cause of this. We did not do the full reindex properly, so the account index was never created. Another user did a full reindex, which fixed it for them. See https://groups.google.com/forum/#!topic/repo-discuss/_5iJcIsIa2Y

Ok, I guess we can reindex....again.

Paladox raised the priority of this task from High to Unbreak Now!. Jun 23 2017, 5:57 PM

Setting unbreak as @MarcoAurelio can't access his account. Please lower it if you think I've set the priority wrong.

greg lowered the priority of this task from Unbreak Now! to High. Jun 23 2017, 6:00 PM
greg subscribed.

As far as we know it's only one user, right? In that case, not UBN. Doesn't mean it's not important/being looked into. This just isn't a major incident.

It's not urgent indeed. I'll be around IRC for the next hour or two, so if you need me to try something poke me :) Thanks.

@demon would you be able to have a look in the table to see if it's like the others, please? That way I can bring the findings upstream so they can investigate further.

https://groups.google.com/forum/#!topic/repo-discuss/_5iJcIsIa2Y

gerrit> select * from accounts where full_name = 'MarcoAurelio';
 registered_on         | full_name    | preferred_email    | inactive | account_id
 ----------------------+--------------+--------------------+----------+-----------
 2016-07-25 01:42:27.0 | MarcoAurelio | strigiwm@gmail.com | N        | 787
(1 row; 8 ms)

gerrit> select * from account_external_ids where account_id = 787;
 account_id | email_address | password | external_id
 -----------+---------------+----------+------------------
 787        | NULL          | NULL     | username:maurelio
(1 row; 1 ms)

Looks like we're missing an entry like before :\

@demon thank you for doing that :)

I will share your findings with upstream; hopefully they will manage to fix it.

Anything I have to do on my part or just keep waiting? :) Regards.

@demon and @Paladox -- It'd be strange if I were the only one affected by this. Maybe you should scan the database and see how many entries are missing the same values? Thanks.

It never affected more than a few people at once and an underlying cause of "why me but not this other person" has never been figured out. So no, not so strange :\

Ok, got @MarcoAurelio logging in again (yay). Hopefully that's the last busted entry until we upgrade to the 2.14.x series and the bug goes away for good.

demon lowered the priority of this task from High to Medium. Jun 29 2017, 5:46 PM

Not going to close just yet because it might easily happen again until we upgrade to 2.14.2+, but lowering priority because immediate issue is fixed.

From T169996

Cannot assign user name "reception123" to account 5067; name already in use.

Which is still happening today:

Cannot assign user name "reception123" to account 5117; name already in use.
Cannot assign user name "reception123" to account 5118; name already in use.

For the record, I've tried both upper-case and lower-case and I get the exact same error.

Cannot assign user name "reception123" to account 5120; name already in use.
Cannot assign user name "reception123" to account 5122; name already in use.

It’s because he is missing a record from the external id table.

select * from accounts where full_name = '<name>';

Then look in the account_id column

Then do

select * from account_external_ids where account_id = <account_id>;

There should be two rows there.
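The two lookups above can be combined into a single scan for every account that has no `gerrit:` entry. The following is only a sketch against an in-memory SQLite database: the table and column names are taken from the query output in this task, the schema is trimmed to the columns needed, and account 1 (`ExampleUser`) is a made-up healthy account added for contrast. The real Gerrit database was not SQLite, so treat the query as a starting point rather than something verified against production.

```python
import sqlite3


def accounts_missing_gerrit_id(conn):
    """Return (account_id, full_name) for accounts with no 'gerrit:' external ID."""
    query = """
        SELECT a.account_id, a.full_name
        FROM accounts AS a
        WHERE NOT EXISTS (
            SELECT 1
            FROM account_external_ids AS e
            WHERE e.account_id = a.account_id
              AND e.external_id LIKE 'gerrit:%'
        )
        ORDER BY a.account_id
    """
    return conn.execute(query).fetchall()


def build_demo_db():
    """Create a tiny demo database (simplified, assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, full_name TEXT);
        CREATE TABLE account_external_ids (account_id INTEGER, external_id TEXT);

        -- Account 787 mirrors the broken state shown in this task:
        -- only the username: row exists.
        INSERT INTO accounts VALUES (787, 'MarcoAurelio');
        INSERT INTO account_external_ids VALUES (787, 'username:maurelio');

        -- Hypothetical healthy account with both rows.
        INSERT INTO accounts VALUES (1, 'ExampleUser');
        INSERT INTO account_external_ids VALUES (1, 'username:exampleuser');
        INSERT INTO account_external_ids VALUES (1, 'gerrit:exampleuser');
    """)
    return conn


if __name__ == "__main__":
    for account_id, full_name in accounts_missing_gerrit_id(build_demo_db()):
        print(account_id, full_name)
```

Because the check looks for any row starting with `gerrit:`, it flags both accounts whose second row is missing entirely and accounts whose entry lost its prefix.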

<snip>
Logging in with an upper-case first letter seems to generate the following in the Gerrit error log:

Cannot assign user name "addshore" to account 4238; name already in use.
Cannot assign user name "addshore" to account 4239; name already in use.

Sounds like a nasty case normalization issue. I remember it hitting me before with labels and Zuul. In Gerrit, database.url has connectionCollation=utf8_unicode_ci; or are searches case insensitive?

The bug for Zuul was T106596, where a gerrit review command with --label Verified=-1 worked but --label verified=-1 caused a key issue. At the time I suspected:

Our Gerrit MySQL Connection has: connectionCollation=utf8_unicode_ci where ci stands for case insensitive.

Then it is solely for collation, so I am not sure whether it is related to the account case mismatch issue.

@hashar this issue is related to a missing entry in the external_id table. Re-adding the missing entry should fix the user's problem.

Reception123 is not missing from the database.

For the record, we managed to fix Reception123's account. Its entry in the external id table was missing the gerrit: prefix.

@demon My account worked for a while, but I just tried to log in and got the same error.

After I got logged out and tried to log in again, the issue is back :( Probably the gerrit: prefix again.

Added again, flushed caches. (Sorry for delay, was on vacation)

If the issue is not fixed in 2.14, there is a higher chance 2.15 will include the fix, as upstream merged a patch that tracks whether the account in the index is stale. Also, gerrit 3.x is dropping reviewdb, so there's an even higher chance it will be fixed once reviewdb is removed, as they then won't need to keep the data in two places. 2.13 was the start, 2.14 and 2.15 are the middle, and 2.15 migrates all accounts to notedb.

A change was merged in 2.13 to hopefully fix this problem by looking at the db instead of the index.

See also https://groups.google.com/forum/m/#!topic/repo-discuss/rP3DdKXxHbI

This problem may not have been fully fixed in 2.14, but upstream expects a lot of these issues to be fixed in 2.16 or 3.0.

*eyeroll*

It'll be fixed when they stop putting canonical data in a secondary index.

Should be all better now (we hope).

Kipcool subscribed.

Hi, it seems that I am also affected by the same issue:

  • Cannot assign user name "kipcool" to account 6839; name already in use.

I have not logged in for several months, so maybe there is something new to do that I don't know, or maybe my account has been buggy for quite some time. Have tried "kipcool" or "Kipcool", same result.

I can log in fine at wikitech. Only gerrit won't let me in.

@Kipcool this task is unrelated. That would be T197083 instead. It is better to just file another task, which I did as T216605.