Can't login into Gerrit with a Wikimedia Developer account with non-unique email address
Open, Needs Triage, Public

Description

MediaWiki does not require email addresses to be unique, but Gerrit apparently does. If you create a Wikitech account that uses the same email address as another Wikitech account of yours and then try to log in to Gerrit with those credentials, you get a very unhelpful "Authentication failed" error. Recent example: T270064.

Event Timeline

We should probably reject such registrations on Wikitech, where it is easy to provide a useful error message.

Peachey88 renamed this task from "Can't login into Gerrit with a Wikitech account with non-unique password" to "Can't login into Gerrit with a Wikitech account with non-unique email address". Dec 16 2020, 7:57 AM
Peachey88 updated the task description.

You would also need to enforce this in Striker, and any future replacement for Wikitech & Striker for Developer account provisioning (T179463).
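
A registration-time check along these lines could look like the following minimal sketch. It is written in Python for illustration only and is not how Wikitech or Striker implement account creation; `DuplicateEmailError`, `check_email_available`, and the case-insensitive comparison are all assumptions made for the example.

```python
class DuplicateEmailError(ValueError):
    """Hypothetical error raised when a new account would reuse an email."""


def check_email_available(email, existing_emails):
    """Return the normalized email, or raise with a descriptive message.

    Assumption: addresses are compared case-insensitively after trimming
    whitespace. The real systems may compare them differently.
    """
    normalized = email.strip().lower()
    taken = {e.strip().lower() for e in existing_emails}
    if normalized in taken:
        raise DuplicateEmailError(
            f"The email address {email!r} is already used by another "
            "Developer account. Gerrit requires a unique email address per "
            "account, so please register with a different address."
        )
    return normalized


if __name__ == "__main__":
    registered = ["jane@example.org"]
    print(check_email_available("sam@example.org", registered))
    try:
        check_email_available("Jane@Example.org", registered)
    except DuplicateEmailError as err:
        print(err)
```

The point of the sketch is the error text: the user learns at registration time why the address is rejected, instead of hitting a generic failure later in Gerrit.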

bd808 renamed this task from "Can't login into Gerrit with a Wikitech account with non-unique email address" to "Can't login into Gerrit with a Wikimedia Developer account with non-unique email address". Dec 16 2020, 5:36 PM

Thanks for fixing the task description; clearly I was half asleep.

@bd808 is a Wikitech-only fix worth doing separately, or should it be fixed everywhere at the same time? I can provide a patch for MediaWiki, but I have no idea how to do it in Striker.
(I wonder if it would make sense to centralize things by having Striker use action=createaccount?)

I'm not sure that preventing duplicate emails in the LDAP directory is actually valuable at all. It seems more like an implementation quirk of the external account linkage in Gerrit that it is using a non-unique lookup token than a bug in the Developer account system itself. The backing LDAP directory enforces unique values for uid (shell user name) and cn (Wikitech user name). It really feels like Gerrit linkage should be based on one or the other of these and not mail which is non-unique.

Searching the LDAP directory for duplicate emails is a bit annoying, but it is pretty easy to check Wikitech's db of attached Developer accounts for duplicate emails. At the moment, select count(*) as dups, user_email from user group by user_email having dups > 1 order by dups asc; returns 912 rows with duplication counts ranging from 2 to 250. The vast majority of these have a duplicate count of 2.
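
The duplicate check can be tried out against a throwaway database. The following is a minimal sketch, assuming a simplified user table with only user_name and user_email columns and using SQLite instead of Wikitech's actual database; the sample rows are made up, and the HAVING clause repeats COUNT(*) rather than the alias for portability.

```python
import sqlite3


def find_duplicate_emails(conn):
    """Return (dups, user_email) rows for emails shared by several accounts."""
    query = """
        SELECT COUNT(*) AS dups, user_email
        FROM user
        GROUP BY user_email
        HAVING COUNT(*) > 1
        ORDER BY dups ASC
    """
    return conn.execute(query).fetchall()


def make_sample_db():
    """Build an in-memory database with made-up accounts.

    Assumption: the schema is reduced to two columns for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user (user_name TEXT PRIMARY KEY, user_email TEXT)")
    conn.executemany(
        "INSERT INTO user VALUES (?, ?)",
        [
            ("Jane", "jane@example.org"),
            ("JaneBot", "jane@example.org"),
            ("Sam", "sam@example.org"),
            ("Alex", "alex@example.org"),
            ("AlexBot", "alex@example.org"),
            ("AlexTest", "alex@example.org"),
        ],
    )
    return conn


if __name__ == "__main__":
    for dups, email in find_duplicate_emails(make_sample_db()):
        print(dups, email)
```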

If we block email duplication going forward, do we also need some kind of historic cleanup? What do we do for folks who want/need bot accounts that they do not intend to use with Gerrit?

Gerrit does the mapping between its internal account and the LDAP account using the LDAP cn field normalized to lower case. The email is not involved there.

However, since Gerrit 2.16, uniqueness of emails across accounts is enforced. Apparently duplicates caused trouble with some external authentication systems that might use an email as the ID key: https://gerrit-review.googlesource.com/c/gerrit/+/169970. The commit states that duplicate emails would not cause any trouble when the external ID is not an email (such as LDAP with cn), but there is no feature flag to disable the uniqueness enforcement.

For the wiki bot accounts, I guess one can use an email alias by appending to their mailbox name an extra string prefixed by + (e.g. mail to jane+wikitechbot@example.org gets delivered to jane's mailbox). Not all email providers support that, but the large majority probably do (Gmail definitely does). That might be a good enough workaround.
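
As an illustration of the + alias workaround, here is a minimal sketch of a helper that derives such an address. The function name `plus_alias` and its validation rules are assumptions made for this example; whether the alias is actually delivered depends on the email provider.

```python
def plus_alias(address, tag):
    """Return local+tag@domain for a mailbox address.

    Hypothetical helper: it only does basic sanity checks and does not
    verify that the provider supports plus-addressing.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a valid email address: {address!r}")
    if not tag or "@" in tag or any(ch.isspace() for ch in tag):
        raise ValueError(f"not a usable alias tag: {tag!r}")
    return f"{local}+{tag}@{domain}"


if __name__ == "__main__":
    print(plus_alias("jane@example.org", "wikitechbot"))
```

Each bot account then gets its own distinct address while mail still reaches the same mailbox, provided the provider supports it.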

Sure, requiring unique email addresses is not a problematic limitation. Having to figure out from a completely generic error message what is going on is a waste of time though.