Log from meeting about RFC: Allow user login with email address in addition to username
ActivePublic
Actions

Authored by • RobLa-WMF on Apr 16 2016, 12:58 AM.

Tags

Referenced Files

	F3880784: Log from meeting about RFC: Allow user login with email address in addition to username
	Apr 16 2016, 12:58 AM

Subscribers

None

	00:01:23 <TimStarling> #startmeeting RFC meeting
	00:01:23 <wm-labs-meetbot> Meeting started Thu Oct 1 00:01:23 2015 UTC and is due to finish in 60 minutes. The chair is TimStarling. Information about MeetBot at http://wiki.debian.org/MeetBot.
	00:01:23 <wm-labs-meetbot> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
	00:01:23 <wm-labs-meetbot> The meeting name has been set to 'rfc_meeting'
	00:01:48 <robla> #link https://phabricator.wikimedia.org/E74
	00:02:15 <TimStarling> #topic Allow user login with email address in addition to username \| Please note: Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE) \| Logs: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/
	00:03:11 <robla> #link https://phabricator.wikimedia.org/T30085
	00:03:52 <TimStarling> devunt: how is the implementation going?
	00:04:31 <TimStarling> I see there's been no new patchsets in the last week
	00:04:53 <devunt> I was busy last week
	00:04:58 <devunt> codes in the gerrit is the latest version
	00:05:18 <TimStarling> fair enough
	00:05:21 * robla waves at devunt
	00:05:23 <TimStarling> csteipp: are you here?
	00:05:45 <csteipp> Yeah
	00:05:55 <Katie> TimStarling: Given that "@" is generally disallowed in usernames on MediaWiki wikis, what other issues are there from using a single field for both e-mail address or username?
	00:06:33 * robla didn't think to put it on csteipp's calendar
	00:06:36 <TimStarling> well, csteipp's concern with the original patchset was that attackers can use brute force to find valid email addresses
	00:06:52 <TimStarling> if we give feedback about the email address being associated with a user
	00:07:11 <TimStarling> that has been addressed, but it means losing some user feedback and thus usability
	00:07:37 <TimStarling> if the user types their email address into the box, and their password was wrong, we can't tell them that, we can only say "the email or password was wrong"
	00:08:15 <TimStarling> there is still the possibility of a timing attack, but maybe enough has been done for csteipp to remove his -2?
	00:08:54 <csteipp> Yeah, I need to review the current patchset (sorry, haven't gotten to it)
	00:09:19 <TimStarling> I've suggested some changes on gerrit and some more changes via private message
	00:09:29 <csteipp> We're only checking accounts where there's exactly 1 account with that email, and the email is confirmed, right?
	00:09:30 <bd808> not disclosing if the account exists or not is pretty standard on failed web logins I think. It shouldn't be a huge problem
	00:09:31 <TimStarling> and my changes will probably make the timing attack more severe
	00:09:32 <devunt> The current implementation does not have the possibility of a timing attack
	00:09:37 <devunt> unless I did something wrong
	00:09:46 <Katie> TimStarling: We're still gaining a lot of usability by offering login via e-mail address functionality.
	00:10:04 <Katie> Like I see what you're saying, but I don't think the vague error messages matter very much.
	00:10:16 <TimStarling> the question is: how do you log in with CentralAuth or some other auth plugin?
	00:10:56 <TimStarling> csteipp should be thinking "holy crap" right now if he remembers what the CA code looks like ;)
	00:11:01 <Katie> Are we keeping CentralAuth around?
	00:11:17 <Katie> Does devunt care about supporting it?
	00:11:19 <csteipp> Timing attack is someone can guess email address and (known wrong) password, and the check is slower if the email exists, right?
	00:11:29 <devunt> I'm going to make some edits on my code to make it works with CentralAuth
	00:11:34 <TimStarling> yes, we're keeping CA for now
	00:13:05 <devunt> It checks the existence of an email address and if it doesn't exists, just hash a dummy string with the same process as saving a new password
	00:13:23 <bd808> well....
	00:13:23 <bd808> you wait for AuthManager :)
	00:13:23 <bd808> oh yes
	00:13:23 <bd808> CA is not going anywhere
	00:13:24 <bd808> Even in the post-SUL world CA does a lot for us
	00:13:26 <devunt> it should costs a same amount of time
	00:13:41 <csteipp> devunt: cool!
	00:13:50 * bd808__ switches to an account that isn't so horribly lagged
	00:14:11 <TimStarling> so the design options for CA are either to extend the idea of login via email address into $wgAuth and the hook interface
	00:14:31 <TimStarling> or to map the email address to a username early, and maintain the same login by username interface in the backend
	00:14:48 <TimStarling> and I am leaning towards the latter since it is less code to touch
	00:15:22 <bd808__> Anomie, tgr and I really want to get AuthManager done this quarter and it will open up more possibilities for this kind of alternate auth scenario
	00:15:24 <TimStarling> but then you don't have a backend login attempt if there is no username
	00:16:07 <Katie> bd808__: I don't think we should block devunt based on wants, even really wants.
	00:16:36 <Katie> TimStarling: Mapping early seems fine.
	00:17:12 <Katie> The idea in the initial implementation is to cover the most basic case, in my understanding. A unique e-mail address with a valid password.
	00:17:29 <bd808__> it could be wedged into CA in the same places I wedged in the ability to login with a pre-SUL rename user name I think
	00:17:55 <Katie> A unique e-mail address with a valid password using MediaWiki core.
	00:17:56 <bd808__> it's gross but there is code there that does it
	00:18:12 <csteipp> So yeah, figuring out how this goes into CentralAuth is going to be a challenge-- legoktm and I should probably figure that out. So this wont work on WMF sites until we do that.
	00:18:38 <TimStarling> and devunt is doing this for WMF, so it is a blocker
	00:18:59 <Katie> Oh, is he? I thought this was a volunteer project.
	00:19:00 <devunt> mainly for WMF
	00:19:25 <TimStarling> it is a volunteer project but volunteers always have specific motivations
	00:20:21 <devunt> this is (obviously) based on the task T30085,
	00:20:36 <tgr> AuthManager intends to fully replace wgAuth so it might be worth checking it out to avoid working twice
	00:20:38 <devunt> and the author of that task said "especially on a big project like Wikimedia"
	00:20:53 <csteipp> Good to know. So yeah, we'll have to figure how to make CA do similar-- is there a task for that part specifically?
	00:21:04 <tgr> although I don't expect the actual authentication code that handles the email address would have to change much
	00:21:18 <Katie> What needs to change in CentralAuth?
	00:21:24 <Katie> If you map early, why does CentralAuth care?
	00:21:42 <TimStarling> CA manages email addresses, the user_email field is not used
	00:21:51 <csteipp> ^ that
	00:22:05 <TimStarling> so if we map early, CA needs to hook into the mapping function
	00:22:07 <Katie> Oh, really? Ugh.
	00:22:09 <csteipp> Have to get CA's version of the email searchable
	00:22:23 <devunt> So I'm going to mapping function hookable
	00:22:25 <csteipp> Because on most wikis, global accounts shoudn't have an email in the local wiki DB
	00:22:32 <devunt> for CentralAuth or any other plugins
	00:22:33 <csteipp> (there's a bug for that)
	00:22:42 <TimStarling> if you want a good reason for mapping early, maybe look at autocreation
	00:22:44 <Katie> Not to get off-topic too much, but why is that a good thing?
	00:23:12 <Katie> Like why does CentralAuth, in a post-SUL world, need to do its own special thing here and store e-mail addresses separately?
	00:23:16 <csteipp> Limit the number of places we store private data
	00:23:33 <TimStarling> autocreation by email has to work, if you allow login by email, that is critical for usability
	00:23:40 <tgr> in general mapping early seems like the wrong way to this
	00:23:46 * csteipp has to leave.. I think the general direction of the patch now looks good, so -2 removed.
	00:24:02 <tgr> how do you avoid timing attacks if you can't even map non-existent email addresses to users?
	00:24:03 <csteipp> ^ for core. We need to figure out CA.
	00:24:12 <devunt> csteipp, thanks!
	00:24:21 <devunt> And about user experiences,
	00:24:54 <TimStarling> I don't think you can completely avoid timing attacks
	00:25:08 <devunt> I initially designed the messages to work as same as github does.
	00:25:17 <TimStarling> but CA could provide a dummy-login function which is called when there is no username for an email address
	00:26:21 <tgr> anyway AuthManager when used properly will separate most concerns (such as autocreation) so you would only have to write the code to detect when the content of the username field is an email address and fetch the (central)user object based on that
	00:26:53 <tgr> that would also be reasonably safe timing-wise, assuming that searching the DB by username and by email take a similar amount of time
	00:27:38 <tgr> on second thought you don't really need to assume that
	00:28:24 <bd808__> as long as the timing for the email path is always the same you are ok for timing attacks I think
	00:28:25 <TimStarling> to be safe from timing attacks, the search function needs to be equally fast when there are zero and one matches
	00:28:43 <TimStarling> how do you take out the mysql row construction overhead?
	00:28:54 <Katie> Can we buffer the time?
	00:28:56 <devunt> I thought we can ignore db querying time, can't we?
	00:29:06 <TimStarling> I think it's really not critical, security-wise
	00:29:14 <tgr> create a dummy row, make it 1 or 2 results instead?
	00:29:45 <bd808__> User::newFromEmail() as written now will leak how many accounts are attached I think
	00:29:46 <TimStarling> like I say, I don't think you can eliminate timing altogether
	00:29:52 <TimStarling> I think you can try to limit it to say 1ms
	00:30:00 <tgr> or use a LIMIT to make it always return one result which is sometimes a dummy with a password that never matches
	00:30:20 <TimStarling> then you can start to talk about realistic architectures instead of accounting for every clock cycle
	00:31:06 <TimStarling> what we want is for the time difference to be less than the typical timing noise
	00:31:25 <devunt> I think variations of server response time is much larger than variations of sql querying time, so we can ignore them.
	00:31:40 <TimStarling> I mean, we are only talking about leaking email addresses, which MW does routinely via Special:EmailUser
	00:32:22 <TimStarling> we also publish people's email addresses in various other places, like mailing lists and gerrit
	00:32:47 <tgr> devunt: as long as that variation has nice properties (e.g. close to normal distribution) an attacker can filter it out
	00:33:12 <tgr> I'm not claiming that is an issue significant enough to block the change, just saying that it exists
	00:33:15 <TimStarling> the question is whether a spammer would bother to run a thousand queries against a given email address, to average out the times and try to derive a measure of probability
	00:33:25 <TimStarling> or whether they would find easier targets
	00:34:10 <tgr> TimStarling: I think the more problematic attack model is trying to deanonimize users when you have a suspicion who they might be and what email address they might be using
	00:34:11 <TimStarling> oh, and if an attacker can run thousands of login attempts, would they not try brute-forcing the password?
	00:34:21 <TimStarling> logins are already rate-limited by IP
	00:34:57 <TimStarling> well, if you suspect what their email address is, you could just send mail to it
	00:35:16 <TimStarling> if you have 100 ideas about what their email address might be, you could send mail to all of them
	00:36:06 <TimStarling> anyway, that is my position, be practical, benchmark it, don't try to count every cycle
	00:36:31 <TimStarling> if the benchmarks show a problem then we can find a solution
	00:36:47 <devunt> I agree with TimStarling that there is no way to mitigate timing attack completely
	00:38:58 <devunt> and there is more than a hundreds of thousands ways to find a someone's email address to make a list of email addresses for spamming
	00:39:11 <TimStarling> #info <csteipp> So yeah, figuring out how this goes into CentralAuth is going to be a challenge-- legoktm and I should probably figure that out.
	00:39:36 <robla> devunt: I think there are privacy laws that don't acknowledge your point
	00:39:53 <Katie> And?
	00:40:06 <robla> Katie: laws we need to comply with
	00:40:15 <TimStarling> there are privacy laws in the US?
	00:40:25 <Katie> If the legal team has a problem, I have faith in them to say something.
	00:41:10 <robla> TimStarling: IANAL....I do know the state of California has many
	00:41:14 <tgr> again, you might want to make this a part of the AuthManager rewrite instead of doing it twice
	00:41:28 <TimStarling> well, if we are claiming that the email address is private data, we should stop doing that
	00:41:29 <Katie> What is AuthManager's status?
	00:41:46 <TimStarling> since like I say, we give it out routinely via Special:EmailUser
	00:41:47 <tgr> https://phabricator.wikimedia.org/T110283 is the rewrite task for CentralAuth although it's not very informative currently
	00:42:15 <tgr> it's a Q2 goal I believe - bd808 / bd808__ are those final now?
	00:42:24 * robla doesn't know exactly what our privacy policy says about email addresses as PII
	00:42:24 <Katie> Q2 means quarter two of what?
	00:42:50 <bd808__> by the end of the calendar year (Q2 of WMF fiscal)
	00:42:52 <tgr> sorry, that's Oct-Dec this year
	00:43:26 <Katie> MediaWiki won't be the, uhh, first software package to have login via e-mail address functionality. The laws of California really aren't relevant here unless the Wikimedia Foundation legal team comes forward and says so.
	00:44:26 <devunt> do we have to rewrite this login-via-email codes after AuthManager was implemented in core?
	00:44:36 <TimStarling> oh, I see we've lost csteipp
	00:44:37 <tgr> again, I wouldn't worry about spammers, I would worry about, say, some Chinese dissident being identified based on email address
	00:44:50 <tgr> but TimStarling is right that there are much easier attacks to make
	00:44:55 <Katie> MediaWiki's login usability is horrible. We need login via e-mail address and hopefully case-insensitive login one day. I think timing attacks and legal arguments are mostly silly distractions.
	00:45:03 <bd808__> I helped give voice to them, but I think the timing attack problem is mostly FUD
	00:45:27 <Katie> We can reasonably mitigate, but yeah.
	00:45:37 <TimStarling> so csteipp thinks he and legoktm should have input into the CA interface
	00:45:38 <bd808__> In practice account disclosure on a public wiki is not a huge problem
	00:45:55 <tgr> devunt: all authentication code has to be rewritten to adopt AuthManager
	00:46:04 <tgr> well, not so much rewritten as reorganized
	00:46:20 <devunt> and AuthManager is going to be implemented in this winter?
	00:46:42 <tgr> that's the plan I believe
	00:46:55 <robla> bd808__: good point about timing attack stuff. I just wanted to make sure we didn't glibly dismiss emails as public info, which it sounds like we aren't
	00:47:00 <TimStarling> so tgr, you are telling devunt to wait, not to implement and migrate?
	00:47:09 <Katie> I really, really don't want to see work here stall as the result of potential future goals.
	00:47:31 <TimStarling> because I think it is a pretty simple feature and can be done long before December
	00:47:50 <bd808__> AuthManager has been in progress for the last 9 months at various rates of speed. Anomie, Tgr and I will be focusing on it as our major body of work for the next 3 months. That should be enough to get it into production
	00:48:50 <bd808__> TimStarling: I'd say go ahead with it. we can catch authmanager up later
	00:49:08 <devunt> T30085 can be done in 2 weeks or before the ends of this month, I think.
	00:49:13 <tgr> TimStarling: not necessarily, but he should look at the AM project and check it's not done in a way that's going to be hard to port
	00:49:16 <bd808__> but in theory things like this will be easier to accomplish in the world where authmanager exists
	00:49:51 <Katie> In the future, everything will be simpler and better, yes. Today, it'd be really nice to have working login via e-mail address. :-)
	00:49:58 <TimStarling> well, it sounds like login by email should be included in the AM design as soon as possible
	00:50:32 <tgr> the design is very generic and definitely allows for it
	00:51:00 <tgr> it basically makes it the responsibility of the auth plugin what kind of fields they would like to manage
	00:51:14 <TimStarling> that is an action item for someone
	00:51:23 <bd808__> in the AM model it would be up to each primary provider (eg CA) to handle the input. It breaks things more cleanly than the core code does today
	00:51:53 <TimStarling> devunt's patch will have frontend stuff, like changing form labels and error messages, which you will need anyway
	00:52:27 <devunt> And I can re-use the codes when migrating this feature into AM after AM is implmented
	00:52:54 <TimStarling> the part which needs to be rewritten should be <100 lines
	00:53:19 <devunt> thus I think writting codes now is not a waste of time, though
	00:53:38 <devunt> Am I wrong about this?
	00:53:55 <tgr> #action reading-infrastructure make sure email login is not a problem for T110278 and T110283
	00:54:12 <bd808__> devunt: no. I think you should cary on while you have the time and inclination to work on the feature
	00:54:53 <TimStarling> ok, anything else in the remaining 5 minutes?
	00:55:17 <TimStarling> I think we are pretty much all done
	00:55:26 <Katie> devunt: I'm really glad you're working on this. I think it will be a major improvement in MediaWiki's login usability.
	00:55:47 <robla> agree with Katie, this will be great!
	00:55:49 <devunt> we still have one major problem
	00:56:07 <devunt> mediawiki allows duplication of email address
	00:56:10 <devunt> how can I handle it?
	00:56:32 <TimStarling> I have an idea for this
	00:56:44 <legoktm> sorry, I'm late. /me reads up
	00:56:54 <tgr> rejecting multi-user emails is good for the first version IMO
	00:56:56 <TimStarling> presumably when the same email address is used, it is usually the same person
	00:57:02 <Katie> I agree with tgr.
	00:57:16 <TimStarling> and thus, in most cases, the same password
	00:57:32 <bd808__> legoktm: tl;dr csteipp volunteered you to help figure out CA intergration :)
	00:57:42 <TimStarling> so if there is a duplicate email, we can just pick an account randomly and validate the supplied password against it
	00:58:15 <TimStarling> if it matches, then we can report a user-friendly error message
	00:58:15 <devunt> the current implementation will only process if email address doesn't connected with multiple accounts
	00:58:51 <TimStarling> like "this email address is used by more than one account, please log in with your username"
	00:59:26 <legoktm> a good chunk of CA is just core copied into a hook that uses a different database
	00:59:31 <TimStarling> if the password doesn't match, then we need to give the generic WRONG_INPUT message which doesn't disclose the presence of any accounts
	00:59:31 <devunt> I think it is bad pattern in some ways
	00:59:36 <legoktm> I don't imagine it will be difficult, just the timing stuff
	00:59:58 <Katie> TimStarling: People may still complain about that type of error message violating user privacy.
	01:00:03 <devunt> choosing randomly is not a good idea to me
	01:00:17 <devunt> and I think it is not a good design pattern
	01:00:23 <legoktm> +1 on skipping the multiple users having the same email thing for now
	01:00:40 <Katie> We can just reject for now and figure out the harder cases later, right?
	01:00:45 <TimStarling> well, if two people share the same email address, they can reset each others' passwords if they like
	01:01:04 <TimStarling> they can't really keep anything private from each other
	01:01:23 <Katie> Hmmm, right. You said only if the password works. That seems fine.
	01:01:53 <Katie> We need a good test suite here. There are a lot of cases.
	01:01:53 <tgr> the only drawback is that if the password is not the same everywhere you get somewhat random behavior
	01:01:57 <TimStarling> yeah, if the password doesn't match, you have to assume it is an attacker scanning for valid email addresses
	01:02:37 <devunt> yes I was worried about what tgr mentioned
	01:02:45 <tgr> which I don't think is unusual e.g. people use the same email for their bot but set a different password
	01:02:52 <tgr> probably not a big deal though
	01:03:06 <legoktm> or people who use the same email and password for multiple bots >.>
	01:03:37 <devunt> I think we have a basicially 3 choices here
	01:04:03 <devunt> 1. Only accepts if there is only one email address in the database
	01:04:26 <devunt> 2. Checking the password of randomly selected account (as TimStarling mentioned)
	01:04:47 <devunt> 3. Checking passwords of all accounts that have the same email address
	01:04:51 <tgr> TimStarling's suggestion is still 1 just with an improved error message
	01:05:04 <TimStarling> we can continue this discussion on phabricator
	01:05:24 <TimStarling> I am fine with devunt's implementation (1) for now
	01:05:29 <devunt> tgr, oh, yes, it is.
	01:05:38 * robla ducks out
	01:05:54 <TimStarling> #endmeeting

Event Timeline

• RobLa-WMF created this paste.Apr 16 2016, 12:58 AM

• RobLa-WMF mentioned this in E74: Extra RFC Meeting on #wikimedia-office IRC channel (2015-10-01 UTC/2015-09-30 PDT).Apr 16 2016, 1:00 AM

Log from meeting about RFC: Allow user login with email address in addition to usernameActivePublicActions

Event Timeline

Log from meeting about RFC: Allow user login with email address in addition to username
ActivePublic
Actions