Mask mailaddress during login that triggers EmailAuth
Closed, Resolved | Public | BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

  • Imagine you are presenting at a conference. You have to log in to Wikimedia for a demo.
  • You log in.
  • You hit EmailAuth (because of the conference IP), even though that has never happened to you before.
  • You get this:

MediaWiki_EmailAuth_verification_form.png (556×600 px, 82 KB)

  • You might now have unintentionally revealed your email address to everyone at the conference.

What happens?:
Unintentional disclosure of private information

What should have happened instead?:
Show the email address as: g**a@w****a.org

Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):

Other information (browser name/version, screenshots, etc.):

Relevant code: https://w.wiki/Dn99

Event Timeline

I wonder if something similar should be applied to the email shown by default on Special:Preferences...

Probably, but at least there you know that it consistently shows whenever you go to Preferences. Here it can pop up at random, without ever having appeared for you before.

I think it's far worse when you have no idea what email address you used to register on the site, and now you can't check because you can't log in.

Maybe we could add a show / hide button or something.

An alternative perspective (that may or may not be worth doing something about; I'll leave that decision to other people!):

  • A malicious actor logs into an account with a compromised username and password.
  • The malicious actor's IP address is known to iPoid-Service, so MediaWiki-extensions-EmailAuth displays the "You need to verify your login" box.
  • The malicious actor may not have previously known what the email address linked to the account was (given they attempted the sign-in with a username and password), but they now do.
    • Potentially, the malicious actor may then attempt to log in to the email account using the same compromised password (or, e.g., a password for that email account retrieved from a separate list of stolen credentials), which, if successful, could lead to the malicious actor proceeding to compromise the MediaWiki account in question (using the code that's been sent to the email address).

Obviously, in normal circumstances, the email address linked to an account is fine to display to someone logged-in as that account. However, if MediaWiki is suspicious enough of a given login attempt to show the EmailAuth dialog, it presumably thinks that there is a chance that the person might be a malicious actor/not the actual account-holder, and therefore that it's worth verifying that the person logging in is actually entitled to do so. In these circumstances, imo, we don't want to assume that the person logging in is definitely someone entitled to do so until they've passed the EmailAuth verification - and, therefore, imo we probably shouldn't be revealing additional account information before that verification stage has been passed.

I'm not saying this shouldn't be fixed, but the status quo right now is this: if a person has the username and password and can log in to the wiki, they can see the email address in Special:Preferences. So this is not a new info leak. Sure, it should be fixed, but IMHO it shouldn't be treated as a data breach or major info leak.

+1. I'd also like to see some guidelines / specification for how an address should be masked. At a quick search this morning, I didn't find much in the way of standards for this. I did see some suggestions that it might be better to not show the email address at all.

I think not showing it at all is a good idea. Besides being quite easy to implement, it's quite simple. I know it means the person needs to find out which email address the code has been sent to, but a number of people can check their main inboxes, and if they can't figure it out, they can send an email to ca@ or admins or whatever. I think the rate of that happening would be quite low.

The malicious actor may not have previously known what the email address linked to the account was (given they attempted the sign-in with a username and password), but they now do.

This is a good point.

On the other hand, for people who don't use their account much, it's very common that they won't know what email address they used for registration. And it's mostly those people EmailAuth will activate for, since we exclude recently used IPs / devices.

So not sure which is the less bad choice here.

FWIW, this is how FB does it:

grafik.png (187×371 px, 13 KB)

Change #1133421 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/EmailAuth@master] i18n: Add no email variant of login-message

https://gerrit.wikimedia.org/r/1133421

Change #1133421 merged by jenkins-bot:

[mediawiki/extensions/EmailAuth@master] i18n: Add no email variant of login-message

https://gerrit.wikimedia.org/r/1133421

Change #1133473 had a related patch set uploaded (by Reedy; author: Kosta Harlan):

[mediawiki/extensions/EmailAuth@wmf/1.44.0-wmf.22] i18n: Add no email variant of login-message

https://gerrit.wikimedia.org/r/1133473

Change #1133474 had a related patch set uploaded (by Reedy; author: Kosta Harlan):

[mediawiki/extensions/EmailAuth@wmf/1.44.0-wmf.23] i18n: Add no email variant of login-message

https://gerrit.wikimedia.org/r/1133474

Change #1133473 merged by jenkins-bot:

[mediawiki/extensions/EmailAuth@wmf/1.44.0-wmf.22] i18n: Add no email variant of login-message

https://gerrit.wikimedia.org/r/1133473

Change #1133474 merged by jenkins-bot:

[mediawiki/extensions/EmailAuth@wmf/1.44.0-wmf.23] i18n: Add no email variant of login-message

https://gerrit.wikimedia.org/r/1133474

I googled a bit but only found a bunch of discussion on the issue but nothing remotely authoritative. I think I'd go for

  • Masking the inside of the username (i.e. all but the first and last character) with the accurate number of asterisks, e.g. john.doe becomes j******e. An accurate number of asterisks seems least confusing to the user, while still not all that unique (there are something like 500 combinations for up to 16 character usernames; for a large provider that will match thousands if not millions of addresses).
  • Masking all but the first character of each domain name segment except the last one (e.g. foo.bar.bazz.com becomes f**.b**.b***.com), except for the 100 or 1000 most common domains, which are not masked. (Here's a list of 150, here's a list of 6000, I'm sure we can find more authoritative ones with a little effort.)
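
For illustration, the two rules proposed above could be sketched roughly as follows. This is only a Python sketch, not the extension's code (EmailAuth is written in PHP). The names `COMMON_DOMAINS`, `mask_local_part`, `mask_domain` and `mask_email` are hypothetical, and the three-entry allowlist is a placeholder for the "100 or 1000 most common domains" mentioned in the comment.

```python
# Hypothetical allowlist standing in for "the 100 or 1000 most common domains".
COMMON_DOMAINS = frozenset({"gmail.com", "yahoo.com", "hotmail.com"})


def mask_local_part(local: str) -> str:
    """Keep the first and last character; one asterisk per hidden character."""
    if len(local) <= 2:
        # Assumption: with two characters or fewer there is no "inside" to mask.
        return local
    return local[0] + "*" * (len(local) - 2) + local[-1]


def mask_domain(domain: str, unmasked=COMMON_DOMAINS) -> str:
    """Mask all but the first character of every segment except the last one."""
    if domain.lower() in unmasked:
        return domain
    *segments, last = domain.split(".")
    masked = [seg[:1] + "*" * max(len(seg) - 1, 0) for seg in segments]
    return ".".join(masked + [last])


def mask_email(address: str) -> str:
    """Apply both rules to an address of the form local@domain."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("expected an address of the form local@domain")
    return mask_local_part(local) + "@" + mask_domain(domain)


print(mask_email("john.doe@foo.bar.bazz.com"))  # j******e@f**.b**.b***.com
print(mask_email("john.doe@gmail.com"))         # j******e@gmail.com
```

The domain comparison is lowercased on the assumption that allowlist entries are stored in lowercase; how very short local parts should be handled is not addressed in the comment, so the early return above is a guess.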

Apologies if this has already been thought of, but as this involves privacy concerns, would it be good to get WMF-Legal's thoughts on this/what to do here?

We have been in contact with them throughout the whole process. Thank you for emphasizing.

Ah, apologies for inadvertently suggesting something that’s already happening!

sbassett added a parent task: Restricted Task. (Apr 10 2025, 8:22 PM)

There doesn’t seem to be a strict industry standard for this, so my suggestion would be to mask all but the first character of the username (using a real number of *, one per hidden character) while fully showing the domain name (e.g., g*****@wikimedia.org).

Since domains like gmail.com, outlook.com, or wikimedia.org are widely recognized, hiding them wouldn’t meaningfully improve privacy, and leaving them visible helps users quickly identify the correct email.

This approach might strike a good balance between the stricter strategies used by banks (which often also mask the domain) and the more relaxed ones used by big tech companies.
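
A minimal Python sketch of this variant, for comparison: only the first character of the local part stays visible and the domain is shown in full. The function name `mask_local_only` and the sample address are hypothetical, and this is not the code of the patch itself.

```python
def mask_local_only(address: str) -> str:
    """Keep the first character of the local part; leave the domain untouched."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("expected an address of the form local@domain")
    # One asterisk per hidden character, so the real length stays visible.
    return local[0] + "*" * (len(local) - 1) + "@" + domain


print(mask_local_only("abcdef@example.org"))  # a*****@example.org
```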

The concern with domain names was personalized ones like me@johndoe.name.

Yes, my thoughts on that are that in most real-world cases, users know their own domain and would struggle if it were masked. For example, someone using coolstartupmail.net would immediately identify it, but masking it to c*************.net would only create confusion.

The domain alone is also not particularly sensitive information compared to the username. Knowing that someone uses someobscuredomain.com should not provide much actionable data to an attacker.

Also, consistency is important for user experience: if some domains are masked and others aren’t, users might get confused (why is it hidden now?).

Masking all but the first character of each domain name segment except the last one (e.g. foo.bar.bazz.com becomes f**.b**.b***.com), except for the 100 or 1000 most common domains, which are not masked. (Here's a list of 150, here's a list of 6000, I'm sure we can find more authoritative ones with a little effort.)

This makes the most sense to me. I think personally-identifiable domains (we likely can't think of all of the potentially bad examples here) are a risk, even if small. It would be easy for me to guess something like j*****e@johndoe.name, so masking the domain makes sense. I don't think it would be too confusing for anyone and it would provide a modicum of additional security.

Hey Scott - I understand your more cautious point of view.

It seems that big tech companies typically don’t mask domains in verification screens, while banks sometimes do.

In the end, it comes down to whether we want to prioritize maximum privacy or prioritize usability and clarity for the majority. Personally, I lean toward the latter.

Change #1143874 had a related patch set uploaded (by Mmartorana; author: Mmartorana):

[mediawiki/extensions/EmailAuth@master] emailauth: Mask email address in login-message

https://gerrit.wikimedia.org/r/1143874

Speaking personally (FWIW), I probably lean in support of masking custom domains. In the context of (the fourth bullet-point of) T390780#10701367, if we didn't mask them, one of my worries would be that having sight of a personalised domain name could point a particularly dedicated malicious actor towards attempts to breach the (mail)server itself. Individually-run (mail)servers wouldn't necessarily be guaranteed to have all the latest (i.e., security-patched) software running at any given moment (e.g., in the same way that enterprise servers would be expected to), so (IMO) it's not necessarily unlikely that revealing a personalised domain name could result in an attacker successfully compromising the mailserver for that domain using a known software vulnerability. (Clearly any such servers should be updated to run the most recent/patched versions of their software, but my point here is that - in this situation - an attacker wouldn't necessarily have known the domain name that could be attacked in this way before EmailAuth gave it to them.)

If consistency would be a blocker to masking domain names in only some cases, maybe we should mask the domain name in all cases?

@EMill-WMF could you please let us know if you think the domain needs to also be masked? The proposed patch only masks the local part of the email address.

Alternatively, we could merge @mmartorana's change set as-is and add the domain-masking functionality in a follow-up change set.

I’m in favor of this approach, as it gives us time to discuss all the implications in the meantime.

@EMill-WMF could you please let us know if you think the domain needs to also be masked? The proposed patch only masks the local part of the email address.

Yes, I think custom domains need to also be masked before this is deployed. This is giving an attacker a pretty significant piece of information, and this feature is designed to trigger where there is a higher-than-usual likelihood that this is not the real user who has the account.

Custom domains can be very identifying and are a big jumping-off point for other recon. They can potentially have WHOIS records attached (though hopefully they are masked if the user is privacy-conscious), and an attacker can poke via DNS at who their email provider is and other things about their domain, look for subdomains and other email addresses on the same host, etc. In general, it's easy to imagine them correlating with more privacy/security-conscious users (or enterprises).

I think there is a pretty strong security argument to be made to show no email at all, as mentioned above a couple times. It's obviously a usability hit, though. If we observe EmailAuth to be working smoothly in practice after we deploy custom domain masking, we should consider returning to this, removing the masking, and then see if it causes issues.

Since domains like gmail.com, outlook.com, or wikimedia.org are widely recognized, hiding them wouldn’t meaningfully improve privacy, and leaving them visible helps users quickly identify the correct email.

While I agree that gmail.com and outlook.com are widely recognized, wikimedia.org is highly identifying and we should mask it, as we should any other enterprise email domain. It may well be that many accounts associated with those emails include WMF as part of their username or bio, as is pretty standard practice for staff accounts, but that's something users should be affirmatively doing themselves and not something this feature should be doing for them.

@EMill-WMF - So just to confirm, did you want to mask all domains, or is there a list of popular domains you'd prefer to leave unmasked? The former is the simpler solution, but if there's a list of popular domains you'd like to leave unmasked, can you confirm that list here? @Tgr had suggested some options in T390780#10712923.

Yes, I think if we're going to mask the hostnames due to heightened privacy concerns around custom domain names, we should have a small set of popular domains that remain unmasked.

@sbassett has done some domain pulling here based on our own database: https://phabricator.wikimedia.org/P76369 Looking at it, it makes me want to be really conservative about this. There are a lot of domains with hundreds of thousands of accounts associated with them where the domain clearly conveys the country associated with the user. That's mostly through the TLD, but even qq.com is a distinctly China-focused service where an inference could reasonably be made.

So I'm only confident about leaving unmasked the top 3 (gmail.com, yahoo.com, and hotmail.com), which are globally recognized properties with an audience so geographically dispersed that you can't reasonably make an inference about location. And looking at the numbers, those 3 alone cover the overwhelming majority of user accounts. The next tier would be outlook.com and aol.com, though overall their numbers are much lower. I'm reluctant to include icloud.com because someone could infer the devices the user owns. (outlook.com kind of does too.)

Update: There are a few comments on the masking patch from @Tgr and @A_smart_kitten which @mmartorana or I should be able to clean up this week and then get merged.

Change #1143874 merged by jenkins-bot:

[mediawiki/extensions/EmailAuth@master] emailauth: Mask email address in login-message

https://gerrit.wikimedia.org/r/1143874

Change #1166781 had a related patch set uploaded (by A smart kitten; author: A smart kitten):

[mediawiki/extensions/EmailAuth@master] Rename UnmaskedDomains config variable to EmailAuthUnmaskedDomains

https://gerrit.wikimedia.org/r/1166781

Change #1166781 merged by jenkins-bot:

[mediawiki/extensions/EmailAuth@master] Rename UnmaskedDomains config variable to EmailAuthUnmaskedDomains

https://gerrit.wikimedia.org/r/1166781

Is there anything preventing this task from being resolved at this point?

sbassett changed the task status from Open to In Progress. (Jul 7 2025, 5:52 PM)
sbassett triaged this task as Medium priority.

This is what I was planning to ask :) As far as I can see, I believe not. A decision was made to leave 4 domains unmasked (an increase on the 3 mentioned above due to googlemail.com being included along with gmail.com), and a patch to do that was merged. All domains except for those four should now be masked within the EmailAuth UI.
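
The outcome described above could be sketched in Python along these lines. The four domains are the ones named in this thread, and `EMAIL_AUTH_UNMASKED_DOMAINS` mirrors the `EmailAuthUnmaskedDomains` config variable from the rename patch. Everything else is an assumption: the helper names are hypothetical, and the masking style (first character kept, the whole domain masked including the TLD) is a guess, not the merged PHP implementation.

```python
# Domains left unmasked per the decision recorded in this task.
EMAIL_AUTH_UNMASKED_DOMAINS = ("gmail.com", "googlemail.com", "yahoo.com", "hotmail.com")


def _mask(text: str) -> str:
    """Keep the first character and replace each remaining one with '*'."""
    return text[:1] + "*" * max(len(text) - 1, 0)


def mask_for_login_message(address: str, unmasked_domains=EMAIL_AUTH_UNMASKED_DOMAINS) -> str:
    """Mask the local part always, and the domain unless it is on the allowlist."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("expected an address of the form local@domain")
    allowed = {d.lower() for d in unmasked_domains}
    if domain.lower() not in allowed:
        # Assumption: the TLD is hidden too, since it can hint at a country.
        domain = _mask(domain)
    return _mask(local) + "@" + domain


print(mask_for_login_message("abcdef@gmail.com"))      # a*****@gmail.com
print(mask_for_login_message("abcdef@wikimedia.org"))  # a*****@w************
```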

Change #1236318 had a related patch set uploaded (by A smart kitten; author: A smart kitten):

[mediawiki/extensions/EmailAuth@master] i18n: Remove the `emailauth-login-message-no-email` message

https://gerrit.wikimedia.org/r/1236318

Change #1236318 merged by jenkins-bot:

[mediawiki/extensions/EmailAuth@master] i18n: Remove the `emailauth-login-message-no-email` message

https://gerrit.wikimedia.org/r/1236318