Page MenuHomePhabricator

LoginNotify should inform users of the IP address of failed login attempts to their account
Open, HighPublic15 Estimated Story Points

Description

When someone tries to reset our password, be it ourselves or third parties, the IP address of the requestor of the password reset is sent to our inbox with the password reset email. LoginNotify should do the same.

I don't think there would be major privacy concerns as long as its noted where appropriate that trying to login on an account may disclose private data to the owner of that account, if that's not already covered by https://wikimediafoundation.org/wiki/Privacy_policy#To_Protect_You.2C_Ourselves_.26_Others. This will need to be checked with WMF Legal.

Similarly, unsuccessful logins should leave CU traces to prevent abuse, otherwise this feature can become a source of annoyance.

New Notification Data Form

Filling out this form will help developers and product people understand your idea and will provide the information required to implement it. To see examples of the types of answers required, have a look at this sample form. To understand unfamiliar terms, visit the glossary. 

Basic information

  - Purpose of the notification:  To inform the user about failed login attempts to his account
  - Notification name:  Unchanged, reusing notification-known-header-login-fail notification from LoginNotify
  - What triggers notification?: Login attempts
  - "Notice" or "Alert"?:  Alert
  - Notification type (standard, bundled, expandable bundle):  standard, I think (unchanged from the existing notification)

Wording

For a single message

  • Header: Unchanged
  • Body:  Added a new body which reads "IP address of the last login attempt: $1" where $1 is replaced with the IP address

For Bundled Messages

  • Main, bundling message:
  • Subsidiary, bundled message:

Links

  - Primary link target: None added
  - Primary link label (for email display only): None added

  - #1 secondary link target: 
  - #1 secondary link label:

  - #2 secondary link target:
  - #2 secondary link label:

Icon

  - Icon name:  Unchanged
  - Link to graphic/example: Unchanged

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

@Niharika I am working on them in parallel. I have a broken patch (not on gerrit yet) for the Echo notification; the fact that Echo combines five failed attempts into one notification (so you get notified only at 5, 10, 15, 20, ... failed attempts) has made it a bit difficult because we can (rarely) have cases when 5 attempts are made from five different IP addresses. So I am working on a way to combine the IPs into one Echo message as well. My lack of familiarity with Echo is slowing me down, too.

@kaldari: correct.

@Niharika I am working on them in parallel. I have a broken patch (not on gerrit yet) for the Echo notification; the fact that Echo combines five failed attempts into one notification (so you get notified only at 5, 10, 15, 20, ... failed attempts) has made it a bit difficult because we can (rarely) have cases when 5 attempts are made from five different IP addresses. So I am working on a way to combine the IPs into one Echo message as well. My lack of familiarity with Echo is slowing me down, too.

We probably want to show them a range in that case and not all of the IPs individually. We can help brainstorm more on this if you'd like.

So I did think about range. Here are situations that I can think of, in which multiple IPs might have failed attempts for the same account:

  • Malicious: a user is trying to gain access to my account, but to circumvent MediaWiki safeguards, they are using a botnet or something like that
  • Grey area: a user is trying to log in via TOR; every attempt may be routed differently in TOR, hence a different IP each time.
  • Non-malicious: a user tries and fails on their phone, thinks "may be I just can't type my password correctly on the little phone screen", switches to laptop and fails again; phone and laptop on two different networks (e.g. one on cellular, one on home wifi)
  • Non-malicious: a user tries and fails from their computer, tries again later and fails again but by then their IP has been changed by their ISP (rare)

In all except the last, the various IPs are most likely not from the same subnet, so grouping them into one range won't be practical. If a malicious user tries to log into my account from 8.4.4.8, 147.11.57.19 and 220.36.55.35 there is no way we can group them. And if he happens to try from 220.36.55.35 and 220.102.19.48 it doesn't make sense to combine them into 220.0.0.0/8 because those IPs don't even belong to the same ISP (even though they both originate in Japan).

Lastly, one motivation behind showing the actual IP is that if a user keeps getting failed attempts from the same IP, they can report to CUs so they can conduct more investigation on that IP. Turning them into ranges may not be a great idea in that regard either.

So I did think about range. Here are situations that I can think of, in which multiple IPs might have failed attempts for the same account:

  • Malicious: a user is trying to gain access to my account, but to circumvent MediaWiki safeguards, they are using a botnet or something like that
  • Grey area: a user is trying to log in via TOR; every attempt may be routed differently in TOR, hence a different IP each time.
  • Non-malicious: a user tries and fails on their phone, thinks "may be I just can't type my password correctly on the little phone screen", switches to laptop and fails again; phone and laptop on two different networks (e.g. one on cellular, one on home wifi)
  • Non-malicious: a user tries and fails from their computer, tries again later and fails again but by then their IP has been changed by their ISP (rare)

In all except the last, the various IPs are most likely not from the same subnet, so grouping them into one range won't be practical. If a malicious user tries to log into my account from 8.4.4.8, 147.11.57.19 and 220.36.55.35 there is no way we can group them. And if he happens to try from 220.36.55.35 and 220.102.19.48 it doesn't make sense to combine them into 220.0.0.0/8 because those IPs don't even belong to the same ISP (even though they both originate in Japan).

Lastly, one motivation behind showing the actual IP is that if a user keeps getting failed attempts from the same IP, they can report to CUs so they can conduct more investigation on that IP. Turning them into ranges may not be a great idea in that regard either.

Okay. In my opinion, the easiest way forward would be to show the IP if there is one attempt. If it's a bundled notification (i.e. multiple attempts), don't try to show IPs.

Change 419088 had a related patch set uploaded (by Huji; owner: Huji):
[mediawiki/extensions/LoginNotify@master] Show the IP address of the login attempt in the Echo notification

https://gerrit.wikimedia.org/r/419088

Everything I tried failed and my code was messy, so I created a bare-bones patch at https://gerrit.wikimedia.org/r/#/c/419088/3/ and here I explain what is wrong with it:

  1. It should not use $wgRequest->getIP() because that would be the IP of the user who is seeing the notification (target user), not the IP address from which the failed attempt actually originated from. I am using it in the code just to demonstrate what the output would look like.
    1. Later in PS6, I try to pass the actual IP, but even that is bad practice because it means we will be retaining IP address in yet another table, especially a table that is much harder to purge (because IP will be serialized among other parameters). Hence the need for the parameter to be a cuc_id.
    2. I have to figure out how to pass the right parameter. The parameter to pass would be a cuc_id value referencing the row in cu_changes that would contain the IP address.
  2. As stated above, the parameter should be a cuc_id. I had originally planned for the cuc_id to be returned by the onAuthManagerLoginAuthenticateAudit function in CheckUser but @MaxSem talked me out of it in https://gerrit.wikimedia.org/r/#/c/408049/
    1. We need to determine why MaxSem did not agree with the callback function returning a value.
    2. Obviously, the code in my new patch should be expanded such that if no cuc_id is returned or if the returned ID is not found in cu_changes (for example because CU data was purged), the notification just doesn't show the message body.

With all those caveats in mind, this is how it will look like if you have chosen to be shown one notification per failed login attempt:

Capture.PNG (176×524 px, 9 KB)

For batched notifications, I still have work to do. We can simplify it and only show the IP for the last attempt in a batch or something like that. But let's first figure out how to show the right IP in the right place.

I recreated the path I took last time (and eventually failed at) in PS6 of the patch above. It doesn't look clean and actually doesn't show the IP either, but it is trying to pass the IP all the way through the wiring of LoginNotify. @Reedy, @MaxSem can I ask you to take a look please?

Alright, I finally figured out how to handle the passing of parameters. Right now, what is passed is the IP address itself (which as we agreed above is bad).

As soon as I hear back from @MaxSem here about the cons of pros of the callback function in CheckUser returning a row ID, I will work on using that (instead of the IP) here.

I think I know why @MaxSem recommended not returning a value from CheckUserHooks::onAuthManagerLoginAuthenticateAudit(). Both that and LoginNotify\Hooks::onAuthManagerLoginAuthenticateAudit() use the AuthManagerLoginAuthenticateAudit hook; therefore, not only we cannot guarantee that the CU callback function is run before the LN callback function, but also, there is no way to pass the output of the CU callback function (may it be a cu_id) to the LN function in order to use it for generating the Echo notification.

The patch I submitted is now fully functional (shows the IP address of the login attempt correctly); however, in its current form, it ends up storing the user's IP in the echo_event table as a serialized value in the event_extra column, like this example: a:3:{s:11:"notifyAgent";b:1;s:2:"ip";s:7:"100.200.103.206";s:5:"count";i:79;} (where 100.200.103.206 was the IP address).

This is problematic because there is no way to purge just the IP from this table. The only alternatives that come to my mind are:

  1. To write a purge script that completely removes the rows that have event_type of "login-fail-new".
  2. To add a new column to that table called event_extra_private which would only contain private information, and then purge that column.

Approach 2 is what AbuseFilter does (it has an afl_ip field which stores the IP address associated with an abuse log entry, and is purged regularly). I think it is best to have a serialized field and not a plain field like the case of AbuseFilter. The reason is we are envisioning a future in which we would want to show more private data than just the IP (hence T174553). So it is best to just have one field in which we store the IP, the GeoNames ID for the city/country, the ISP subnet, etc. and have a purge script that just does that.

I am going to stop here. @Niharika I know that you are eager for this work to be completed; so am I. But I think we are now facing an important data architecture question which is best answered by a more seasoned MediaWiki developer. To the best of my knowledge, the original plan of just storing a cu_id in the Echo tables is impossible, but I could be wrong. If I am not wrong, then we have to make a choice between 1 & 2, and if we choose 2, then we need to patch Echo (which I am happy to take a stab at, but I would prefer to be done by someone more familiar with Echo as well).

@Huji: Echo notifications are meant to be transient, not permanent, and I think with LN notifications especially, there is no reason they should be retained indefinitely. It seems like the easiest solution would be to purge all LN notifications after 90 days (the entire notification, not just the event_extra column), possibly with the same clean-up mechanism that purges all notifications after 2000 (per user).

I'm happy to quickly modify my patch to put the IP in event_extra as soon as a 90-day purging script is made for Echo and enabled on WMF.

This task would probably need to be added to includes/jobs/NotificationDeleteJob.php.

@kaldari no. That job is only run when a notification is sent. Even though you might think it is safe to assume that there are so many notifications that the job will be called many times a minute, that is only correct for big projects (such as English Wikipedia). Smalller WMF projects (such as newly created wikis, the Ombudsmen Wiki, etc.) may not have even a single notification for several weeks, and this makes is theoretically possible for us to retain data beyond the retention period.

Therefore, we should do it the right way, which is create a new maintenance script in Notifications similar to https://phabricator.wikimedia.org/diffusion/ECHU/browse/master/maintenance/purgeOldData.php and schedule it to be run a regular basis (e.g. daily).

if we schedule and require something to run daily, we should probably have an internal error/whistleblower (possibly in NotificationDeleteJob.php) when we detect that such a cleanup script is NOT running.

Hmm, can we just pass cu_changes.cuc_id and load from Checkuser data, thus keeping all the private information in one place?

@MaxSem please see T174388#4048541 in which I explained why that is not possible. Can I ask you to confirm my analysis is correct?

I don't think there would be major privacy concerns

This assumes that all login attempts to the wrong username are malicious. I imagine a lot can be attributed to typos.

This could also be used to extract target users' IP addresses by deliberately registering multiple accounts which are common typos of a target usernames. The chance of success for single user might not be that high, but used against a large list of users it could be successful.

I don't think there would be major privacy concerns

This assumes that all login attempts to the wrong username are malicious. I imagine a lot can be attributed to typos.

This could also be used to extract target users' IP addresses by deliberately registering multiple accounts which are common typos of a target usernames. The chance of success for single user might not be that high, but used against a large list of users it could be successful.

But we already have a safeguard for that (through AntiSpoof): we don't allow one to create accounts with usernames too similar to an existing account.

AntiSpoof isn't foolproof though, e.g. it disallows Тhryduulf (the first letter is Cyrillic) but probably not Thryduuulf (too many 'u's) or Awkwrad42 (typo for my alt Awkward42).

That is fair.

I wonder how other service providers (such as Facebook or Google) approach this, as we know they have had a similar feature for years.

I would definitely support this. We see it everywhere else, on edits, password resets and so on. So login attempts would be a natural progression. What about 2 Factor authentication for all Wikipedia users also?

I would definitely support this. We see it everywhere else, on edits, password resets and so on. So login attempts would be a natural progression. What about 2 Factor authentication for all Wikipedia users also?

Unfortunately there are scalability issues we need to iron out before we can do something like this. (e.g. when someone forgets their 2FA codes or loses their phone, there is no easy way to get them back without asking a developer to alter the database.)

I'm personally of the opinion that we should have 2FA for all users, but at present there is no failsafe if things go wrong with it.

@Huji any updates? Are you still working on this?

Huji removed Huji as the assignee of this task.Nov 23 2020, 1:05 PM

The patch is still relevant. But I am going to unassign myself.

This is still a major issue that needs attention.

@Piotrus: Please feel free to improve the proposed patch if you'd like to see progress. Thanks.

@Piotrus: Please feel free to improve the proposed patch if you'd like to see progress. Thanks.

I am all for BOLD I am not a coder. I write Wikipedia articles and I expect coders to make the software so I can continue my work.

@Piotrus: See https://www.mediawiki.org/wiki/Bug_management/Development_prioritization for some background - basically, there is no box full of coders with too much time who could fix all and any incoming bug reports, unfortunately.