Investigation: LoginNotify false positives on dewiki
Closed, Resolved | Public | 3 Estimated Story Points

Description

There have been five reports in the last two weeks from users on German Wikipedia that they've received at least one "false positive" message from LoginNotify.

The purpose of this ticket is to figure out whether there's an unexpected problem, and if so, what we can do about it. There's one report on Meta, and then I found two discussions in German which I Google translated. Below is the info that I know.

#1. Ibn Battuta reported on Meta: "Other users and I have been getting loads of false positives: warnings that a different device was used for a log-in when in fact we were using our same computers all along. Maybe the IPs have changed, but certainly not the devices. So, first, the email notification text should be adapted to properly describe whatever it's supposed to be warning about."

I've responded to Ibn Battuta on Meta asking for more info.

#2. This discussion has two people who've received false positive messages: RookJameson and Alexpl.

RookJameson says (G-translated): I received an e-mail a few days ago, that someone of a new device with my user account has registered. Since I have not done this, I have carefully changed my password, even if no foreign edits were carried out or the like. Today I habitually inadvertently first my old password enter and then a Wikipedia internal message that a failed log-in attempt of a new device took place. At the same time, I also received two e-mails from a failed and a successful log-in of a new device reported. It seems to me, for some reason, my laptop, with which I have been editing here for years, has recently been recognized as "new".

Alexpl says (G-translated): It happened to me. Same computer, same software - three messages about supposedly "new device".

#3. This discussion has two people who've received false positive messages: WIr lagen vor Madagaskar and Benatrevqre.

Here are quotes from that discussion, Google-translated:

WIr lagen vor Madagaskar: Could someone stop these wikimedia mails to my email address? It's been the same PC for years... I have already tried cookies, it is not. Also my tracking blocker does not increase its counter when I log on. To any information on my PC, wikimedia does not seem to get any more.

Benatrevqre: New IP address and / or other browsers used?

WIr lagen vor Madagaskar: Googlemail does not seem to know my PC any more and therefore sends warnings. Google and wikimedia seem to use the same sniffing technology. But which?

Benatrevqre: I guess you have recently been assigned an IP from a new and completely different address range.

WIr lagen vor Madagaskar: Would be an explanation. If my network operator in my router exchanged the address of the DNS server entered therein against that of another DNS server, and wikimedia can query, which DNS server belongs to my dynamic IP, something like this could come out. They would not have to start a snooping script on my computer. But, nothing exact is not known.

Benatrevqre: With the DNS server has nothing to do. I am also with a Provider for a long time customer and could also notice the same phenomenon.

Event Timeline

I don't think there is any news to investigate here.
I get a "new device" notification every other day or so (but I don't log in to the wiki every day), since I enabled it as stated on T174263#3574597

I think one of the main problems of the extension is that it blindly checks against a /24 (a /64 on IPv6), regardless of the variance of the user's IP addresses. If it is common for a user to log in from very different addresses, a login from another /24 is not a meaningful signal. Heh, sometimes I get assigned an IP address on a different /8 than the one I had before!
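To make the /24 and /64 comparison concrete, here is a minimal sketch using Python's standard `ipaddress` module. The function names `subnet_key` and `same_subnet` are hypothetical and are not taken from the LoginNotify code; this only illustrates the kind of fixed-prefix check described above.

```python
import ipaddress


def subnet_key(ip: str) -> str:
    """Return the enclosing /24 (IPv4) or /64 (IPv6) network as a string."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 64
    # strict=False lets us pass a host address and get its network back.
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network)


def same_subnet(ip_a: str, ip_b: str) -> bool:
    """True if both addresses fall into the same fixed-size subnet."""
    return subnet_key(ip_a) == subnet_key(ip_b)
```

With a check like this, a user whose ISP hands out addresses from several unrelated ranges will look "new" on most logins, which matches the reports above.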

Let's check and make sure that notifications are only sent if BOTH the IP address has changed and there is no cookie-match.

kaldari set the point value for this task to 3. Nov 29 2017, 12:19 AM
kaldari triaged this task as Medium priority. Nov 29 2017, 12:34 AM

Just got this notification for my Tool Forge bot, which is strange.

Here's what the current behavior is like:

  • The login attempt is considered to be from a "known device" if any of these matches:
    • There is CheckUser data (/24 subnet) OR
    • There is a non-expired cookie indicating known device (cookies expire in 180 days) OR
    • The user's last used subnet in cache matches the current one. The latest IP subnet (/24) used by a user is cached in memcache for up to 60 days.
  • If there is no CU data for the user at all, nor a cookie, it considers that device to be known, to be on the safe side.
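The rules above can be sketched as a single decision function. This is an illustration of the behavior as described in this comment, not the extension's actual code; `LoginContext`, its field names, and `is_known_device` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class LoginContext:
    """Hypothetical container for the signals listed above."""
    current_subnet: str                                       # e.g. "192.0.2.0/24"
    checkuser_subnets: Set[str] = field(default_factory=set)  # subnets seen in CU data
    has_valid_cookie: bool = False                            # non-expired known-device cookie
    cached_subnet: Optional[str] = None                       # last subnet held in cache


def is_known_device(ctx: LoginContext) -> bool:
    # Any single match is enough to treat the device as known.
    if ctx.has_valid_cookie:
        return True
    if ctx.current_subnet in ctx.checkuser_subnets:
        return True
    if ctx.cached_subnet is not None and ctx.cached_subnet == ctx.current_subnet:
        return True
    # No CU data at all and no cookie: treat as known, to be on the safe side.
    if not ctx.checkuser_subnets:
        return True
    return False
```

Under these assumptions, a notification is only sent when there is CU data for the user, none of it matches the current subnet, the cached subnet does not match, and no cookie is present.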

I agree with what @Platonides stated. It seems that the cause of these notifications is that those accounts log in from different IP addresses (you can see that the last discussion states that they got an email from Gmail too about a new device login). There are two things here we can look into:

  1. Looking at RDNS to detect IPs coming from same ISP (suggested by Max and Platonides on IRC).
  2. Making sure that LoginNotify::cacheLoginIP() works fine and considering improving it.

@Niharika: You say that the attempt is considered to be from a known device if "There is CheckUser data". I'm assuming you mean there is CheckUser data that matches the user's last used subnet. Is that right? Or is it just any CheckUser data at all?

The RDNS idea is interesting. You could do a gethostbyaddr() lookup and store the domain name in cache. gethostbyaddr() depends on a response from a DNS server though, so you would need to make it asynchronous, i.e. you wouldn't want to delay log-in until it completes. The next question would be, how much of the domain name do you require to match? The full domain name? The registrable domain (the public suffix plus one label)? Keep in mind that the registrable domain is rather difficult to parse out reliably (see https://publicsuffix.org/).
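As a rough sketch of the asynchronous lookup idea, the Python below runs `socket.gethostbyaddr()` on a worker thread and writes a shortened domain into a cache when it finishes, so the login path never waits on DNS. The names `schedule_reverse_lookup` and `naive_domain`, the `cache` dict, and the injectable `resolver` parameter are assumptions made for illustration. Note that `naive_domain` simply keeps the last two labels; it does not consult the Public Suffix List and gives wrong answers for suffixes such as `co.uk`.

```python
import socket
from concurrent.futures import Future, ThreadPoolExecutor

# Shared worker pool so DNS latency never blocks the login request itself.
_executor = ThreadPoolExecutor(max_workers=4)


def naive_domain(hostname: str, labels: int = 2) -> str:
    """Keep the last `labels` labels. NOT Public Suffix List aware."""
    parts = hostname.lower().rstrip(".").split(".")
    return ".".join(parts[-labels:])


def schedule_reverse_lookup(ip: str, cache: dict,
                            resolver=socket.gethostbyaddr) -> Future:
    """Start a reverse DNS lookup in the background and return immediately."""
    def task():
        try:
            # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
            hostname = resolver(ip)[0]
        except OSError:
            # socket.herror and socket.gaierror are OSError subclasses.
            return None
        cache[ip] = naive_domain(hostname)
        return cache[ip]

    return _executor.submit(task)
```

A later login could then compare the cached domain for the new address against the one stored for the previous address, instead of comparing subnets.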

@Niharika: You say that the attempt is considered to be from a known device if "There is CheckUser data". I'm assuming you mean there is CheckUser data that matches the user's last used subnet. Is that right? Or is it just any CheckUser data at all?

Yes. If the user's current subnet matches up with any of the previously used subnets that are in the CU table. One thing to note here is that CU does not yet have data for logins. It does for edits and other actions that the account performs. So if a user logs in and logs out without performing any actions, their IP subnet will not be logged in the CU table.

OK, it sounds like everything is working properly as far as we can tell. The only possible action item identified would be to start doing RDNS lookups against the IP addresses so that we could match on the ISP's domain name. This would not be a trivial undertaking though, and I think we should hold off on this unless the false positive issue appears to be a serious and sustained problem for the community.