
Use Pwned Passwords API to check password strength
Closed, DeclinedPublic

Description

The Pwned Passwords API (run by Troy Hunt, the same person who runs haveibeenpwned.com) contains about half a billion passwords (pretty much every breached password that ever became known) and provides anonymous password checks. Lengthy writeup here; the short version is that you hash the password and send the first few characters of the hash, and the server returns all password hashes which start with that prefix (a few hundred).
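For illustration, the check described above could look roughly like the following. This is only a sketch, assuming the public range endpoint at api.pwnedpasswords.com, a 5-character SHA-1 prefix, and a response of `SUFFIX:COUNT` lines; the helper names (`split_hash`, `count_in_response`, `fetch_range`, `is_pwned`) are made up for this example and are not part of MediaWiki or of any official client.

```python
import hashlib
import urllib.request

# Assumed public endpoint of the range API; not taken from this task.
RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_hash(password):
    """Return (prefix, suffix) of the uppercase hex SHA-1 of the password.

    Only the 5-character prefix is ever sent to the service.
    """
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_response(suffix, body):
    """Look for our suffix in a response made of SUFFIX:COUNT lines."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


def fetch_range(prefix):
    """Fetch every known hash suffix for the prefix (network call)."""
    with urllib.request.urlopen(RANGE_URL + prefix) as resp:
        return resp.read().decode("utf-8")


def is_pwned(password, fetch=fetch_range):
    """True if the service lists the password; the match is done locally."""
    prefix, suffix = split_hash(password)
    return count_in_response(suffix, fetch(prefix)) > 0
```

The comparison happens on the client, so the service only learns the prefix, not whether the check came back positive. The `fetch` parameter exists only so the lookup can be swapped for a local copy of the data or a stub.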

In theory this can be abused by a malicious service operator: it can keep a public and a secret list of common passwords, and only return the public ones for the given prefix. If the user has a very common password it would be on the public list and we force them to change it, so no harm done; if it's a unique password, sending the first few characters of the hash doesn't help a potential attacker; if it's a slightly uncommon but known password (on the secret list but not the public list) then the service operator knows that one of a few hundred passwords that start with that hash prefix is in use on some Wikimedia site (and can probably correlate the user account from timing).

So using it is not without risk, but given the service is run by a generally highly regarded security researcher, the risk is probably still smaller than allowing known-breached passwords because we don't have the capacity to have more than a few hundred thousand items on our bad password list, so maybe it is worth the risk?

Event Timeline

Tgr created this task. Mar 4 2018, 5:30 PM
Restricted Application added a subscriber: Aklapper. Mar 4 2018, 5:30 PM
Reedy added a subscriber: Reedy. Mar 4 2018, 7:36 PM

So using it is not without risk, but given the service is run by a generally highly regarded security researcher, the risk is probably still smaller than allowing known-breached passwords because we don't have the capacity to have more than a few hundred thousand items on our bad password list, so maybe it is worth the risk?

It'd certainly be possible to do it locally, but we'd need to throw some proper db resources at it, plus future updates/maintenance etc.

We'd certainly need to loop in legal before doing that... And work out what wgCopyUploadProxy does (logging etc.) too

revi added a subscriber: revi. Mar 4 2018, 7:39 PM
Tgr closed this task as Declined. Mar 14 2018, 3:35 AM

Thinking/reading some more about it I feel this was a really stupid idea. A site that sets security expectations as high as Wikipedia should just never divulge password information, period. (Even partial information.) Also probably creates all kinds of legal liability.

Using it locally would still be nice; filed that as T189641: Service for checking the Pwned Passwords database.

Tgr added a subscriber: Volans. Dec 1 2018, 12:01 AM

(bringing over a related thread from elsewhere)

Why "k-anonymity offers very little defense"?

It only means the attacker has to try k passwords instead of one. With k being a few hundred, that only makes a realistic difference if you use super strict rules for bad logins (e.g. lock out the user completely after some amount of tries, like banks do). I guess a wiki could use very aggressive bad password throttling specifically for login attempts with weak passwords (a few per day or month) and in that case it would make a difference, but not with the default setup.

But you're sending to this API only the first 5 characters of the SHA1, and it returns the ~500 SHA1 hashes that match that prefix; the client then checks whether the hash of the password under test is in this list or not. So the malicious service doesn't know if the check was positive or negative.

If the password was known to the service, k-anonymity only means that it needs to do a few extra attempts (hundreds, or even tens of thousands of login attempts are easy to pull off against almost any website if the attacker draws them out a bit). If the password was strong and unique then the full hash would be equally useless to the attacker so k-anonymity makes no difference. For a mediocre password that has not been leaked yet but can be bruteforced, it does make a difference, so in that regard I did undersell it.

The malicious service at most knows that you tested a password for which the first 5 characters of the SHA1 are known. It could, of course, work under the assumption that all the tested passwords are positive (but it doesn't know that), and start trying all the ~500 passwords on the website that made the check, but for which username? It doesn't have any info about those, so it would have to try them all.

MediaWiki keeps a public log of account creation, so for a public wiki it's trivial to correlate. Even if it's English Wikipedia there are only about five account creations per minute, plus a few hundred logins, so with the timestamp you could correlate every account creation with the 10-20 hash prefixes sent in that second. On a less huge wiki the service could reliably pinpoint the account name.

On smaller wikis, some of the logins are not impossible to deanonymize either, by looking for users who had a long period of inactivity followed by activity shortly after the password check.

On smallish wikis just going through the full user list is an option as well.

In any case k-anonymity does not change anything about this - the attacker just has multiple passwords to try instead of one.

Moreover, on account creation/change password, if the password is in the list it should be forbidden to use it, so no risk here.

That only helps if the service is honest. It can omit some of the bad passwords known to it from the response (which will also narrow down the list of passwords to try, if the assumption is that the wiki prevents the use of passwords that are returned by the API).
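To make the omission concrete, here is a toy model of it. Everything in it is hypothetical: `range_response` stands in for the service, `client_accepts` for the wiki's check, and the `withheld` set for the operator's secret list. It is a sketch of the argument, not of how any real service behaves.

```python
import hashlib


def sha1_hex(password):
    """Uppercase hex SHA-1 of the password."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def range_response(prefix, known_passwords, withheld=frozenset()):
    """Suffixes a service would return for the prefix.

    A dishonest operator simply skips anything on its withheld (secret) list.
    """
    suffixes = []
    for password in known_passwords:
        digest = sha1_hex(password)
        if digest.startswith(prefix) and password not in withheld:
            suffixes.append(digest[5:])
    return suffixes


def client_accepts(password, suffixes):
    """The wiki allows the password if its suffix is not in the response."""
    return sha1_hex(password)[5:] not in suffixes


known = ["password", "hunter2"]
prefix = sha1_hex("hunter2")[:5]

honest = range_response(prefix, known)
dishonest = range_response(prefix, known, withheld={"hunter2"})

print(client_accepts("hunter2", honest))     # False: rejected as breached
print(client_accepts("hunter2", dishonest))  # True: slips through
```

From the client's side the two responses are indistinguishable from a legitimately shorter list, so the wiki has no way to notice that an entry was held back.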

Volans added a comment. Dec 3 2018, 3:16 PM

Note to self: do not reply to a complex topic on a Friday night
True, if the usernames and user activities are public there is a lot of information available to a malicious HIBP site.