Investigate proof-of-work captchas for Wikimedia sites
Open, Medium, Public

Description

Per T241921: Fix Wikimedia captchas, Wikimedia captchas are badly broken and need fixing. @TheDJ pointed out that there is an open source captcha called mCaptcha that does not require user interaction (it's based on proof-of-work, i.e. making the browser do a lot of work to make registration uneconomical for spambots), which would fix at least some of our captcha woes. There are also some other PoW captcha efforts such as forest/pow-captcha and xenohunter/lapti-pow-captcha. We should look into whether one of those is an acceptable captcha replacement.
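
For readers unfamiliar with proof-of-work, here is a minimal hashcash-style sketch in Python. It is not the protocol of mCaptcha or either of the other projects; the function names, the 16-byte challenge and the 8-byte nonce encoding are all assumptions made for illustration. It shows only the basic asymmetry: the client has to try about 2**difficulty_bits hashes, while the server checks the answer with one.

```python
import hashlib
import os


def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a byte string."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # First non-zero byte: add its leading zeros and stop.
        bits += 8 - byte.bit_length()
        break
    return bits


def new_challenge() -> bytes:
    """Server side: issue a random challenge (hypothetical 16-byte format)."""
    return os.urandom(16)


def solve(challenge: bytes, difficulty_bits: int) -> int:
    """Client side: brute-force a nonce, about 2**difficulty_bits hashes on average."""
    nonce = 0
    while True:
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce
        nonce += 1


def verify(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Server side: checking a submitted nonce costs a single hash."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty_bits


if __name__ == "__main__":
    challenge = new_challenge()
    nonce = solve(challenge, 12)
    print(verify(challenge, nonce, 12))
```

A real deployment would also need to make challenges single-use and short-lived so that solved nonces cannot be replayed; that part is left out of the sketch.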


Some questions to consider:

  • mCaptcha is AGPL, is that an acceptable license for us? (See T76158: Pitfalls checklist for software using AGPL for past discussions on the issue.) pow-captcha and lapti-pow-captcha are GPLv3 and MIT respectively so no issue there.
  • are these captchas actually accessible? From the user's POV, you only need to check a checkbox, but will common screen reader software handle the JS code required for proof of work? (mCaptcha uses WASM and provides a polyfill; pow-captcha uses WASM and Web Workers; lapti-pow-captcha is largely undocumented but seems to use Web Workers) Will they cause problems on cheap devices with slow CPUs? Will they make the browser unresponsive? Is the slowdown in signups that they cause a problem?
  • we'd probably want to self-host for privacy reasons (mCaptcha has a SaaS service, for the other two it's not even an option), are there any operations concerns? mCaptcha is Rust/Postgres/Redis and comes with a Docker container; pow-captcha is in Go; lapti-pow-captcha is Node.js (and largely undocumented but also very simple)
  • are they easy to fit into our login form in terms of UI style and i18n?
  • most importantly, do they actually stop spambots? Even if it is uneconomical for a spambot to do expensive proof of work tasks, it might not be clever enough to bail out, in which case all we added was a few seconds' delay during spambot registration.

Event Timeline

Today I came across https://mcaptcha.org/

  • It’s a proof of work captcha system
    • No visual or auditory accessibility problems
    • Language agnostic
  • can be self hosted
  • open source (AGPL)
  • API compatible with reCAPTCHA and hCAPTCHA
  • The server is written in Rust, using Postgres and Redis.
  • The client is WebAssembly and a JS polyfill.

Seems to tick lots of our boxes. I’m not very familiar with the downsides of PoW captchas, perhaps the compute cost won’t be high enough to keep people out? I’m also curious if this requires presenting the captcha to plain trusted users more than other captcha systems.

I found a few more
https://git.sequentialread.com/forest/pow-captcha
https://github.com/xenohunter/lapti-pow-captcha

There are also some online comments saying that it takes up to 1 minute of work by the client (which often means the time between the user's first interaction with the form and submit) before PoW is more expensive to solve than normal captcha-solving services. Considering that a login with a password manager might only take half a second, I suspect that significantly limits the usability of PoW captchas.
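
The break-even point mentioned in those comments can be sketched as simple arithmetic. The prices below are hypothetical placeholders, not figures from the linked discussions or from any real provider; the point is only that the answer is the ratio of two prices.

```python
def break_even_seconds(price_per_solve_usd: float, cpu_price_per_hour_usd: float) -> float:
    """Seconds of rented CPU time that cost as much as one paid captcha solve."""
    cpu_price_per_second = cpu_price_per_hour_usd / 3600
    return price_per_solve_usd / cpu_price_per_second


if __name__ == "__main__":
    # Assumed prices, for illustration only.
    solve_price = 0.001   # hypothetical: 1 USD per 1000 human-solved captchas
    cpu_price = 0.036     # hypothetical: USD per CPU core per hour
    print(break_even_seconds(solve_price, cpu_price))
```

Below that many seconds of attacker CPU time per challenge, solving the PoW is cheaper for the attacker than paying a solving service. Note that this counts seconds on the attacker's hardware, which can correspond to much longer on a real user's device.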

Some research on PoW effectiveness. TL;DR: you would inconvenience a lot of people if you make it truly uneconomical for bad actors.
https://www.cl.cam.ac.uk/~rnc1/proofwork.pdf

mCaptcha seems interesting in that it appears to be OSI-compliant with its license and self-hosted by default, so there shouldn't be any privacy concerns. The things that would concern me about it, or any of these, are how proven they are within large-scale production environments like Wikimedia and other major websites (not seeing any case studies or users?) and how good they are from an accessibility and i18n standpoint, which would be two critical requirements for anything we'd wish to replace the existing FancyCaptcha, which currently meets none of those requirements. I also don't see any integration with something like Privacy Pass or one of its derivatives, which I think would be an important consideration as well.

The general problem with PoW captchas/hashcash is that it is very hard to set an appropriate threshold.

On one hand - for real users you have the requirements:

  • Not an undue burden for people on low-end equipment
  • Latency sensitive for real users (they want to edit now, not 15 minutes from now or even 30 seconds from now)

OTOH, for malicious users:

  • They may have access to high end equipment (AWS servers. Depending on hash chosen, GPUs may be relevant, although if the chosen hash is appropriate they shouldn't be).
  • They might not be editing in real time, so they may not care about latency. What does a spammer care if it takes 15 minutes to make an edit?

It might work out if you're trying to solve a rate limiting/DoS problem where the spam edits are a million per minute and overwhelming everything, however we already have a solution for that in rate limits.

So the math usually doesn't work out. Either you have to set a threshold so high that it excessively hinders real users, or the threshold is so low that spammers don't even notice.
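
To make the threshold problem concrete, here is a small sketch, assuming a hashcash-style scheme where difficulty d means about 2**d hash attempts on average. The two hash rates are invented for illustration and are not measurements of any device or of any of the projects discussed here.

```python
def expected_solve_seconds(difficulty_bits: int, hashes_per_second: float) -> float:
    """Expected solve time when about 2**difficulty_bits attempts are needed."""
    return (2 ** difficulty_bits) / hashes_per_second


# Hypothetical hash rates, chosen only to show the asymmetry.
SLOW_PHONE_HPS = 50_000        # assumed: low-end phone running JS/WASM
ATTACKER_HPS = 50_000_000      # assumed: native code on rented hardware


if __name__ == "__main__":
    for bits in (16, 20, 24):
        user = expected_solve_seconds(bits, SLOW_PHONE_HPS)
        attacker = expected_solve_seconds(bits, ATTACKER_HPS)
        print(f"{bits} bits: user ~{user:.2f}s, attacker ~{attacker:.4f}s")
```

Whatever difficulty is picked, the ratio between the two times stays fixed at the ratio of the hash rates (1000x with these made-up numbers), so raising the threshold hurts the slow device exactly as much as it hurts the attacker in relative terms, and far more in absolute seconds.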

At best, if it works, I suspect the reason would be that some spammers are not simulating a browser and might not be able to run JS, although lots of even low-effort spammers use a headless browser.

Which might actually be useful. Our existing captchas work because spammers mostly see them and give up. They are really really bad at being a word-based captcha, falling to unsophisticated approaches. But even for good word-based captchas - there are algorithms in the literature that I believe can solve even sophisticated word-based captchas, which is probably why almost all captchas are of the "find the streetlight" variety. So we don't have to be secure to be useful, just enough of a papercut to make spammers not try.

I would also add that mcaptcha is SHA256 based which seems like a questionable choice for this use case, as it is generally quite amenable to being sped up on specialized hardware (albeit perhaps assuming spammers would put that type of effort in is an unreasonable assumption).

If I was designing a PoW captcha system, I would probably look more towards something like argon2d as the hash function.

  • Latency sensitive for real users (they want to edit now, not 15 minutes from now or even 30 seconds from now)

If it's used as a fallback for the normal, inaccessible captcha (much like an audio captcha), it isn't really latency sensitive - slow is better than never (or however long emailing a human account creator takes - months, I think, in practice). Although how to communicate progress of a slow operation to users with visual accessibility issues is a good question.

  • They might not be editing in real time, so they may not care about latency. What does a spammer care if it takes 15 minutes to make an edit?

I think a clever spambot would time out after a few seconds to avoid tarpits - it's just not economical to take ten minutes per spam edit when you could make ten per second at a less protected site. Dumb spambots could be a problem though.

If I was designing a PoW captcha system, I would probably look more towards something like argon2d as the hash function.

pow-captcha uses scrypt which looks like a decent choice. (lapti-pow-captcha uses SHA-3.)

I agree that scrypt is a reasonable choice.
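
As a rough illustration of what swapping in a memory-hard hash looks like, here is a sketch using Python's standard `hashlib.scrypt`. It is not pow-captcha's actual implementation; the cost parameters, the nonce encoding and the function names are assumptions chosen so the example runs quickly.

```python
import hashlib
import os

# Assumed scrypt cost parameters (illustrative only).
# Memory use is roughly 128 * N * R bytes, so about 1 MiB per hash here.
N, R, P = 2 ** 10, 8, 1


def scrypt_digest(challenge: bytes, nonce: int) -> bytes:
    """Hash the nonce with scrypt, using the server challenge as the salt."""
    return hashlib.scrypt(nonce.to_bytes(8, "big"), salt=challenge,
                          n=N, r=R, p=P, dklen=32)


def meets_target(digest: bytes, difficulty_bits: int) -> bool:
    """True if the top difficulty_bits bits of the digest are all zero."""
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - difficulty_bits) == 0


def solve_scrypt(challenge: bytes, difficulty_bits: int) -> int:
    """Client side: try nonces until one meets the target."""
    nonce = 0
    while not meets_target(scrypt_digest(challenge, nonce), difficulty_bits):
        nonce += 1
    return nonce


def verify_scrypt(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Server side: one scrypt call to check the submitted nonce."""
    return meets_target(scrypt_digest(challenge, nonce), difficulty_bits)


if __name__ == "__main__":
    challenge = os.urandom(16)
    nonce = solve_scrypt(challenge, 4)
    print(verify_scrypt(challenge, nonce, 4))
```

Because each attempt needs a block of memory rather than just a few registers, the per-hash cost is harder to shrink with specialized hardware than it is for SHA-256. The trade-off is that verification on the server is no longer nearly free either.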

mmartorana changed the task status from Open to In Progress. Aug 11 2022, 10:03 AM
mmartorana triaged this task as Medium priority.
mmartorana added a project: Vuln-DoS.

See also T241921#8567496 (FriendlyCaptcha).

Reedy changed the task status from In Progress to Open. Oct 9 2024, 11:36 PM
Reedy subscribed.

Not that it particularly means anything either way... But as a stake in the ground just over 2 years after this task was filed:

https://github.com/sequentialread/pow-captcha hasn't had any activity since September 2021

https://github.com/hackermondev/lapti-pow-captcha since June 2022

https://github.com/mCaptcha/mCaptcha since April 2024. And https://github.com/mCaptcha/glue seems to suggest relying on iframes, which depending on actual implementation is more (security) fun and games...!