Wikimedia's captchas are fundamentally broken: they keep users away but let robots in. While they filter out the most naive spambots, they are easily broken with off-the-shelf tools (T141490). At the same time, they take significant effort and often multiple tries for a human to solve ([[https://meta.wikimedia.org/wiki/Research:Account_creation_UX/CAPTCHA|research]]), and are especially bad for people with visual impairments (T6845) and those who don't speak English or don't even use Latin script (T7309). Our captcha stats (T152219) show a failure rate of around 30% (and that does not count users who never submit the form; there is about one captcha submission per hundred captcha displays, though we don't know to what extent the displays are crawlers/spambots).
AI could help build something like [[https://www.google.com/recaptcha/intro/|reCAPTCHA]] (but one that does not violate our privacy policy): a two-tier system where users are given a trivial test (e.g. clicking a button, which could even be folded into clicking the usual submit button), the system collects as much information as possible (timing, mouse movements, browser details etc.) and makes a judgement; suspicious users are then given a harder test (which could just be a regular captcha, but if we can generate questions based on image recognition or other hard-for-robots-easy-for-humans tasks, even better). Maybe even make the first test invisible, like Google does with [[https://developers.google.com/recaptcha/docs/invisible|invisible reCAPTCHA]] (where the easy test is basically just clicking the registration button).
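To make the two-tier idea concrete, here is a minimal sketch of the server-side decision step. Everything in it is an assumption for illustration: the signal names (`fill_time_ms`, `mouse_events`, etc.), the hand-tuned weights, and the threshold are all hypothetical placeholders for whatever the real model (possibly a trained classifier) would compute; none of this is existing MediaWiki code.

```python
from dataclasses import dataclass

@dataclass
class SignupSignals:
    # Hypothetical client-side signals gathered during form interaction.
    fill_time_ms: int         # time from page load to form submit
    mouse_events: int         # number of mouse-move events observed
    key_events: int           # keystrokes in the form fields
    has_webdriver_flag: bool  # browser reports navigator.webdriver == true

def risk_score(s: SignupSignals) -> float:
    """Toy heuristic: higher = more bot-like. Weights are illustrative,
    not tuned against real traffic."""
    score = 0.0
    if s.fill_time_ms < 2000:   # humans rarely finish the form this fast
        score += 0.4
    if s.mouse_events == 0:     # no pointer movement at all
        score += 0.3
    if s.key_events == 0:       # fields filled without keystrokes (script/paste)
        score += 0.2
    if s.has_webdriver_flag:    # browser admits to being automated
        score += 0.5
    return min(score, 1.0)

def challenge_tier(s: SignupSignals, threshold: float = 0.5) -> str:
    """First tier is invisible; only suspicious users get the hard captcha."""
    return "hard_captcha" if risk_score(s) >= threshold else "none"

# A typical human interaction vs. a headless-browser submission:
human = SignupSignals(fill_time_ms=15000, mouse_events=120,
                      key_events=40, has_webdriver_flag=False)
bot = SignupSignals(fill_time_ms=800, mouse_events=0,
                    key_events=0, has_webdriver_flag=True)
print(challenge_tier(human))  # none
print(challenge_tier(bot))    # hard_captcha
```

The key design point is that the scoring is server-side and fails toward the harder test, so a bot that suppresses or fakes individual signals still has to beat the second tier.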
== See also
* the [[https://www.mediawiki.org/wiki/Outreachy/Round_15|Outreachy 15]] project where initial work for this task was done: {T178463}
* [[https://meta.wikimedia.org/wiki/Research:Spambot_detection_via_registration_page_behavior|research page]]