The captcha software should generate captchas in languages other than English at
non-English projects, depending on the locale. I've seen some generated captchas
at the Vietnamese Wikipedia that would definitely confuse Vietnamese speakers
(I can't remember the words exactly), because of things like an r and an n
squashed right up against each other so that they look like an m, except to an
English speaker who happens to know a word spelled with "rn" instead. The user
might have to *guess*, because the English words really don't follow Vietnamese
spelling rules. We've recently had users complaining to the sysops about not
being able to read captcha images, presumably for this reason.
One advantage of localizing the captchas is that it might reduce the impact
of spambots at non-English projects. As far as I know, there isn't yet a
captcha-defeating bot that understands Vietnamese or Basque or Quechua.
For now, I'm only proposing localization for most languages that use the Latin
alphabet, because requiring users to respond to a captcha in Thai or Arabic
would exclude a lot of legitimate interwiki users, whereas users of other
scripts tend to have the means of entering Latin characters. Also, for
languages that use diacritical marks, we should generate the words with or
without the marks (not sure which) and modify
[[MediaWiki:Captcha-createaccount]] to ask the user to enter the word
without diacritical marks of any kind.
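To illustrate the diacritics idea, here is a minimal sketch of how a typed answer could be compared against the expected word with all marks ignored. It is written in Python purely for illustration and is not how the captcha extension works; the names `strip_diacritics`, `answers_match` and the `EXTRA_FOLDS` table are my own assumptions, and the table is not exhaustive.

```python
import unicodedata

# Letters that Unicode decomposition (NFD) leaves intact, mapped by hand.
# Assumption: this small table is illustrative only, not a complete list.
EXTRA_FOLDS = str.maketrans({
    "đ": "d", "Đ": "D",
    "ø": "o", "Ø": "O",
    "ł": "l", "Ł": "L",
})


def strip_diacritics(text: str) -> str:
    """Return text with combining marks removed and a few special letters folded."""
    # NFD splits a letter such as "ế" into "e" plus separate combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every combining mark, keeping only the base characters.
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.translate(EXTRA_FOLDS)


def answers_match(expected: str, typed: str) -> bool:
    """Compare a typed answer to the expected word, ignoring marks and case."""
    return strip_diacritics(expected).casefold() == strip_diacritics(typed.strip()).casefold()
```

With something like this, it wouldn't matter whether the image shows the word with or without marks, since both the stored word and the user's input are folded to the same plain form before comparison.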
Once Latin-based alphabets are out of the way, it'd be a good idea to localize
for other writing systems as well, but provide a Latin-based alternative, per
Neil Harris' suggestion.
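As a rough sketch of that two-stage plan (Python, for illustration only): Latin-script languages get a word from their own list, other scripts get a localized word plus a Latin alternative, and languages with no list fall back to English. The wordlists, the `LATIN_SCRIPT` set, the `FALLBACK` code and the `pick_words` function are all hypothetical placeholders, and real lists would live somewhere non-public.

```python
import secrets
from typing import Dict, List, Optional, Tuple

# Hypothetical tiny wordlists keyed by language code; real lists would be
# much longer and would not be stored anywhere publicly readable.
WORDLISTS: Dict[str, List[str]] = {
    "en": ["window", "garden"],
    "vi": ["truong", "nguoi"],
    "th": ["ภาษา"],
}

# Assumption: languages treated as Latin-script for captcha purposes.
LATIN_SCRIPT = {"en", "vi", "eu", "qu"}

# Assumption: list used when a language has no wordlist of its own.
FALLBACK = "en"


def pick_words(lang: str) -> Tuple[str, Optional[str]]:
    """Return (primary word, Latin alternative or None) for a language code."""
    words = WORDLISTS.get(lang)
    if words is None:
        # No localized list yet: behave as the captcha does today.
        return secrets.choice(WORDLISTS[FALLBACK]), None
    primary = secrets.choice(words)
    if lang in LATIN_SCRIPT:
        # Latin-script language: the localized word is enough.
        return primary, None
    # Other scripts: offer a Latin-based alternative alongside the local word.
    return primary, secrets.choice(WORDLISTS[FALLBACK])
```

The caller would render the primary word and, when an alternative is returned, accept either answer.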
These localized captcha strings should *not* be stored in the MediaWiki:
namespace, nor anywhere easily accessible to the public, because bot writers
could easily write language-aware bots using such information. For wordlists, we
could start by using open-source lexicons, such as OpenOffice.org's. We
should also contact ambassadors of non-English projects, asking them for help
compiling sufficiently long lists of their own.