I kind of think maybe we should just go with random letters. I don't think combining two words helps users very much, since the words are usually weird enough that they aren't identifiable as words. But it probably does help attackers quite a bit.
Hmm, http://www.123seminarsonly.com/Seminar-Reports/008/47584359-captcha.pdf has some advice about eliminating characters that look alike (e.g. 1 and l).
I have no idea how long it should be. I would feel weird about anything less than 6, but beyond that, it feels like picking numbers out of a hat. I guess 10 works. Maybe 8? I don't really know.
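To put rough numbers on the length question, here is a minimal sketch of the random-letters idea with look-alike characters removed. It is not how ConfirmEdit generates its text. The names (`LOOKALIKES`, `ALPHABET`, `random_captcha_text`, `entropy_bits`) and the exact set of dropped letters are my own assumptions for illustration.

```python
import math
import secrets
import string

# Letters dropped because they are easy to confuse with each other or
# with digits (i/l/1, o/0). This particular set is an assumption.
LOOKALIKES = set("ilo")

# Lowercase letters only, minus the look-alikes.
ALPHABET = "".join(c for c in string.ascii_lowercase if c not in LOOKALIKES)


def random_captcha_text(length: int = 8) -> str:
    """Return `length` letters drawn uniformly at random from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def entropy_bits(length: int) -> float:
    """Bits of entropy in a uniformly random string of this length."""
    return length * math.log2(len(ALPHABET))


if __name__ == "__main__":
    # Compare the lengths floated above: 6, 8 and 10.
    for n in (6, 8, 10):
        print(n, random_captcha_text(n), round(entropy_bits(n), 1))
```

Note that the entropy figure only says how hard the text is to guess blind. It says nothing about how hard the rendered image is to OCR, which is the real problem.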
I don't know about math, but I do know image classifying has been discussed before. ConfirmEdit has the ability to do math captchas already, BTW.
We've even discussed stuff like using machine learning of mouse click timing and such (T158909) or reCAPTCHA-like micro edits (T34695). There are plenty of other tasks in ConfirmEdit (CAPTCHA extension) that you might look through too.
I suspect a math captcha is probably about the same as having an "I am not a bot" checkbox (a regular one, not the fancy Google kind): computers are good at math, and parsing the problem isn't likely to be hard. Not that our current captcha is all that great: T141490 says it can be read by off-the-shelf OCR software from 2014.
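To illustrate how little work parsing takes, here is a rough sketch of a solver for plain-text problems like "3 + 4 =". The problem format is an assumption on my part (I haven't checked what ConfirmEdit's math captcha actually emits), and `solve` is just a hypothetical helper.

```python
import operator
import re

# Supported operators. The problem format ("3 + 4 =") is assumed,
# not taken from ConfirmEdit.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# Two non-negative integers around one operator, optional trailing "=".
PROBLEM = re.compile(r"\s*(\d+)\s*([+\-*])\s*(\d+)\s*=?\s*$")


def solve(problem: str) -> int:
    """Parse a simple two-operand arithmetic problem and return the answer."""
    match = PROBLEM.match(problem)
    if match is None:
        raise ValueError(f"unrecognised problem: {problem!r}")
    left, op, right = match.groups()
    return OPS[op](int(left), int(right))


if __name__ == "__main__":
    print(solve("3 + 4 ="))
```

A bot author would need about this much code per problem format, which is why I don't expect a text-based math captcha to stop anyone who is targeting us specifically.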
Image classifying captchas need a large corpus of classified images. Since MediaWiki is open source, an attacker could probably just download our corpus.