Related bugs: T5969: Unicode (UTF-8, utf8) compatibility (tracking); T16600, T7732; T4593, T3524 (regarding usernames)
A bug in the pywikipedia framework [1] showed up when editing interwiki links. This caused bot wars [2], which were resolved by removing the U+200B ZERO WIDTH SPACE from the end of the affected page title [3].
Should this character be allowed in page titles? And, more specifically, at the end of a page title?
Characters from the range U+2000-U+200A are already treated as spaces (and replaced by underscores). Since Unicode 4.0, U+200B is no longer considered whitespace by the Unicode Consortium.
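The classification difference can be checked with a short Python sketch. It assumes a Python 3 interpreter whose bundled Unicode database postdates the reclassification; the helper name `describe` is made up for illustration and is not part of MediaWiki or pywikipedia.

```python
import unicodedata


def describe(cp):
    """Return (general category, str.isspace() result) for a code point."""
    ch = chr(cp)
    return unicodedata.category(ch), ch.isspace()


# U+2000..U+200A: the range already treated as spaces in titles.
for cp in range(0x2000, 0x200B):
    category, is_space = describe(cp)
    print(f"U+{cp:04X} category={category} isspace={is_space}")

# U+200B ZERO WIDTH SPACE: a format character, not a space separator.
category, is_space = describe(0x200B)
print(f"U+200B category={category} isspace={is_space}")
```

With such a database, the first eleven code points report the space separator category (Zs), while U+200B reports the format category (Cf) and is not treated as whitespace by `str.isspace()`.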
To cite Brion Vibber in T16600:
> They're not technically illegal, but perhaps should be excluded as they wouldn't be useful.
and in T3524:
> *Invalid* characters (those that are illegal in XML or don't reliably cut and paste) need to be outright blocked in titles.
Although U+200B ZERO WIDTH SPACE seems to survive cut-and-paste on Windows, it is not something I'd call 'reliable': when selecting characters from left to right, it is easy to miss the U+200B ZERO WIDTH SPACE at the end of the page title. As such, I think it's reasonable either to disallow the character or to replace it with an underscore.
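The two options could look roughly like the following Python sketch. This is only an illustration of the proposal, not MediaWiki's actual title handling; the names `ZWSP`, `is_title_allowed`, and `replace_zwsp_with_underscore` are hypothetical.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def is_title_allowed(title):
    """Option 1: reject any title that contains U+200B."""
    return ZWSP not in title


def replace_zwsp_with_underscore(title):
    """Option 2: treat U+200B like the other Unicode spaces and
    replace each occurrence with an underscore."""
    return title.replace(ZWSP, "_")


# A title with an invisible trailing U+200B, as in the bot-war case.
title = "Podolsk" + ZWSP
print(is_title_allowed(title))
print(repr(replace_zwsp_with_underscore(title)))
```

Under the second option, any further cleanup of the resulting underscore (for example, trimming it at the end of a title) is left to whatever normalization already applies to ordinary spaces; this sketch does not attempt it.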
[1] https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3182761&group_id=93107
[2] http://en.wikipedia.org/w/index.php?title=Podolsk&action=history
[3] http://bo.wikipedia.org/w/index.php?title=%E0%BD%94%E0%BD%BC%E0%BC%8B%E0%BD%91%E0%BD%BC%E0%BD%A3%E0%BC%8B%E0%BD%A6%E0%BD%B2%E0%BD%82&action=history
Version: unspecified
Severity: normal
See Also:
T16600: Illegal Unicode characters are allowed in pages
T3524: Usernames should use unicode whitelist
T4593: Non-printing characters allowed in registration
T7732: MediaWiki allows characters in the U+0080 to U+009F range
T44807: Invisible Unicode characters allowed on pagetitle (\u200E | \uFEFF | \u200B)
T57227: interwiki problems in km wikipedia
T57246: Problem with 0x200B ZERO WIDTH SPACE in page titles
T34717: Question: Bidi overrides and Unicode spaces removal from titles: why not zero-width space and horizontal tab?