
Disallow invisible unicode characters in filenames (and article names?)
Closed, Duplicate · Public

Event Timeline

Josve05a raised the priority of this task from to Needs Triage.
Josve05a updated the task description.
Josve05a subscribed.
polybuildr renamed this task from Dissalow invincible unicode characters in filenames (and article names?) to Disallow invisible unicode characters in filenames (and article names?). Jul 20 2015, 6:18 PM
polybuildr set Security to None.

That character, '‍ ', is the zero-width joiner (U+200D). Disallowing it in page titles would affect at least 832 pages on various Wikipedias (http://quarry.wmflabs.org/query/4493). I checked only the few languages that I know use this character, so there may be more.

MediaWiki does disallow some invisible characters (bidirectional overrides, apparently) and normalizes various whitespace.

The best solution would be to determine whether the ZWJ would have an effect on the preceding character, and disallow it in places where it does nothing, but that might be a lot of processing.

If we're willing to be English-centric, we could disallow it immediately following an ASCII character (I think that'd be safe, but we'd need to double-check).
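
To make the ASCII-only idea concrete, here is a minimal sketch in Python. It is not MediaWiki code: the function name has_zwj_after_ascii and the regular expression are made up for illustration, and it only implements the cheap heuristic (reject a ZWJ directly after an ASCII character), not the full check of whether the ZWJ actually affects rendering.

```python
import re

ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Hypothetical heuristic: a ZWJ immediately preceded by an ASCII character
# (U+0000..U+007F) is assumed to do nothing useful.
_ZWJ_AFTER_ASCII = re.compile("[\x00-\x7f]" + ZWJ)


def has_zwj_after_ascii(title: str) -> bool:
    """Return True if the title contains a ZWJ right after an ASCII character."""
    return _ZWJ_AFTER_ASCII.search(title) is not None


if __name__ == "__main__":
    # ZWJ hidden after the Latin letter "e": flagged.
    print(has_zwj_after_ascii("File" + ZWJ + ".jpg"))
    # ZWJ between two non-ASCII (Persian) letters: not flagged.
    print(has_zwj_after_ascii("\u0645\u06cc" + ZWJ + "\u062e\u0648\u0627\u0647\u0645"))
```

A title that passes this check could still contain a ZWJ that has no visible effect (for example between two non-joining, non-ASCII characters), so this would only catch the simplest cases.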