
JS mw.Title does not normalize input to NFC form
Open, Lowest, Public

Description

JS mw.Title does not normalize input to Unicode NFC form. All MediaWiki titles (and indeed all text) must be in the NFC form. This is enforced by WebRequest (before Title gets the input).
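
As an illustration of why this matters, here is a minimal sketch (plain JavaScript, not code from mediawiki.Title; the example strings are made up) showing that two visually identical strings can differ until they are normalized to NFC. It assumes an environment where String.prototype.normalize() is available.

```javascript
// "é" as one precomposed code point (NFC) vs. "e" + combining acute accent (NFD).
var composed = 'Caf\u00E9';
var decomposed = 'Cafe\u0301';

// The two strings render identically but are different sequences of code units,
// so a plain comparison treats them as different titles.
var sameBeforeNormalizing = composed === decomposed;

// After NFC normalization they compare equal.
// Assumption: String.prototype.normalize exists in this environment.
var sameAfterNormalizing = composed === decomposed.normalize( 'NFC' );

console.log( sameBeforeNormalizing, sameAfterNormalizing );
```

Since mw.Title does not normalize, input in the decomposed form would presumably be kept as-is rather than converted to the composed form the server expects.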

This might be practically impossible to solve. There is String.prototype.normalize(), but it isn't supported widely enough. Implementing Unicode normalization requires 100K+ of data, and we probably shouldn't increase the size of mediawiki.Title by that much. (An example JS library that implements this is https://github.com/walling/unorm.)

Event Timeline

matmarex created this task. Aug 24 2016, 5:48 PM
Restricted Application added a subscriber: Aklapper. Aug 24 2016, 5:48 PM
matmarex triaged this task as Lowest priority. Aug 24 2016, 8:51 PM
Seb35 added a subscriber: Seb35. May 16 2019, 7:30 AM

According to Mozilla's documentation for normalize(), it is now supported by all browsers except IE (Edge does support it). Hence it could be worth solving this task for browsers with native support, to mitigate the overall impact, even without a polyfill given the size constraints.

PS: funny, I recently saw a tweet linking to a document where the server expected the URL in NFD, but the link in the tweet was in NFC (normalized either by Twitter or during copy-paste), leading to a 404 from the tweet.
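
The "no polyfill" approach suggested in the comment above could look roughly like the following. This is only a sketch under assumptions: normalizeTitleInput is a hypothetical helper name, not an existing mediawiki.Title function, and where it would be called inside mw.Title is left open.

```javascript
/**
 * Hypothetical helper (not part of mediawiki.Title): normalize text to NFC
 * when the browser supports it natively, otherwise return it unchanged.
 *
 * @param {string} text
 * @return {string}
 */
function normalizeTitleInput( text ) {
	// Feature-detect native support instead of shipping normalization data.
	if ( typeof String.prototype.normalize === 'function' ) {
		return text.normalize( 'NFC' );
	}
	// No native support (e.g. Internet Explorer): behaviour stays as it is today.
	return text;
}

// Usage: decomposed input ("e" + combining acute accent) comes back composed
// in browsers that support normalize().
var normalized = normalizeTitleInput( 'Cafe\u0301' );
```

The trade-off is that results differ between browsers: those without native support would still pass non-NFC input through, so this only reduces the problem rather than solving it.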