
deactivate Unicode normalization via <foobar>bla</foobar>
Closed, Declined
Public

Description

Author: gangleri

Description:
Dear friends,

In order to document systems which do not use, or did not use, Unicode normalization, MediaWiki should provide a way to deactivate Unicode normalization.

Please read the discussion at http://www.mediawiki.org/w/index.php?curid=12643#Examples_when_normalization_should_be_performed_and_when_it_should_not .

Best regards Reinhardt [[user:Gangleri]]


Version: unspecified
Severity: enhancement
URL: http://www.mediawiki.org/w/index.php?curid=12643#Examples_when_normalization_should_be_performed_and_when_it_should_not

Details

Reference
bz22031

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 10:53 PM
bzimport set Reference to bz22031.
bzimport added a subscriber: Unknown Object (MLST).

conrad.irwin wrote:

You can already include un-normalised text using HTML entities, see [[wikt:Appendix:Unicode normalization]], or as you did with manual URL encoding.

I reckon most people who know why such things are broken will be able to work out the escapes, though a utility to help them wouldn't be bad. For any given text, identifying the few cases where normalization is destructive requires detailed knowledge of Unicode and of the context.
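A utility of the kind mentioned above could look roughly like the following Python sketch. It is not part of MediaWiki: the function name `escape_unstable` is hypothetical, and it assumes the normalization applied on input is form C (NFC). It replaces each character that normalization would alter in its context with a numeric HTML entity and leaves everything else literal.

```python
import unicodedata


def escape_unstable(text: str, form: str = "NFC") -> str:
    """Return `text` with every character that `form` normalization would
    alter replaced by a numeric HTML entity (hypothetical helper).

    Assumption: the target system normalizes raw input to `form`, while
    numeric entities pass through untouched because they are plain ASCII.
    Literal '&' characters in `text` are not escaped by this sketch.
    """
    out = ""
    for ch in text:
        candidate = out + ch
        # Check the character in context: a combining mark may be stable on
        # its own yet compose with, or reorder around, the preceding text.
        if unicodedata.normalize(form, candidate) == candidate:
            out = candidate
        else:
            out += f"&#x{ord(ch):X};"
    return out


# 'e' followed by COMBINING ACUTE ACCENT would be composed by NFC,
# so only the combining mark is escaped.
print(escape_unstable("e\u0301"))
# ANGSTROM SIGN (U+212B) would be replaced by U+00C5 under NFC.
print(escape_unstable("\u212b"))
```

The check re-normalizes the whole output for each character, which is quadratic in the length of the text; that keeps the sketch short and is acceptable for the short strings such documentation typically needs.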

Unicode normalization is done on input, so it's hard to imagine the tag method you propose working safely and efficiently.

ayg wrote:

Should we avoid Unicode normalization in URLs altogether? It seems as though it's likely to cause this sort of problem.
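To illustrate the concern, here is a small Python sketch, independent of MediaWiki's actual request handling, showing that normalizing a decoded URL component changes its bytes. The function names are hypothetical and NFC is assumed as the normalization form.

```python
import unicodedata
from urllib.parse import quote, unquote


def encode_title(title: str) -> str:
    """Percent-encode a title as UTF-8 without normalizing it."""
    return quote(title)


def normalize_encoded(encoded: str, form: str = "NFC") -> str:
    """Decode a percent-encoded component, normalize it, and re-encode it.

    Assumption: this mimics a server that normalizes URL input.
    """
    return quote(unicodedata.normalize(form, unquote(encoded)))


raw = "A\u030a"  # 'A' + COMBINING RING ABOVE, deliberately not in NFC
encoded = encode_title(raw)
print(encoded)                     # A%CC%8A
print(normalize_encoded(encoded))  # %C3%85
```

The two encoded forms differ, so a link written with manual URL encoding no longer addresses the un-normalized text once the server normalizes it.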