Symbols such as Ⅸ (Roman Numeral Nine, U+2168) and ℃ (Degree Celsius, U+2103) can be written in the text and in the title of a Wikipedia article. But Å (Angstrom Sign, U+212B), in both the text and the title, is replaced with Å (Latin Capital Letter A with Ring Above, U+00C5). Why is this symbol replaced while the others are not?
Hi @Sunpriat2, thanks for taking the time to report this! Please always follow https://www.mediawiki.org/wiki/How_to_report_a_bug and provide
- a clear and complete list of exact steps to reproduce the situation, step by step, so that nobody needs to guess or interpret how you performed each step,
- what happens after performing these steps to reproduce,
- what you expected to happen instead,
- a full link to a web address where the issue can be seen,
- the web browser(s) and web browser version(s) that you tested.
When I copy the two characters from a character map and paste them here (Phabricator, using Firefox 79), Å and Å, the same thing seems to happen.
What makes you think that this is a bug in some MediaWiki code itself?
I can create an article about the celsius symbol and make it a redirect https://en.wikipedia.org/w/index.php?title=%E2%84%83&redirect=no . I expected I could do the same for the angstrom symbol, but an article with a letter is created instead.
I can copy the symbol and the letter into the wiki text editor (monospace Courier New) and see the difference, but if I press preview or save the page, the symbol is replaced with the letter. The symbol is shown only when it is written as a numeric character reference: &#8491; or &#x212B;
℃ Degree Celsius U+2103 = UTF-8 E2 84 83
Å Angstrom Sign U+212B = UTF-8 E2 84 AB
I tried to write the title of the article in percent-encoding, https://en.wikipedia.org/w/index.php?title=%E2%84%AB&redirect=no , but Wikipedia again shows the article about the letter, although the browser should not change anything in the encoding.
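The byte values and the percent-encoded title above can be checked with a short script. This is only a sketch using the Python standard library; the helper name `utf8_hex` is made up for illustration and is not part of MediaWiki.

```python
from urllib.parse import quote, unquote

celsius = "\u2103"   # DEGREE CELSIUS
angstrom = "\u212B"  # ANGSTROM SIGN


def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of ch as space-separated uppercase hex."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))


print(utf8_hex(celsius))   # E2 84 83
print(utf8_hex(angstrom))  # E2 84 AB

# Percent-encoding is just the UTF-8 bytes written as %XX.
print(quote(angstrom))     # %E2%84%AB

# Decoding the title parameter gives back the Angstrom Sign itself,
# so the URL really does carry U+212B and not the letter U+00C5.
decoded = unquote("%E2%84%AB")
print(f"U+{ord(decoded):04X}")  # U+212B
```

So the request leaves the browser with the symbol intact; whatever replaces it must happen on the receiving side.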
If this automatic replacement exists so that the symbol is not confused with the letter, then there should still be a way to create an article with that title through the encoding. A search on https://www.mediawiki.org/w/index.php?search=angstrom finds nothing about a replacement or about the symbol. The symbol also cannot be used in the search itself: the results are for the letter. This behavior should be explained somewhere.
I can copy the symbol and letter into the wiki text editor (monospace Courier New) and see the difference, but if I press preview or save the page then the symbol is replaced with a letter.
Does the same happen in this very Phabricator task when you enter the two characters and check the preview below the comment field here?
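In web requests, all input is normalized to NFC (Normalization Form Canonical Composition): https://gerrit.wikimedia.org/g/mediawiki/core/+/5123b8387174a685009a64cb2fcb24f1df37f2cc/includes/WebRequest.php#461 . In Unicode, U+212B Angstrom Sign is canonically equivalent to U+00C5, so NFC replaces it, whereas U+2168 and U+2103 have only compatibility decompositions, which NFC leaves untouched.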
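The effect of NFC on the three characters from this task can be reproduced outside MediaWiki. The following is a minimal sketch using Python's standard `unicodedata` module, not MediaWiki's own PHP code; the helper name `nfc` is an assumption made for the example.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalize text to Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)


for ch in ("\u2168", "\u2103", "\u212B"):
    out = nfc(ch)
    status = "unchanged" if out == ch else "replaced"
    print(
        f"U+{ord(ch):04X} {unicodedata.name(ch)} -> "
        f"U+{ord(out):04X} {unicodedata.name(out)} ({status})"
    )

# The decomposition field shows why: a bare code point is a canonical
# mapping (applied by NFC), a "<compat>" tag marks a compatibility
# mapping (applied only by NFKC/NFKD).
print(unicodedata.decomposition("\u212B"))  # 00C5
print(unicodedata.decomposition("\u2103"))  # <compat> 00B0 0043
```

Only the Angstrom Sign comes out as a different code point; the Roman numeral and the Celsius symbol would change only under the compatibility forms (NFKC/NFKD).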
Closed as it is not a bug.