Change Details

On Wikipedia, km² is impossible to find, as is mm³. On Google, [[//www.google.com/search?q="mm³"+site:en.wikipedia.org | "mm³" ]] gives 12071 results, finding mm³ and some m<sup>3. On Wikipedia, [[//en.wikipedia.org/w/index.php?title=Special:Search&search="mm3"&ns0=1&fulltext=Search|"mm3"]] gives 140 results, finding mm3 and mm<sup>3.

T41501 says Unicode quotes are not normalized, and this task says ² and ³ are not normalized. But digits are indexed tokens and quotes are not.

Regex can find one-character or two-character strings only. To see this without running a bare regex on millions of pages, [[https://en.wikipedia.org/w/index.php?title=Special:Search&profile=default&search=insource:/²!³/+prefix:Che&fulltext=Search | here's 10k with 250 hits]]. (Change ! to | and add your own characters.)

T95849 considers analyzers, filtering, and fields, and shows enwiki page mapping properties while troubleshooting the Unicode ★ character.

The current analyzer for the //match highlighter// works correctly, "finding" Unicode characters in all manner of strings. To see the highlighter "finding" km², something else on the page must match: [[https://en.wikipedia.org/w/index.php?title=Special:Search&profile=default&search=insource:/²!³!km²/+prefix:Che&fulltext=Search| see insource:/²!³!km²/ prefix:Che]]. (Change ! to |.) See the "found" km² (and try other strings). But when you remove the //actual// matches (the single-character Unicode strings) `²|³`... nothing.

The current analyzer for the type-ahead search also works with strings greater than one or two characters, for example ♥ or ★ or m or mm with ² or ³.

Summary:
- `"mm3"` or `"km2"` do not find ² or ³, because those characters are not normalized to 2 or 3.
- `"mm³"` or `"m²"` find only mm or m (because these digits are treated as punctuation?)
- `insource:/mm³/` or `insource:/km²/` find nothing because they're greater than two chars.
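
As a rough illustration of the summary above, here is a minimal Python sketch. It assumes that "normalized" means something like Unicode NFKC compatibility folding, which maps ² and ³ to the plain digits 2 and 3; that is an assumption made for illustration, not a description of what the search analyzer actually does. The helper names `fold_superscripts` and `search_url` are hypothetical. The second helper only builds the search URL used in the links above, with the `!` placeholders already changed to `|` and percent-encoded, and does not send any request.

```python
import unicodedata
from urllib.parse import urlencode


def fold_superscripts(text: str) -> str:
    """Fold compatibility characters such as ² and ³ to their plain forms.

    Assumption: NFKC stands in for whatever normalization the index
    would need; the real analyzer may behave differently.
    """
    return unicodedata.normalize("NFKC", text)


def search_url(query: str) -> str:
    """Build an enwiki Special:Search URL for the given query string.

    Hypothetical helper: it mirrors the parameters of the links in the
    task description and percent-encodes characters such as | and ².
    """
    params = {
        "title": "Special:Search",
        "profile": "default",
        "search": query,
        "fulltext": "Search",
    }
    return "https://en.wikipedia.org/w/index.php?" + urlencode(params)


if __name__ == "__main__":
    # A term folded this way would match the plain-digit query "mm3".
    print(fold_superscripts("mm³"))  # mm3
    print(fold_superscripts("km²"))  # km2
    # Symbols without a compatibility mapping are left unchanged.
    print(fold_superscripts("★"))    # ★
    # The regex from the link text, with ! already replaced by |.
    print(search_url("insource:/²|³|km²/ prefix:Che"))
```

If the index applied a folding step like this to both the page text and the query, `"mm3"` and `"mm³"` would reduce to the same token; whether that is desirable for the search index is a separate question from the regex length limit described above.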