On Wikipedia, km² is impossible to find, as is mm³.
On Google [[//www.google.com/search?q="mm³"+site:en.wikipedia.org | "mm³" ]] gives 71 results, finding mm³.
On Wikipedia [[//en.wikipedia.org/w/index.php?title=Special:Search&search="mm3"&ns0=1&fulltext=Search|"mm3"]] gives 140 results, finding mm3 and `mm<sup>3</sup>`.
T41501 says unicode quotes are not normalized,
and this one says ² and ³ are not normalized.
But //digits are indexed// and quotes are not.
For unicode, **regex can find one-character or two-character strings only.**
To see this without running a bare regex on millions of pages,
[here's 10k pages with 250 unicode hits](https://en.wikipedia.org/w/index.php?title=Special:Search&profile=default&search=insource:/²|³/+prefix:Che&fulltext=Search). Add your own characters until it fails.
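
To repeat the prefix-limited regex test with other characters, the search link can be generated rather than edited by hand. The following is only a sketch: the helper name `insource_search_url` is hypothetical, and the query parameters are simply the ones visible in the link above.

```python
from urllib.parse import urlencode

# Endpoint and parameters are copied from the search link in this report.
SEARCH_ENDPOINT = "https://en.wikipedia.org/w/index.php"


def insource_search_url(alternatives, prefix="Che"):
    """Build a Special:Search URL for insource:/a|b|.../ limited by prefix:.

    Hypothetical helper. The alternatives are joined with "|" as-is, so any
    regex metacharacters in them would need escaping by the caller.
    """
    pattern = "|".join(alternatives)
    query = {
        "title": "Special:Search",
        "profile": "default",
        "search": f"insource:/{pattern}/ prefix:{prefix}",
        "fulltext": "Search",
    }
    # urlencode percent-encodes the non-ASCII characters, "|" and "/".
    return f"{SEARCH_ENDPOINT}?{urlencode(query)}"


if __name__ == "__main__":
    print(insource_search_url(["²", "³"]))         # single characters
    print(insource_search_url(["²", "³", "km²"]))  # adds a longer string
```

Keeping the `prefix:` term in the query is what limits the regex to a small set of pages instead of the whole wiki.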
T95849 considers analyzers, filtering, and fields, and shows enwiki
page mapping properties while troubleshooting the unicode ★ character.
The current analyzer for the //match highlighter// works correctly "finding"
unicode characters in all manner of strings. For example,
to see the highlighter "finding" km², something else on the page must match:
[see `insource:/²|³|km²/ prefix:Che`](https://en.wikipedia.org/w/index.php?title=Special:Search&profile=default&search=insource:/²|³|km²/+prefix:Che&fulltext=Search). Km² is "found" by the **highlighter** (try other strings too),
but when you remove the //actual// matches (single unicode strings) `²|³`... nothing.
The current analyzer for the type-ahead search also works
with strings longer than one or two characters, for example ♥ or ★, or m or mm with ² or ³.
Summary:
- `"mm3"` or `"km2"` do not find mm³ or km², because the index holds no normalized ² or ³ character.
- `"mm³"` or `"m²"` find only mm or m (because these digits are treated as punctuation?)
- `insource:/mm³/` or `insource:/km²/` find nothing because they're longer than two characters.
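
For reference, here is what "normalized" could mean for these characters. This is only an illustration using Unicode NFKC compatibility folding from the Python standard library; that the search index would or should use NFKC is an assumption, not something this report establishes.

```python
import unicodedata


def fold(text):
    """Apply NFKC compatibility normalization (an assumed stand-in for
    whatever normalization the index might apply)."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    for text in ["mm³", "km²", "m²"]:
        # Superscript digits fold to ASCII digits under NFKC.
        print(f"{text!r} -> {fold(text)!r}")
    # Curly quotes have no compatibility mapping, so NFKC leaves them alone.
    print(fold("\u201cmm\u201d") == "\u201cmm\u201d")
```

Under this folding `"mm3"` would match text containing mm³, while unicode quotes would still need separate handling.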