CirrusSearch is ignoring Unicode superscript numbers such as the ³ character.
For example, it correctly treats mm3 as two terms and "mm3" as one term, but every search for "mm³", where ³ is the Unicode superscript three (U+00B3), fails.
Each of the following attempts failed because the search ignored the Unicode superscript number:
- insource:/mm³/ regex ignores the Unicode character
- "mm³" exact phrase ignores the unicode character
- "mm3" exact phrase ignores the Unicode character
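
To make the distinction concrete, here is a minimal Python sketch of the two behaviours a searcher might reasonably expect: a literal match on the ³ code point, and a compatibility fold of ³ to 3. It only illustrates the Unicode facts using the standard library; it does not show what CirrusSearch does internally, and the helper names `literal_match` and `fold_compat` are made up for this example.

```python
import re
import unicodedata

SUPERSCRIPT_THREE = "\u00b3"  # the character ³


def literal_match(pattern: str, text: str) -> bool:
    """Return True if the regex pattern is found in text, with no folding."""
    return re.search(pattern, text) is not None


def fold_compat(text: str) -> str:
    """Apply NFKC compatibility normalization, which maps ³ to 3."""
    return unicodedata.normalize("NFKC", text)


# ³ is its own code point, distinct from the ASCII digit 3.
print(unicodedata.name(SUPERSCRIPT_THREE))

# A literal regex finds the superscript form and only that form.
print(literal_match("mm³", "a volume of 10 mm³"))
print(literal_match("mm³", "a volume of 10 mm3"))

# After compatibility folding, both spellings become the same string.
print(fold_compat("mm³") == "mm3")
```

Either behaviour would be defensible for a phrase search; the complaint here is that neither seems to happen.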
T41501 was concerned with normalizing the various forms of quotation marks to the ASCII quotes on our keyboards.
Similarly, this task is concerned with two bugs:
# Indexed "exact phrase" searches should match Unicode superscript numbers in search results. (They are used on hundreds of pages on Wikipedia.)
# Elasticsearch [[ https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html?q=regexp%20q | regexes are supposed to find exact strings containing Unicode characters. ]]
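
For reference, the general shape of an Elasticsearch `regexp` query body for this pattern can be sketched as below. This is only an illustration of the documented query DSL: the field name `source_text` is a placeholder assumed for the example, and the request that CirrusSearch actually builds for `insource:/.../` may look quite different.

```python
import json


def regexp_query(field: str, pattern: str) -> dict:
    """Build a minimal Elasticsearch regexp query body for one field."""
    return {"query": {"regexp": {field: {"value": pattern}}}}


# "source_text" is a hypothetical field name used only for this sketch.
body = regexp_query("source_text", "mm³")

# ensure_ascii=False keeps ³ readable instead of escaping it as \u00b3.
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Note that a `regexp` query is matched against the terms stored in the index, so whether ³ survives depends on how the field was analyzed at index time.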