Search doesn't handle special characters correctly
Closed, Resolved (Public)

Description

Author: wikipedia

Description:
I went to the page about the schwa (http://en.wikipedia.org/wiki/Schwa), copied a
schwa (ə), pasted it into the search box, and searched for it. The search indicated
that I had searched for the HTML entity "&#601;" and not a schwa at all, and it also
didn't find the page about the schwa.


Version: unspecified
Severity: normal
OS: Windows XP
Platform: PC
URL: http://en.wikipedia.org/wiki/Special:Search

Details

Reference
bz1036

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 21 2014, 8:06 PM)
bzimport added a project: MediaWiki-Search.
bzimport set Reference to bz1036.
bzimport added a subscriber: Unknown Object (MLST).

The problem is that the search engine has no support for HTML entities. It should
probably convert input from the native encoding to UTF-8 and also normalize all entities
to UTF-8, for both indexing and searching.
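
A minimal sketch of that normalization step, written in Python rather than MediaWiki's own PHP, might look like the following. The function name `normalize_query` and the Latin-1 default for the native encoding are assumptions made for illustration; they are not taken from the actual search code.

```python
import html
import unicodedata


def normalize_query(raw: bytes, native_encoding: str = "latin-1") -> str:
    """Return a Unicode string suitable for both indexing and searching.

    Hypothetical helper: the name and the default encoding are assumptions.
    """
    # Step 1: decode the submitted bytes from the wiki's native encoding.
    text = raw.decode(native_encoding)
    # Step 2: turn HTML entities such as "&#601;" or "&eacute;" into characters.
    text = html.unescape(text)
    # Step 3: apply Unicode NFC so equivalent sequences compare equal.
    return unicodedata.normalize("NFC", text)


# The entity a browser might submit for a schwa becomes the character itself,
# which can then be stored or compared as UTF-8.
query = normalize_query(b"&#601;")
utf8_bytes = query.encode("utf-8")
```

Running the same function over page text at index time and over the query at search time is what would let the two match, whichever form the browser submitted.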

gangleri wrote:

Related to character encoding, see also:

bug 2896: Search doesn't find (accent problem? indexing problem?)

Wiki.Melancholie wrote:

Searching for special characters like "ə" does work now!
Thus, I close this bug.

gangleri wrote:

Hello!

Please see bug 4601 comment 2
Bug 4601: Exact search for titles using php meta / escape characters

Are these bugs dependent on each other? (I cannot see the character specified above
on this PC because of missing fonts.)
Are "OS" and "Hardware" set properly?

best regards reinhardt [[user:gangleri]]