Prefix functionality fails when first character is underscore, hyphen, single quote or double quotes
Open, Lowest, Public

Description

The searches prefix:-, prefix:_, and prefix:' all produce the same results: the set of all titles that start with - or '.
For example, the search prefix:_F on Wikipedia:
https://en.wikipedia.org/w/index.php?title=Special:Search&profile=all&search=prefix:_F&fulltext=Search
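
To reproduce the report for each problem character, the search URLs can be built with the special characters percent-encoded, which rules out URL mangling as the cause. This is a minimal sketch: the query parameters are copied from the URL above, while the helper name prefix_search_url is hypothetical.

```python
from urllib.parse import urlencode

# Base endpoint taken from the example URL in the report.
BASE = "https://en.wikipedia.org/w/index.php"


def prefix_search_url(prefix: str) -> str:
    """Build a Special:Search URL for a prefix: query (hypothetical helper)."""
    params = {
        "title": "Special:Search",
        "profile": "all",
        "search": f"prefix:{prefix}",
        "fulltext": "Search",
    }
    # urlencode percent-encodes quotes and colons, so the characters
    # under test reach the search backend unmodified.
    return f"{BASE}?{urlencode(params)}"


if __name__ == "__main__":
    # The characters named in the report, plus the _F example.
    for p in ["-", "_", "'", '"', "_F"]:
        print(prefix_search_url(p))
```

Comparing the result sets of these URLs side by side should show whether the different prefixes really return the same titles.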

It's also impossible to find titles that begin with a double quote: prefix:".
(Yet I can create such a page, and such pages do exist.)

All the other characters work fine.

Event Timeline

Cpiral raised the priority of this task from to Needs Triage.
Cpiral updated the task description.
Cpiral added a project: CirrusSearch.
Cpiral subscribed.
Restricted Application added a subscriber: Aklapper.
TTO set Security to None.

Wikipedia has a page named ' in the Template namespace: Template:'.
It also has redirects titled ", ', and -.

If two "exact phrase" searches are run side by side without an explicit AND between them,
then every page or redirect whose title is a single character that is any form of quote or dash (double quote, single quote, hyphen, minus, mdash, ndash) is an indexed search result and shows up in CirrusSearch.

Either these characters should not be indexed as words, or such titles should not be allowed. But the titles are allowed, and the quote and dash forms are never part of the character string of any indexed word. For example,
word1"word2, word3-word4, and 5word'6word are each two words.
So why are the dash and quote forms indexed as words by themselves?
Are the title index and the word index the same?
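
The expectation described above can be illustrated with a deliberately simplified word tokenizer. This is only a sketch of the behavior the report assumes, not CirrusSearch's actual analyzer; the names WORD_RE and tokenize are hypothetical.

```python
import re

# Simplified assumption: a "word" is a run of letters and digits.
# Quotes, dashes, and underscores act only as separators.
WORD_RE = re.compile(r"[^\W_]+")


def tokenize(text: str) -> list[str]:
    """Split text into lowercase word tokens (hypothetical model)."""
    return WORD_RE.findall(text.lower())


if __name__ == "__main__":
    # Quote and dash forms between words split them into two tokens.
    for sample in ['word1"word2', "word3-word4", "5word'6word"]:
        print(repr(sample), tokenize(sample))

    # Under this model a title made of one quote or dash character
    # yields no word tokens at all.
    for title in ['"', "'", "-", "\u2013", "\u2014"]:
        print(repr(title), tokenize(title))
```

Under this model a single-character quote or dash title would contribute nothing to a word index, which is why it is surprising to see such titles returned as matches.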

Examples that match nothing except the same 14 unwanted pages:

Where the connection is not obvious, the shortcuts and redirects that link to these pages have a single-character title that is a form of quote or dash.

Then, where the AND is not found, the quote is treated as a search term (a "word"), it matches that redirect, and the unwanted results appear.

Deskana renamed this task from Prefix fails when first character is underscore, hyphen, single quote or double quotes to Prefix functionality fails when first character is underscore, hyphen, single quote or double quotes. Dec 30 2015, 10:02 PM
Deskana moved this task from Inbox to Advanced functionality and syntax on the CirrusSearch board.
Deskana moved this task from Needs triage to Search on the Discovery-ARCHIVED board.
Deskana subscribed.