
Searching for a file name by name can fail due to handling of the extension
Open, Medium, Public

Description

Sorry, I'm a bit pissed off with this search engine (I mean, the first web search engine ever was better, and I have more reasons for saying so).
Example: searching for Test wiki Admin logo.svg (File namespace only) returns no correct result, although this is the exact file name you get when you download the file (with Firefox). Only one word is swapped compared with the actual file name (the exact name is in the SVG title inside the file, so it is also displayed on the file description page, but this should not matter either).

The file is: File:Test wiki logo Admin.svg

Event Timeline

Perhelion renamed this task from CirrusSearch: can't find files with only a scrambled wort (and/or exact exsiting title inside) to CirrusSearch: can't find files with only a scrambled word (and/or exact exsiting title inside). Jun 19 2016, 1:18 AM
Perhelion added a project: CirrusSearch.
Perhelion updated the task description.
Perhelion renamed this task from CirrusSearch: can't find files with only a scrambled word (and/or exact exsiting title inside) to CirrusSearch: can't find files with only a scrambled word (and/or exact title inside/metadata). Jun 19 2016, 1:21 AM

Thanks for reporting this.
Please provide exact steps to reproduce, step by step.
Plus I don't know what "scrambled" means in this context. :(

Technical note: Searching for Test wiki Admin logo svg (without the dot) does find the file; note that the file name to find is Test wiki logo Admin.svg. This is probably due to the QueryString option auto_generate_phrase_queries=true, which converts logo.svg into the phrase search "logo svg", resulting in the internal query:
test AND wiki AND admin AND "logo svg"
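
To make the effect concrete, here is a minimal sketch of the behavior described above. It is a simplified model and not the actual Lucene/Elasticsearch code: the tokenizer (split on anything that is not a letter or digit) and the helper names `tokenize`, `parse_query`, `contains_phrase` and `matches` are assumptions made for illustration only.

```python
import re


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit (assumed analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def parse_query(query):
    """Split the query on whitespace; a chunk that yields several tokens
    becomes a phrase clause. All clauses are ANDed together."""
    clauses = []
    for chunk in query.split():
        tokens = tuple(tokenize(chunk))
        if tokens:
            clauses.append(tokens)
    return clauses


def contains_phrase(doc_tokens, phrase):
    """True if the tokens of `phrase` appear consecutively in `doc_tokens`."""
    n = len(phrase)
    return any(
        tuple(doc_tokens[i:i + n]) == phrase
        for i in range(len(doc_tokens) - n + 1)
    )


def matches(query, title):
    """AND of all clauses; single-token clauses are one-word phrases."""
    doc_tokens = tokenize(title)
    return all(contains_phrase(doc_tokens, c) for c in parse_query(query))


TITLE = "Test wiki logo Admin.svg"
with_dot = matches("Test wiki Admin logo.svg", TITLE)     # "logo svg" must be adjacent
without_dot = matches("Test wiki Admin logo svg", TITLE)  # five independent words
print(with_dot, without_dot)
```

In this model the title tokenizes to test, wiki, logo, admin, svg, so the phrase "logo svg" never occurs as adjacent tokens, while the five separate words all do occur.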

Perhelion renamed this task from CirrusSearch: can't find files with only a scrambled word (and/or exact title inside/metadata) to CirrusSearch: can't find files with only a interchanged word (and/or exact title inside/metadata). Edited Jun 20 2016, 6:48 PM
Perhelion updated the task description.

"scrambled"... :(

Oops, I don't know how that word got in there. I edited the description. Thanks as well.

I also added the <title>Test wiki Admin logo</title> again (I had inadvertently removed it from the SVG code in the last update; the title is now displayed in the Metadata section again).

debt triaged this task as Medium priority. Jul 20 2016, 4:09 PM
debt moved this task from needs triage to This Quarter on the Discovery-Search board.
debt changed the task status from Open to Stalled. Sep 20 2016, 5:48 PM
debt subscribed.

This will be mostly fixed with BM25 - we'll recheck this when BM25 goes live.

Smalyshev changed the task status from Stalled to Open. Dec 13 2016, 6:08 PM

Sadly, we added code to fall back to the old QueryString approach when a dot is detected in the middle of a word.
The reason was to support searching for acronyms, which badly need this QueryString feature. The drawback is that logo.svg in this query was mistakenly treated as an acronym...
I see no quick workaround other than handling more intelligently all the use cases where the QueryString auto_generate_phrase_queries=true option is needed.
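
As a rough illustration of why the file name trips the fallback, the sketch below models the "dot in the middle of a word" check. This is not the CirrusSearch implementation: the regular expression, the function `choose_query_builder` and the builder names are hypothetical, chosen only to show that an acronym and a file name with an extension look the same to such a check.

```python
import re

# Assumed detection rule: a dot with a word character on each side.
MID_WORD_DOT = re.compile(r"\w\.\w")


def choose_query_builder(query):
    """Return the name of the (hypothetical) builder that would handle the query."""
    if MID_WORD_DOT.search(query):
        return "query_string_fallback"
    return "default"


for q in ["N.A.S.A", "Test wiki Admin logo.svg", "Test wiki Admin logo svg"]:
    print(q, "->", choose_query_builder(q))
```

Under this assumed rule, both N.A.S.A and logo.svg select the fallback, and only the query without the dot stays on the default path.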

EBernhardson renamed this task from CirrusSearch: can't find files with only a interchanged word (and/or exact title inside/metadata) to Searching for a file name by name can fail due to handling of the extension. Sep 13 2018, 10:37 PM