support OR between intitle: operators
Closed, Duplicate | Public | PRODUCTION ERROR

Description

Support queries like "incategory:Felis_silvestris_catus (intitle:Birds OR intitle:Dogs)" (Anything in the cat category that has either Birds or Dogs in the title)

Or, more generally, things like intitle:ogv OR intitle:ogg OR intitle:webm to do a pseudo-search for only videos.
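For illustration only, here is a rough sketch of the kind of Elasticsearch bool query such a search could translate to, written as a Python dict. The function name `title_or_query` and the field names `category` and `title` are assumptions made for this example, not the actual CirrusSearch mapping or code; the `filter`, `should`, and `minimum_should_match` keys are part of the bool query DSL in recent Elasticsearch versions.

```python
import json


def title_or_query(category, title_words):
    """Build a bool query: must be in `category`, and the title must
    match at least one of `title_words`.

    The field names "category" and "title" are hypothetical.
    """
    return {
        "bool": {
            # The category restriction applies to every result.
            "filter": [{"match": {"category": category}}],
            # One "should" clause per OR'd intitle: term.
            "should": [{"match": {"title": word}} for word in title_words],
            # Require at least one of the OR'd clauses to match.
            "minimum_should_match": 1,
        }
    }


if __name__ == "__main__":
    query = title_or_query("Felis silvestris catus", ["Birds", "Dogs"])
    print(json.dumps(query, indent=2))
```

The explicit `minimum_should_match` matters here: without it, a bool query that has a `filter` clause treats the `should` clauses as optional.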

(From merged task, also causes log-errors of the "Search backend error during full_text search" type)

Event Timeline

Bawolff raised the priority of this task from to Needs Triage.
Bawolff updated the task description.
Bawolff added a project: CirrusSearch.
Bawolff added subscribers: Bawolff, Manybubbles.

I started on a project to move query parsing over to the Java/Elasticsearch side with an ANTLR grammar that supports this sort of thing. PHP parser generators weren't doing it for me, and I figured it'd be more useful in general if it was embedded in Elasticsearch. Having access to the analyzers during query parsing doesn't hurt either. Anyway, I've added this as blocked by the tracking bug around that.

It's not high on product's list of initiatives and it's a pretty big task, but we'll see. Power users asking for things certainly influences product priorities.

Thanks, @Manybubbles, for your comment! I’d be interested to see what you’ve done so far. As far as I understand, moving the query parsing to the Elasticsearch side would involve rewriting the code in the searchText and search methods of the Searcher class as an Elasticsearch plugin in Java/Groovy. That does indeed sound like a very big task.

I’d like to propose an alternative path that could be taken in several steps, each one benefiting the existing code base:

  1. Write unit tests for the searchText and search methods.
  2. Refactor the methods, breaking them down into smaller methods or putting them into separate classes.
  3. Write a basic “boolean query analyzer” that partitions the query into parts where “OR” and scoping brackets occur (ignoring quoted brackets and ORs). The analyzer creates a bool query for each part. It basically does the replacements/modifications that searchText does for each query part.
  4. Refactor with a PHP-based parser building library into a real query parser. Alternatively, use the existing PHP classes and tests to build the Elasticsearch plugin.
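To make step 3 a bit more concrete, here is a minimal sketch of such a partitioning step. It is written in Python purely for illustration (the real code would live in PHP or in an Elasticsearch plugin), and the names `tokenize` and `parse` are hypothetical. It splits a query on parentheses and on the bare word OR, leaves anything inside double quotes untouched, and returns a nested tree that a later stage could turn into bool queries.

```python
import re

# A token is either a single parenthesis or a run of non-space characters
# that may include double-quoted phrases. Quoted text stays inside its
# token, so brackets and ORs within quotes are never treated as operators.
TOKEN_RE = re.compile(r'[()]|(?:"[^"]*"|[^\s()"])+')


def tokenize(query):
    """Split a query string into parentheses, OR, and term tokens."""
    return TOKEN_RE.findall(query)


def parse(tokens):
    """Parse tokens into nested ("or", [...]) / ("and", [...]) nodes.

    Adjacent terms are implicitly ANDed; a single term is returned as-is.
    """
    pos = 0

    def parse_or():
        nonlocal pos
        branches = [parse_and()]
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1  # skip the OR keyword
            branches.append(parse_and())
        return branches[0] if len(branches) == 1 else ("or", branches)

    def parse_and():
        nonlocal pos
        items = []
        while pos < len(tokens) and tokens[pos] not in ("OR", ")"):
            if tokens[pos] == "(":
                pos += 1  # skip "("
                items.append(parse_or())
                if pos >= len(tokens) or tokens[pos] != ")":
                    raise ValueError("missing closing parenthesis")
                pos += 1  # skip ")"
            else:
                items.append(tokens[pos])
                pos += 1
        return items[0] if len(items) == 1 else ("and", items)

    tree = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected closing parenthesis")
    return tree


if __name__ == "__main__":
    query = "incategory:Felis_silvestris_catus (intitle:Birds OR intitle:Dogs)"
    print(parse(tokenize(query)))
```

Each leaf of the resulting tree is a plain query part, so the existing per-part replacements that searchText performs could in principle be applied to the leaves one at a time.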

Given that the task that blocks this is declined, should this be declined as well, or is there some other way of handling this?

Deskana moved this task from Inbox to Advanced functionality and syntax on the CirrusSearch board.
Deskana moved this task from Needs triage to Search on the Discovery-ARCHIVED board.
mmodell changed the subtype of this task from "Task" to "Production Error". (Aug 28 2019, 11:12 PM)