
Extract a set of full_text "ambiguous" queries from hive
Closed, Resolved (Public)


We need to extract a set of ambiguous queries (queries that return more than 1000 results on enwiki).
Ideally we need:

  • a set with basic queries (no special syntax, no phrase search)
  • a set with single word queries
  • a set with multi word queries

We should take care to exclude queries from the WikipediaApp: they include partial words (search as you type), which would pollute the set.
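
The selection rules above can be sketched in Python as follows. This is only an illustration, not the Hive query attached to this task: the row keys (`query`, `hits`, `source`) and the list of characters treated as special syntax are assumptions made for the example. Under these assumptions the "basic" set is simply the union of the single word and multi word sets.

```python
import re

# Assumption: characters and keywords treated as "special syntax" or phrase
# search. The real extraction may use a different list. Note that this
# conservative pattern also rejects hyphenated words.
SPECIAL_SYNTAX = re.compile(r'["~*?\\:\-!]|\b(?:AND|OR|NOT)\b')

# "more than 1000 results" comes from the task description.
MIN_HITS = 1000


def classify(query):
    """Return 'single_word', 'multi_word', or None if the query is not basic."""
    text = query.strip()
    if not text or SPECIAL_SYNTAX.search(text):
        return None
    return "single_word" if len(text.split()) == 1 else "multi_word"


def build_sets(rows):
    """Group ambiguous basic queries by word count.

    rows: iterable of dicts with hypothetical keys 'query', 'hits', 'source'.
    """
    sets = {"single_word": set(), "multi_word": set()}
    for row in rows:
        # Skip app traffic: it contains partial, search-as-you-type queries.
        if row["source"] == "WikipediaApp":
            continue
        # Keep only ambiguous queries (strictly more than MIN_HITS results).
        if row["hits"] <= MIN_HITS:
            continue
        kind = classify(row["query"])
        if kind is not None:
            sets[kind].add(row["query"].strip())
    return sets
```

Filtering on the source before classifying keeps partial-word app queries out of every set, whatever their result count.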

Event Timeline

dcausse created this task. Feb 4 2016, 2:55 PM
dcausse raised the priority of this task to Normal.
dcausse updated the task description.
dcausse added subscribers: Aklapper, StudiesWorld, dcausse.
dcausse claimed this task. Feb 4 2016, 5:55 PM
dcausse set Security to None.

Change 268704 had a related patch set uploaded (by DCausse):
hive query to extract sample query set

Queries are available on stat1002.eqiad.wmnet:~dcausse/query_sets/

Change 268704 merged by jenkins-bot:
hive query to extract sample query set

EBernhardson moved this task from Needs triage to Search on the Discovery board. Feb 11 2016, 11:20 PM
Deskana closed this task as Resolved. Feb 17 2016, 5:18 PM
Deskana added a subscriber: Deskana.