
Store SearchContext::$syntaxUsed in the CirrusSearchRequestSet payload
Closed, Resolved · Public


It would be interesting to learn more about how the various search keywords are used.
I think this can easily be done by storing the state of SearchContext::$syntaxUsed in the CirrusSearchRequestSet (CSRS) payload.

Event Timeline

Restricted Application added projects: Discovery, Discovery-Search. Sep 30 2016, 8:27 AM
Restricted Application added a subscriber: Aklapper.
dcausse updated the task description. Sep 30 2016, 8:39 AM
mforns added a subscriber: mforns. Sep 30 2016, 7:29 PM

Maybe you could also use the webrequest table in Hadoop, via either HiveQL or Spark, and parse the search request URLs for modifiers?
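As a rough illustration of what parsing request URLs for modifiers might look like, here is a minimal Python sketch. The `extract_modifiers` helper, the `search` URL parameter, and the keyword list are assumptions made for this example; the list is deliberately incomplete, and a real job would run logic like this inside a Hive UDF or a Spark job rather than as a standalone function.

```python
import re
from typing import List
from urllib.parse import urlparse, parse_qs

# Illustrative keyword list only (an assumption for this sketch, not exhaustive).
KNOWN_KEYWORDS = (
    "intitle", "incategory", "insource", "hastemplate",
    "prefix", "morelike", "linksto",
)

# Match a known keyword followed by ':' that is not glued to a preceding
# word character or colon; an optional leading '-' covers negated keywords.
_KEYWORD_RE = re.compile(
    r"(?<![\w:])-?(" + "|".join(KNOWN_KEYWORDS) + r"):",
    re.IGNORECASE,
)


def extract_modifiers(url: str) -> List[str]:
    """Return the known keywords found in the URL's 'search' parameter,
    lowercased, de-duplicated, in order of first appearance."""
    params = parse_qs(urlparse(url).query)
    search = params.get("search", [""])[0]
    found: List[str] = []
    for match in _KEYWORD_RE.finditer(search):
        keyword = match.group(1).lower()
        if keyword not in found:
            found.append(keyword)
    return found
```

Any keyword missing from the list is silently ignored, so this kind of re-parsing only ever approximates what the search engine itself recognized.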

debt triaged this task as Medium priority. Sep 30 2016, 7:38 PM
debt moved this task from needs triage to This Quarter on the Discovery-Search board.
debt added a subscriber: EBernhardson.

@mforns yes, I think we could. The problem is that we would have to "re-parse" the search query to extract the special keywords. I think we have a UDF that extracts some of the features, but it's not exhaustive. I think it'd be easier to use the state we already have in MediaWiki and pass it to CirrusSearchRequestSet. In this table we have a generic payload attribute (map<string,string>) that we sometimes use; this way we won't have to re-implement part of the parsing logic in Hive.
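To make the payload idea concrete, here is a minimal Python sketch of flattening a set of syntax names into a string-only map, since the payload is typed map<string,string>. The `add_syntax_to_payload` helper, the `syntax` key, and the comma-separated encoding are all hypothetical choices for this example, not what the actual CirrusSearch change does.

```python
from typing import Dict, Iterable


def add_syntax_to_payload(
    payload: Dict[str, str],
    syntax_used: Iterable[str],
    key: str = "syntax",  # hypothetical key name, chosen for this sketch
) -> Dict[str, str]:
    """Return a copy of the payload with the syntax names stored under `key`.

    Values in a map<string,string> must be strings, so the names are
    de-duplicated, sorted for stable output, and joined with commas.
    """
    result = dict(payload)
    result[key] = ",".join(sorted(set(syntax_used)))
    return result
```

On the analysis side, splitting that value on commas recovers the list of syntax names without re-implementing any query parsing.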

Jan_Dittrich added a subscriber: Jan_Dittrich.
debt moved this task from This Quarter to Up Next on the Discovery-Search board. Oct 4 2016, 5:44 PM
Smalyshev claimed this task. Dec 6 2016, 6:40 PM

Change 325821 had a related patch set uploaded (by Smalyshev):
Report all syntax in stats, also add syntax to the log

Change 325821 merged by jenkins-bot:
Report all syntax in stats, also add syntax to the log

Deskana closed this task as Resolved. Jan 6 2017, 10:53 PM