Research why the zero results rate for full text search is increasing
Closed, Resolved · Public · 4 Story Points


The zero results rate for full text search has increased recently, after a decrease that was sustained for a while (see screenshot below). We should investigate why this has happened.

Deskana created this task. Mar 29 2016, 9:10 PM
Restricted Application added a subscriber: Aklapper. Mar 29 2016, 9:10 PM
Deskana triaged this task as Normal priority. Mar 29 2016, 9:10 PM

Normal priority, but it's unlikely we'll get to this fast.

Deskana moved this task from Needs triage to Maps on the Discovery board. Mar 29 2016, 9:11 PM
Deskana moved this task from Maps to Analysis on the Discovery board.
Deskana raised the priority of this task from Normal to High.
Deskana added a subscriber: mpopov.

@mpopov I've raised priority on this and thrown it into the sprint.

Deskana moved this task from Analysis to On Sprint Board on the Discovery board. Apr 7 2016, 8:10 PM
mpopov claimed this task. Apr 11 2016, 4:33 PM
mpopov set the point value for this task to 4.
mpopov moved this task from Backlog to In progress on the Discovery-Analysis (Current work) board.

Okay, it appears that (for whatever reason), more_like requests are messing with ZRR for full-text pretty hard. See difference between (b) and (d):

Note: "fixed" in (b) means that I excluded irrelevant query_types (e.g. "send_data_write", "send_deletes"). Those are currently being included in the calculation of the Prefix zero results rate, due to the lack of documentation about the dataset and the lack of communication between engineers and analysts. We will need to meet at some point to discuss this and make sure the dashboards are collecting the appropriate data.
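For illustration, the exclusion described in the note could look roughly like the sketch below. It is not the dashboard's actual code: the record layout (a list of dicts with "query_type" and "hits_total" keys) and the function name are assumptions, and only the two query types named above are known to be irrelevant.

```python
# Query types named in the note as irrelevant to search results.
# Any further members of this set would be an assumption.
NON_SEARCH_QUERY_TYPES = {"send_data_write", "send_deletes"}


def zero_results_rate(requests, excluded_types=NON_SEARCH_QUERY_TYPES):
    """Fraction of requests with zero hits, after dropping excluded query types.

    `requests` is assumed to be an iterable of dicts with "query_type" and
    "hits_total" keys (hypothetical field names). Returns None when no
    requests remain after filtering.
    """
    kept = [r for r in requests if r["query_type"] not in excluded_types]
    if not kept:
        return None
    zero = sum(1 for r in kept if r["hits_total"] == 0)
    return zero / len(kept)
```

Passing excluded_types=set() gives the unfiltered rate, so the two numbers can be compared side by side.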

Now, I don't know all the ins and outs of the behind-the-scenes stuff but this is really concerning to me:

@dcausse & @EBernhardson: Do you have any insights/comments on what we're seeing here?

dcausse: bearloga: trying to digest your data: one of your conclusions, modelike query generate a lot of zero result?
dcausse: s/modelike/morelike/
dcausse: ebernhardson: I wonder if the morelike discrepancies on the bearloga data is not caused by the data we store when the morelike query is cached?
ebernhardson: dcausse: looking
ebernhardson: bearloga: does that say more like ZRR is 75%?
bearloga: ebernhardson: yup
ebernhardson: bearloga: well, at least i know that's almost certainly wrong :) looking closer...
ebernhardson: bearloga: so, for now i think our best bet would just be to ignore all more_like queries that have payload['cached'] == true
ebernhardson: bearloga: or perhaps would be better (would have to look closer), anything with hitstotal: -1?  We use -1 in a variety of places to indicate something that can't be found
ebernhardson: we should be able to fix this actual logging too though
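The two workarounds suggested in the chat above (ignore cached more_like queries, or ignore anything with a hit count of -1) can be sketched as interchangeable filters. This is a rough illustration only: the field names "query_type", "hits_total" and "payload", and the assumption that the cached flag is stored as a boolean, are guesses about the log schema rather than confirmed facts.

```python
def not_cached_more_like(record):
    """Keep everything except more_like requests served from cache."""
    is_more_like = record.get("query_type") == "more_like"
    is_cached = record.get("payload", {}).get("cached") is True
    return not (is_more_like and is_cached)


def has_real_hit_count(record):
    """Keep only records whose hit count is not the -1 'not found' sentinel."""
    return record.get("hits_total") != -1


def filtered_zrr(records, keep):
    """Zero results rate over the records accepted by the `keep` predicate.

    Returns None when the filter leaves nothing to count.
    """
    usable = [r for r in records if keep(r)]
    if not usable:
        return None
    return sum(1 for r in usable if r["hits_total"] == 0) / len(usable)
```

Computing filtered_zrr(records, not_cached_more_like) and filtered_zrr(records, has_real_hit_count) on the same data would show whether the two proposals remove the same records.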
Deskana closed this task as Resolved.

Investigation complete. Several possible issues and resolutions were identified. I'm resolving this task, and will create separate subtasks for each issue.