The zero results rate (ZRR) for full-text search has increased recently, following a sustained decrease (see screenshot below). We should investigate why this has happened.
Description
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Resolved | | mpopov | T132503 Update ZRR data collection to exclude irrelevant/invalid Cirrus requests |
| Resolved | | mpopov | T131196 Research why the zero results rate for full text search is increasing |
Event Timeline
Okay, it appears that (for whatever reason), more_like requests are messing with ZRR for full-text pretty hard. See difference between (b) and (d):
Note: "fixed" in (b) refers to me excluding irrelevant query_types (e.g. "send_data_write", "send_deletes") because those are currently being included in the calculation of Prefix zero results rate due to the lack of documentation about the dataset & lack of communication between engineers & analysts. We will need to meet at some point to discuss this and make sure the dashboards are collecting the appropriate data.
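For illustration only, the exclusion described in the note could be sketched as follows. This is a minimal sketch, not the actual dashboard code: the record layout (dicts with `query_type` and `hitstotal` keys), the function name, and the sample rows are assumptions; only the two excluded query types come from the note itself.

```python
from collections import defaultdict

# Query types named in the note as irrelevant to search ZRR.
IRRELEVANT_QUERY_TYPES = {"send_data_write", "send_deletes"}


def zero_results_rate_by_type(requests):
    """Return {query_type: share of requests with zero hits}.

    Requests whose query_type is in IRRELEVANT_QUERY_TYPES are skipped,
    so they cannot inflate or deflate any search ZRR.
    The record layout is hypothetical.
    """
    totals = defaultdict(int)
    zeros = defaultdict(int)
    for req in requests:
        qtype = req["query_type"]
        if qtype in IRRELEVANT_QUERY_TYPES:
            continue
        totals[qtype] += 1
        if req["hitstotal"] == 0:
            zeros[qtype] += 1
    return {qtype: zeros[qtype] / totals[qtype] for qtype in totals}


# Hypothetical sample rows, made up for this sketch.
sample = [
    {"query_type": "prefix", "hitstotal": 0},
    {"query_type": "prefix", "hitstotal": 3},
    {"query_type": "send_deletes", "hitstotal": 0},
    {"query_type": "full_text", "hitstotal": 12},
]
rates = zero_results_rate_by_type(sample)
```

Keeping the excluded types in a single named set makes it easier for engineers and analysts to review and agree on which requests count.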
Now, I don't know all the ins and outs of the behind-the-scenes stuff but this is really concerning to me:
@dcausse & @EBernhardson: Do you have any insights/comments on what we're seeing here?
dcausse: bearloga: trying to digest your data: one of your conclusions, modelike query generate a lot of zero result?
dcausse: s/modelike/morelike/
dcausse: ebernhardson: I wonder if the morelike discrepencies on the bearloga data is not caused by the data we store when the morelike query is cached?
ebernhardson: dcausse: looking
ebernhardson: bearloga: does that say more like ZRR is 75%?
bearloga: ebernhardson: yup
ebernhardson: bearloga: well, at least i know that's almost certainly wrong :) looking closer...
ebernhardson: bearloga: so, for now i think our best bet would just be to ignore all more_like queries that have payload['cached'] == true
ebernhardson: bearloga: or perhaps would be better (would have to look closer), anything with hitstotal: -1? We use -1 in a variety of places to indicate something that can't be found
ebernhardson: we should be able to fix this actual logging too though
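The two workarounds floated in the chat could look roughly like this. It is a hedged sketch under stated assumptions: request records are plain dicts, `payload` is a mapping whose `cached` value may be a boolean or the string "true", and the helper names are hypothetical. The chat proposes the two filters as alternatives; this sketch applies both.

```python
def is_valid_for_zrr(req):
    """Decide whether a request should count toward ZRR (hypothetical helper)."""
    # hitstotal == -1 is described in the chat as a "can't be found" marker,
    # not a genuine zero-result search.
    if req.get("hitstotal") == -1:
        return False
    # Cached more_like requests may log misleading hit counts.
    payload = req.get("payload") or {}
    cached = str(payload.get("cached")).lower() == "true"
    if req.get("query_type") == "more_like" and cached:
        return False
    return True


def zero_results_rate(requests):
    """Share of valid requests with zero hits, or None if none are valid."""
    valid = [r for r in requests if is_valid_for_zrr(r)]
    if not valid:
        return None
    return sum(1 for r in valid if r["hitstotal"] == 0) / len(valid)


# Hypothetical sample rows, made up for this sketch.
sample = [
    {"query_type": "more_like", "hitstotal": 0, "payload": {"cached": "true"}},
    {"query_type": "more_like", "hitstotal": 5, "payload": {}},
    {"query_type": "full_text", "hitstotal": 0, "payload": {}},
    {"query_type": "full_text", "hitstotal": -1, "payload": {}},
    {"query_type": "full_text", "hitstotal": 7, "payload": {}},
]
zrr = zero_results_rate(sample)
```

Filtering at analysis time is only a stopgap; as noted in the chat, the logging itself should be fixable as well.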
Investigation complete. Several possible issues and resolutions were identified. I'm resolving this task, and will create separate subtasks for each issue.


