As noted in T131196#2200560, our data-fetching scripts currently treat a lot of requests inappropriately. We need to filter out irrelevant query_types (e.g. send_data_write). Specifically, we need to make the following changes:
query_type | How it appears on dashboard | change or keep |
comp_suggest | Prefix Search | keep as is |
count_links | Prefix Search | ? |
degraded_full_text | Full-Text Search | ? |
full_text | Full-Text Search | keep as is |
GeoData_spatial_search | Prefix Search | exclude? |
get | Prefix Search | exclude? |
more_like | Full-Text Search | keep as is? maybe ignore those where payload['cached'] == true? |
namespace | Prefix Search | keep as is |
near_match | Prefix Search | keep as is? |
other_idx_lookup | Prefix Search | ? |
prefix | Prefix Search | keep as is |
regex | Full-Text Search | keep as is |
send_data_other_idx_write | Prefix Search | exclude |
send_data_write | Prefix Search | exclude |
send_deletes | Prefix Search | exclude |
version | Prefix Search | exclude |
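As a rough sketch of the exclusion step, assuming each request is available as a dict with a `query_type` key: the names `EXCLUDED_QUERY_TYPES`, `keep_request`, and `filter_requests` are hypothetical and not part of the existing scripts. Only the rows marked plainly "exclude" in the table are listed; rows marked "?" or "exclude?" are left out until they are decided.

```python
# Hypothetical helper names; not taken from the existing data-fetching scripts.
# Only query types marked plainly "exclude" in the table are listed here.
# Undecided rows ("?" or "exclude?") are deliberately not included.
EXCLUDED_QUERY_TYPES = frozenset({
    "send_data_other_idx_write",
    "send_data_write",
    "send_deletes",
    "version",
})


def keep_request(query_type):
    """Return True if a request with this query_type should still be counted."""
    return query_type not in EXCLUDED_QUERY_TYPES


def filter_requests(requests):
    """Drop requests whose query_type is in the exclusion set.

    Assumes each request is a dict with a "query_type" key.
    """
    return [r for r in requests if keep_request(r["query_type"])]
```

Once the undecided rows are settled, they could be added to (or kept out of) the same set without touching the filtering logic.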
Additionally:
ebernhardson: bearloga: yea on looking, the most resilient way would probably be to treat anything with hitstotal: -1 (in hive you can use array_contains(requests.hitstotal, -1)) as "unknown", could have hits or not. This looks like it would filter out index writes (send_data_write) and cached more_like
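The suggestion above could look roughly like this outside Hive, assuming `hitstotal` arrives as a list of integers (one per request in the set, mirroring `requests.hitstotal`). The function name `classify_hits` and the labels "some" and "zero" are hypothetical. Treating an empty list as "unknown" and any positive count as "some" are also assumptions, not something the comment specifies.

```python
def classify_hits(hitstotal):
    """Classify a request set by its per-request hit counts.

    hitstotal: list of ints. A value of -1 means the hit count is not known,
    so the whole set is treated as "unknown" (Python analogue of
    array_contains(requests.hitstotal, -1) in Hive).
    """
    # Assumption: an empty list carries no information, so call it "unknown".
    if not hitstotal or -1 in hitstotal:
        return "unknown"
    # Assumption: any request with at least one hit makes the set "some".
    if any(h > 0 for h in hitstotal):
        return "some"
    return "zero"
```

Rows classified as "unknown" could then be left out of zero-results calculations rather than counted as either outcome.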