
elk7: fields indexed without position data; cannot run PhraseQuery
Open, Medium | Public | BUG REPORT

Description

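In T247014 an updated template was deployed that set "index_options": "docs" on many fields. As a side effect, some visualizations now appear to hit shard failures with the error "field was indexed without position data; cannot run PhraseQuery". With "index_options": "docs", only document numbers are indexed for a field, so the term positions that phrase queries rely on are not available.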
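To illustrate why the setting matters, here is a minimal Python sketch of the rule involved. It is a simplified model and not Elasticsearch code: the function name supports_phrase_query and the sample mappings are hypothetical, and it only considers text fields, where index_options defaults to "positions" when omitted.

```python
# Valid index_options values, ordered from least to most information indexed.
INDEX_OPTIONS = ["docs", "freqs", "positions", "offsets"]


def supports_phrase_query(field_mapping: dict) -> bool:
    """Return True if a text field mapping indexes term positions.

    Simplified model (assumption): phrase queries need at least "positions",
    and a text field without an explicit index_options uses "positions".
    """
    option = field_mapping.get("index_options", "positions")
    return INDEX_OPTIONS.index(option) >= INDEX_OPTIONS.index("positions")


# Hypothetical mappings for a field such as "message".
with_docs = {"type": "text", "index_options": "docs"}
with_default = {"type": "text"}

print(supports_phrase_query(with_docs))     # False
print(supports_phrase_query(with_default))  # True
```

Under this model, any text field mapped with "docs" or "freqs" would reject phrase queries, which matches the errors listed in this task.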

This is occurring in the following dashboards:

DBQuery: "reason": "field:[message] was indexed without position data; cannot run PhraseQuery"
Filtered Fatal Monitor: "reason": "field:[message] was indexed without position data; cannot run PhraseQuery"
Network Overview: "reason": "field:[normalized_message] was indexed without position data; cannot run PhraseQuery"
Gerrit: "reason": "field:[program] was indexed without position data; cannot run PhraseQuery"
Parsoid-PHP: "reason": "field:[channel] was indexed without position data; cannot run PhraseQuery"
Performance Team: "reason": "field:[exception.message] was indexed without position data; cannot run PhraseQuery"
Reading Web: "reason": "field:[message] was indexed without position data; cannot run PhraseQuery"
Varnish Fetch Errors: "reason": "field:[fetcherror] was indexed without position data; cannot run PhraseQuery"
Wikidata Query Service: "reason": "field:[program] was indexed without position data; cannot run PhraseQuery"
api-feature-usage: "reason": "field:[channel] was indexed without position data; cannot run PhraseQuery"
api-feature-usage-http: "reason": "field:[channel] was indexed without position data; cannot run PhraseQuery"
cassandra-ops: "reason": "field:[cluster] was indexed without position data; cannot run PhraseQuery"
cassandra-dev: "reason": "field:[cluster] was indexed without position data; cannot run PhraseQuery"
change-prop: "reason": "field:[normalized_message] was indexed without position data; cannot run PhraseQuery"
echostore: "reason": "field:[msg] was indexed without position data; cannot run PhraseQuery"
labweb: "reason": "field:[channel] was indexed without position data; cannot run PhraseQuery"
logged constraint checks: "reason": "field:[loggingMethod] was indexed without position data; cannot run PhraseQuery"
mediawiki: "reason": "field:[channel] was indexed without position data; cannot run PhraseQuery"
mediawiki-errors: "reason": "field:[exception.trace] was indexed without position data; cannot run PhraseQuery"
mediawiki-new-errors: "reason": "field:[exception.message] was indexed without position data; cannot run PhraseQuery"
parsoid: "reason": "field:[levelPath] was indexed without position data; cannot run PhraseQuery"
parsoid-tests: "reason": "field:[levelPath] was indexed without position data; cannot run PhraseQuery"
readinglists-restbase-errors: "reason": "field:[root_req_uri] was indexed without position data; cannot run PhraseQuery"
restbase-unknown-errors: "reason": "field:[message] was indexed without position data; cannot run PhraseQuery"
scap canary: "reason": "field:[message] was indexed without position data; cannot run PhraseQuery"
sessionstore: "reason": "field:[msg] was indexed without position data; cannot run PhraseQuery"
wikitech logins: "reason": "field:[normalized_message] was indexed without position data; cannot run PhraseQuery"
xxx Effie: "reason": "field:[message] was indexed without position data; cannot run PhraseQuery"

Event Timeline

Change 583112 had a related patch set uploaded (by Herron; owner: Herron):
[operations/puppet@production] elk7: remove index_options:docs from logstash v7 template

https://gerrit.wikimedia.org/r/583112

In my testing, simply removing instances of "index_options": "docs" from the logstash template addresses the issue. Please see https://gerrit.wikimedia.org/r/583112
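For reference, the kind of change described above can be sketched in Python as a recursive filter over a template loaded as a dict. This is only an illustration, not the content of the Gerrit patch: the function name strip_index_options_docs and the sample template layout are hypothetical, and the real logstash template has its own structure.

```python
def strip_index_options_docs(node):
    """Return a copy of a template with every "index_options": "docs" removed.

    Walks nested dicts and lists; all other keys and values are kept as-is.
    """
    if isinstance(node, dict):
        return {
            key: strip_index_options_docs(value)
            for key, value in node.items()
            if not (key == "index_options" and value == "docs")
        }
    if isinstance(node, list):
        return [strip_index_options_docs(item) for item in node]
    return node


# Hypothetical template fragment, for illustration only.
template = {
    "mappings": {
        "properties": {
            "message": {"type": "text", "index_options": "docs"},
            "channel": {"type": "text", "index_options": "docs"},
            "level": {"type": "keyword"},
        }
    }
}

cleaned = strip_index_options_docs(template)
print(cleaned["mappings"]["properties"]["message"])  # {'type': 'text'}
```

The function builds a new structure rather than editing in place, so the original template stays available for comparison. Indices that were already created from the old template would keep their existing mappings; only indices created afterwards would pick up the change.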

Change 583112 merged by Herron:
[operations/puppet@production] elk7: remove index_options:docs from logstash v7 template

https://gerrit.wikimedia.org/r/583112

jcrespo triaged this task as Medium priority. Apr 2 2020, 1:48 PM
jcrespo added a subscriber: jcrespo.

Feel free to update the priority (and assign the task to yourself). This is just a guess; I am only triaging so that untriaged ops tasks do not go unnoticed.