Description
This is a follow-up to T234564: Logstash discards messages from MediaWiki if they contain uncommon keys in the $context array. It is specifically about getting alerted when we hit Elasticsearch's per-index field limits, which in turn usually indicates a "fields explosion" problem.
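For illustration, the per-index field limit corresponds to Elasticsearch's `index.mapping.total_fields.limit` setting, which defaults to 1000. The following is a minimal sketch of how one could estimate how close a mapping is to that limit. The helper names, the 90% warning ratio, and the example mapping are hypothetical, and the count is only an approximation of what Elasticsearch itself counts.

```python
# Elasticsearch default for index.mapping.total_fields.limit.
DEFAULT_TOTAL_FIELDS_LIMIT = 1000


def count_mapped_fields(mapping):
    """Approximate the number of fields in an Elasticsearch-style mapping dict.

    Counts each property (leaf or object), its multi-fields, and any
    nested object properties. Hypothetical helper, not an Elasticsearch API.
    """
    total = 0
    for definition in mapping.get("properties", {}).values():
        total += 1  # the field or object itself
        total += len(definition.get("fields", {}))  # multi-fields, e.g. .keyword
        total += count_mapped_fields(definition)  # properties of nested objects
    return total


def near_field_limit(mapping, limit=DEFAULT_TOTAL_FIELDS_LIMIT, ratio=0.9):
    """Return True when the field count reaches `ratio` of the limit.

    The 0.9 ratio is an assumed warning margin, not a value from this task.
    """
    return count_mapped_fields(mapping) >= limit * ratio


# Hypothetical mapping: uncommon $context keys each add a new field.
example_mapping = {
    "properties": {
        "message": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
        "context": {
            "properties": {
                "user": {"type": "keyword"},
                "rare_key_1": {"type": "keyword"},
            }
        },
    }
}
```

Every previously unseen key under an object such as `context` adds to the count, which is how a "fields explosion" eventually reaches the limit.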
Related Objects
- Mentioned In
  - T150106: Type collisions in log events causing indexing failures in ELK Elasticsearch
  - T238344: MediaWiki Math invalid JSON in logs on Restbase server error
  - T234564: Logstash discards messages from MediaWiki if they contain uncommon keys in the $context array
- Mentioned Here
  - T238196: Logging fields conflicts (tracking)
  - T234564: Logstash discards messages from MediaWiki if they contain uncommon keys in the $context array
Event Timeline
Change 548280 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] mtail: add logstash program
Change 548281 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] profile: add mtail to logstash
Change 548280 merged by Filippo Giunchedi:
[operations/puppet@production] mtail: add logstash program
Change 548281 merged by Filippo Giunchedi:
[operations/puppet@production] profile: add mtail to logstash
Change 548975 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] hieradata: fix mtail::logs location for logstash role
Change 548975 merged by Filippo Giunchedi:
[operations/puppet@production] hieradata: fix mtail::logs location for logstash role
Change 550446 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] prometheus: collect logstash mtail metrics
Change 550446 merged by Filippo Giunchedi:
[operations/puppet@production] prometheus: collect logstash mtail metrics
Change 550471 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] logstash: alert on indexing failures
Change 550640 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] mtail: export logstash ES index failure details
Change 550640 merged by Filippo Giunchedi:
[operations/puppet@production] mtail: export logstash ES index failure details
Change 550471 merged by Filippo Giunchedi:
[operations/puppet@production] logstash: alert on indexing failures
This is completed; surges of indexing errors will now result in an alert. Unfortunately, the thresholds are a little higher than I expected because of the background noise of errors/conflicts (tracked in T238196: Logging fields conflicts (tracking)).
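As a rough illustration of why the threshold has to sit above the background noise, here is a minimal sketch of a rate-based check over a failure counter. The actual alert is implemented in Prometheus on top of the mtail metrics from the changes above; the function names, sample values, and threshold below are hypothetical and not taken from the merged patches.

```python
def failure_rate(samples):
    """Per-second failure rate between the first and last counter sample.

    `samples` is a list of (timestamp_seconds, counter_value) tuples.
    A counter reset (value going down) is treated as a rate of zero.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        return 0.0
    return max(v1 - v0, 0) / (t1 - t0)


def should_alert(samples, threshold_per_second):
    """Alert only when the observed rate exceeds the threshold."""
    return failure_rate(samples) > threshold_per_second


# Hypothetical samples: steady background noise versus a surge.
background = [(0, 100), (60, 160)]  # 60 failures in 60 seconds
surge = [(0, 100), (60, 400)]       # 300 failures in 60 seconds
```

With a threshold set below the background rate, the check would fire constantly, so the threshold can only be lowered as the underlying conflicts are fixed.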
Change 550678 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] logstash: move ingestion alerts to be site-local
Change 550678 merged by Filippo Giunchedi:
[operations/puppet@production] logstash: move ingestion alerts to be site-local
Change 552492 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] prometheus: lower threshold for logstash indexing failures
Change 552492 merged by Filippo Giunchedi:
[operations/puppet@production] prometheus: lower threshold for logstash indexing failures