
Alert on no (or "few") logs indexed (was: No logs ingested in logstash7 since 2020-07-06 19:23)
Closed, Resolved, Public

Description

If no logs are being indexed, we should alert.

It looks like after I90d4951d0 the logstash7 cluster can no longer pull logs from Kafka:

  Plugin: <LogStash::Inputs::Kafka codec=><LogStash::Codecs::JSON id=>"json_4db6c2df-604a-4d27-9bca-4a3c3a6126bf", enable_metric=>true, charset=>"UTF-8">, group_id=>"logstash7-eqiad", topics=>["eqiad.eventgate-logging-external.error.validation"], ssl_truststore_location=>"/etc/logstash/kafka_logging-eqiad.truststore.jks", ssl_truststore_password=><password>, consumer_threads=>3, security_protocol=>"SSL", id=>"input/kafka/eventgate-logging-external-validation-error-eqiad", type=>"eventgate_validation_error", bootstrap_servers=>"logstash1010.eqiad.wmnet:9093,logstash1011.eqiad.wmnet:9093,logstash1012.eqiad.wmnet:9093", tags=>["input-kafka-eventgate-logging-external-validation-error-eqiad", "kafka", "es", "eventgate"], enable_metric=>true, auto_commit_interval_ms=>"5000", client_id=>"logstash", enable_auto_commit=>"true", key_deserializer_class=>"org.apache.kafka.common.serialization.StringDeserializer", value_deserializer_class=>"org.apache.kafka.common.serialization.StringDeserializer", poll_timeout_ms=>100, ssl_endpoint_identification_algorithm=>"https", sasl_mechanism=>"GSSAPI", decorate_events=>false>
  Error: SSL handshake failed
  Exception: Java::OrgApacheKafkaCommonErrors::SslAuthenticationException
  Stack: 
[2020-07-07T09:40:09,503][ERROR][logstash.javapipeline    ] A plugin had an unrecoverable error. Will restart this plugin.
  Pipeline_id:main

Event Timeline

Change 610008 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] logstash: fix kafka input ssl configuration for eventgate validation errors

https://gerrit.wikimedia.org/r/610008

Change 610008 merged by Filippo Giunchedi:
[operations/puppet@production] logstash: fix kafka input ssl configuration for eventgate validation errors

https://gerrit.wikimedia.org/r/610008

This is fixed now. However, no alerts fired while no logs were being ingested, so I'll take over the task to fix that too.

fgiunchedi renamed this task from No logs ingested in logstash7 since 2020-07-06 19:23 to Alert on no (or "few") logs indexed (was: No logs ingested in logstash7 since 2020-07-06 19:23). Jul 7 2020, 1:19 PM
fgiunchedi updated the task description.

Change 615164 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] profile: add alert on no logs ingested

https://gerrit.wikimedia.org/r/615164

Change 615164 merged by Filippo Giunchedi:
[operations/puppet@production] profile: add alert on no logs ingested

https://gerrit.wikimedia.org/r/615164

fgiunchedi claimed this task.

All done! We're alerting on no/low logs ingested into ES.
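For readers wondering what a "no/low logs ingested" check can look like, here is a minimal sketch. It is not the alert added in change 615164, whose implementation is not shown in this task. It assumes we can periodically sample a cumulative count of indexed documents; the names `Sample`, `indexing_rate`, `check_ingestion` and the threshold values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Sample:
    """One observation of a cumulative 'documents indexed' counter (hypothetical)."""
    timestamp: float   # seconds since some epoch
    docs_indexed: int  # cumulative number of documents indexed so far


def indexing_rate(samples: Sequence[Sample]) -> Optional[float]:
    """Average documents/second between the first and last sample.

    Returns None when the rate cannot be computed: fewer than two samples,
    no elapsed time, or a counter that went backwards (e.g. after a restart).
    """
    if len(samples) < 2:
        return None
    first, last = samples[0], samples[-1]
    elapsed = last.timestamp - first.timestamp
    delta = last.docs_indexed - first.docs_indexed
    if elapsed <= 0 or delta < 0:
        return None
    return delta / elapsed


def check_ingestion(samples: Sequence[Sample],
                    warn_below: float = 10.0,
                    crit_below: float = 1.0) -> str:
    """Map the observed rate to a status string.

    The thresholds are illustrative assumptions, not production values.
    """
    rate = indexing_rate(samples)
    if rate is None:
        return "UNKNOWN"
    if rate < crit_below:
        return "CRITICAL"
    if rate < warn_below:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    # A stalled pipeline: the counter did not move over five minutes.
    stalled = [Sample(0.0, 1000), Sample(300.0, 1000)]
    print(check_ingestion(stalled))
```

Checking a rate over a window rather than a single reading is what lets the same check cover both the "no logs" and the "few logs" cases from the task title.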