Logstash/Kibana architecture review
Closed, Resolved (Public)

Description

Review Logstash/Kibana's architecture and installation and identify next steps and gaps to be addressed.

Event Timeline

herron triaged this task as Medium priority. Jul 5 2018, 5:39 PM

On Friday the 6th we met with Infrastructure Foundations and former Logstash maintainers (e.g. Search, Bryan Davis) to go over the current architecture and its pain points. Notes are at https://etherpad.wikimedia.org/p/logstash-sre-q1-fy2018-2019 and I'm summarizing them below:

  1. One of the biggest hurdles is the explosion of indices caused by Logstash mapping every JSON field found in logs (T180051)
  2. ApiFeatureUsage uses Logstash, and its logs end up in the CirrusSearch ES cluster instead
  3. The upgrade to ES 6 will remove mapping types; we'll need to put some thought into how to address that
  4. The current architecture is four years old; nowadays we'd probably put a queue such as Kafka in it
  5. ApiFeatureUsage has some tech debt: it can block because it outputs directly to the CirrusSearch ES cluster, and it would need a buffer in between instead
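To illustrate the buffering idea in points 4 and 5, here is a minimal sketch of a bounded buffer sitting between a log producer and a slow output. It is not the actual ApiFeatureUsage or Logstash code: the `BufferedOutput` class, its method names, and the drop-on-full policy are all assumptions made for illustration. A real deployment would more likely use a durable queue such as Kafka than an in-memory one.

```python
import queue


class BufferedOutput:
    """Hypothetical bounded buffer between a log producer and a slow sink."""

    def __init__(self, maxsize):
        # Bounded so a stalled sink cannot grow memory without limit.
        self._queue = queue.Queue(maxsize=maxsize)
        self.dropped = 0

    def submit(self, event):
        """Enqueue without blocking; count and drop the event if the buffer is full."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self, sink, limit=None):
        """Deliver up to `limit` buffered events to `sink`; return how many were sent."""
        sent = 0
        while limit is None or sent < limit:
            try:
                event = self._queue.get_nowait()
            except queue.Empty:
                break
            sink(event)
            sent += 1
        return sent
```

The producer only ever calls `submit`, which returns immediately, so a slow or unavailable Elasticsearch cluster delays `drain` but never the code emitting the logs. Dropping on overflow is one possible policy; a durable queue trades that loss for disk usage.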

Non-exhaustive list of things that we'll need to address:

For Logstash in particular, we'll also need to set a suitable `id` on each plugin so that the different components can be told apart in the monitoring API (e.g. https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html#plugins-filters-grok-id)
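As a rough illustration of why explicit ids help, the sketch below summarizes per-plugin event counts from a node stats payload. It assumes the response shape of the Logstash node stats API (`pipelines` → pipeline name → `plugins` → `inputs`/`filters`/`outputs`, each entry carrying `id`, `name` and `events`). The helper name `events_by_plugin_id` is hypothetical, and any payload passed to it here would be invented sample data rather than output from a real instance.

```python
def events_by_plugin_id(node_stats):
    """Map (pipeline, plugin kind, plugin id) to the plugin's outgoing event count.

    `node_stats` is assumed to be the parsed JSON of a Logstash node stats
    response, e.g. from http://localhost:9600/_node/stats/pipelines
    (default monitoring port; adjust for the actual deployment).
    """
    result = {}
    for pipeline_name, pipeline in node_stats.get("pipelines", {}).items():
        plugins = pipeline.get("plugins", {})
        for kind in ("inputs", "filters", "outputs"):
            for plugin in plugins.get(kind, []):
                key = (pipeline_name, kind, plugin.get("id"))
                # Missing counters are treated as zero rather than an error.
                result[key] = plugin.get("events", {}).get("out", 0)
    return result
```

With an explicit `id` such as `grok_syslog` set in the plugin configuration, the key in this summary is stable and readable; without one, Logstash generates an id, which makes it hard to tell two grok filters apart.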

fgiunchedi claimed this task.

The architecture and gaps review has been carried out as part of the logging infrastructure design document.