
config file change canarying for logstash
Closed, Resolved, Public

Description

It's possible for Puppet to deploy a bad configuration file and crash logstash everywhere. Oof.

There is a hypothetically easy fix, running logstash --config.test_and_exit, but it does not catch every error that causes a crash on startup.

Event Timeline

colewhite triaged this task as Medium priority. Apr 16 2019, 3:39 PM

Change 587704 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] logstash: validate config files

https://gerrit.wikimedia.org/r/587704

Change 587705 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] logstash: log safepoints only when running the daemon

https://gerrit.wikimedia.org/r/587705

Change 587704 merged by Filippo Giunchedi:
[operations/puppet@production] logstash: validate config files

https://gerrit.wikimedia.org/r/587704

Puppet now asks Logstash to validate individual config files via validate_cmd before installing them. If validation fails (e.g. the syntax is invalid), the Puppet run fails and the new config file is not installed. Likely Good Enough™ for now as a band-aid against production Logstash outages. Ideally we would also test files at CI/review time to catch mistakes earlier.
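
For readers unfamiliar with the pattern, here is a rough Python sketch of what validate-before-install amounts to. It is not the code from the patch: the function name install_if_valid and the Logstash binary path are assumptions for illustration, and the exact command used in production is not shown in this task. The "%" placeholder mirrors how Puppet's validate_cmd refers to the candidate file.

```python
import os
import subprocess
import tempfile

# Hypothetical validation command; the binary path is an assumption and the
# real command used by the Puppet patch may differ. "%" stands for the
# candidate file, as in Puppet's validate_cmd.
LOGSTASH_CHECK = [
    "/usr/share/logstash/bin/logstash",
    "--config.test_and_exit",
    "-f",
    "%",
]


def install_if_valid(content, dest, validate_cmd):
    """Install content at dest only if validate_cmd accepts it.

    The content is first written to a temporary file next to dest. Every "%"
    argument in validate_cmd is replaced by that temporary path. The existing
    file at dest is left untouched when validation fails.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest))
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".conf")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(content)
        cmd = [tmp_path if arg == "%" else arg for arg in validate_cmd]
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            # Validation failed: report it and keep the old config in place.
            return False
        # Validation passed: move the new file into place.
        os.replace(tmp_path, dest)
        return True
    finally:
        # Clean up the temporary file if it was not moved into place.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

Calling install_if_valid(new_text, path, LOGSTASH_CHECK) would return False and leave the old file alone whenever Logstash rejects the candidate. As the task description notes, this only guards against errors that --config.test_and_exit can detect.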

Change 587705 merged by Filippo Giunchedi:
[operations/puppet@production] logstash: log safepoints only when running the daemon

https://gerrit.wikimedia.org/r/587705

fgiunchedi claimed this task.

I'm going to boldly resolve this: we're testing Logstash config at Puppet run time now, which is meant to at least prevent Logstash from starting with a syntactically invalid configuration.