The Logstash filters are currently not tested for regressions, so a change to a filter can easily introduce a regression that goes unnoticed.
I have run into https://github.com/magnusbaeck/logstash-filter-verifier and it seems promising: test cases can be specified in YAML or JSON in an "input line" + "expected output" format.
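For illustration, a test case file might look roughly like the sketch below. The key names (`fields`, `testcases`, `input`, `expected`) follow the layout described in the project's README as far as I can tell, and should be checked against whichever version we end up packaging. The log line, the `type` value and the expected fields are made up and do not correspond to any existing filter of ours.

```json
{
  "fields": {
    "type": "syslog"
  },
  "testcases": [
    {
      "input": [
        "Oct  6 20:55:29 myhost myprogram[31993]: This is a test message"
      ],
      "expected": [
        {
          "@timestamp": "2015-10-06T20:55:29.000Z",
          "host": "myhost",
          "message": "This is a test message",
          "pid": 31993,
          "program": "myprogram",
          "type": "syslog"
        }
      ]
    }
  ]
}
```

Each entry in `input` is fed through the filters under test, and the resulting event is compared with the matching entry in `expected`, so a filter change that alters any of these fields would show up as a failing test.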
Things to do:
- Get consensus on https://gerrit.wikimedia.org/r/c/operations/puppet/+/594460
- Get CI to run the tests
- Build a Debian package of logstash-filter-verifier
- Add the Debian package and Logstash to a Docker image (Puppet's? another?) for CI to run
- Document how to write tests at https://wikitech.wikimedia.org/wiki/Logstash#Writing_&_testing_filters
- Ensure we have tests covering enough of the existing crucial filters