A message that is not valid UTF-8 (or that mixes encodings) causes the Logstash main thread to crash. systemd dutifully restarts Logstash each time, but the crash repeats in a loop because the message remains at the same offset in Kafka when the pipeline starts up again.
Evidenced in:
[ERROR][logstash.pipeline] Exception in pipelineworker, the pipeline stopped processing new events, please check your filter configuration and restart Logstash. { "exception"=>"invalid byte sequence in UTF-8", "backtrace"=> [ "org/jruby/RubyString.java:6007:in `rstrip!'", "org/jruby/RubyString.java:6094:in `strip!'", "org/jruby/RubyString.java:6085:in `strip'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-filter-mutate-3.3.4/lib/logstash/filters/mutate.rb:496:in `strip'", "org/jruby/RubyArray.java:1613:in `each'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-filter-mutate-3.3.4/lib/logstash/filters/mutate.rb:490:in `strip'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-filter-mutate-3.3.4/lib/logstash/filters/mutate.rb:257:in `filter'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:145:in `do_filter'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:164:in `multi_filter'", "org/jruby/RubyArray.java:1613:in `each'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:161:in `multi_filter'", "/usr/share/logstash/logstash-core/lib/logstash/filter_delegator.rb:46:in `multi_filter'", "(eval):57894:in `initialize'", "org/jruby/RubyArray.java:1613:in `each'", "(eval):57891:in `initialize'", "org/jruby/RubyProc.java:281:in `call'", "(eval):58037:in `initialize'", "org/jruby/RubyArray.java:1613:in `each'", "(eval):58029:in `initialize'", "org/jruby/RubyProc.java:281:in `call'", "(eval):5886:in `filter_func'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:398:in `filter_batch'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:379:in `worker_loop'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:342:in `start_workers'" ] }
@bd808 indicated that the error is probably coming from MediaWiki database logging for wikis where the encoding is not UTF-8 (or UTF-8 compatible).
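To illustrate the failure mode and one possible guard, here is a minimal Ruby sketch. It is not the fix adopted for this task and not code from Logstash or logstash-filter-mutate: `sanitize_utf8` is a hypothetical helper, and the sample string is made up. It checks `valid_encoding?` before any `strip` call and replaces invalid bytes by round-tripping through UTF-16LE. `String#scrub` would be shorter, but it was only added in Ruby 2.1, so the sketch avoids it in case the runtime is older.

```ruby
# Hypothetical helper, not part of Logstash or logstash-filter-mutate.
# Returns a copy of +value+ that is valid UTF-8, so that String#strip
# cannot fail with "invalid byte sequence in UTF-8".
def sanitize_utf8(value, replacement = "?")
  # Leave non-string field values (numbers, nil, hashes) untouched.
  return value unless value.is_a?(String)

  # Work on a copy and treat its bytes as UTF-8, whatever tag it arrived with.
  text = value.dup.force_encoding(Encoding::UTF_8)
  return text if text.valid_encoding?

  # Transcoding to UTF-16LE forces Ruby to inspect every byte sequence;
  # invalid ones are swapped for the replacement, then we convert back.
  text.encode(Encoding::UTF_16LE,
              invalid: :replace, undef: :replace, replace: replacement)
      .encode(Encoding::UTF_8)
end

# Made-up sample: a Latin-1 "e acute" byte (0xE9) inside a string tagged UTF-8.
bad = "caf\xE9 latte  ".dup.force_encoding(Encoding::UTF_8)

puts bad.valid_encoding?        # the raw message is not valid UTF-8
puts sanitize_utf8(bad).strip   # strip is safe once the bytes are repaired
```

Replacing bytes loses information, so an alternative design would be to tag or drop the offending event instead of rewriting its fields; either way the pipeline worker gets past the message rather than replaying it from Kafka.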
This task is complete when the pipeline no longer crashes during MediaWiki outages.