Page MenuHomePhabricator

Logstash pipeline crashes on non-UTF8 log messages.
Open, MediumPublic

Description

A non-UTF8-encoded (or mixed encoding) message causes the Logstash main thread to crash. SystemD dutifully restarts Logstash when it happens, but it occurs on a loop as the message is remains at that offset in Kafka when the pipeline starts up again.

Evidenced in:

[ERROR][logstash.pipeline] Exception in pipelineworker, the pipeline stopped processing new events, please check your filter configuration and restart Logstash. 
{
    "exception"=>"invalid byte sequence in UTF-8",
    "backtrace"=> [
        "org/jruby/RubyString.java:6007:in `rstrip!'",
        "org/jruby/RubyString.java:6094:in `strip!'",
        "org/jruby/RubyString.java:6085:in `strip'",
        "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-filter-mutate-3.3.4/lib/logstash/filters/mutate.rb:496:in `strip'",
        "org/jruby/RubyArray.java:1613:in `each'",
        "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-filter-mutate-3.3.4/lib/logstash/filters/mutate.rb:490:in `strip'", 
        "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-filter-mutate-3.3.4/lib/logstash/filters/mutate.rb:257:in `filter'",
        "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:145:in `do_filter'",
        "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:164:in `multi_filter'",
        "org/jruby/RubyArray.java:1613:in `each'",
        "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:161:in `multi_filter'",
        "/usr/share/logstash/logstash-core/lib/logstash/filter_delegator.rb:46:in `multi_filter'",
        "(eval):57894:in `initialize'",
        "org/jruby/RubyArray.java:1613:in `each'",
        "(eval):57891:in `initialize'",
        "org/jruby/RubyProc.java:281:in `call'",
        "(eval):58037:in `initialize'",
        "org/jruby/RubyArray.java:1613:in `each'",
        "(eval):58029:in `initialize'",
        "org/jruby/RubyProc.java:281:in `call'",
        "(eval):5886:in `filter_func'",
        "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:398:in `filter_batch'",
        "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:379:in `worker_loop'",
        "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:342:in `start_workers'"
    ]
}

@bd808 indicated that the error is probably coming from MediaWiki database logging for wikis where the encoding is not UTF-8 (or UTF-8 compatible).

This task is complete when the pipeline no longer crashes at MediaWiki outages.

Details

Related Gerrit Patches:

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptSep 23 2019, 8:21 PM

Change 538642 had a related patch set uploaded (by Cwhite; owner: Cwhite):
[operations/puppet@production] profile: add mmutf8fix to kafka output actions

https://gerrit.wikimedia.org/r/538642

There are a few options to consider.

herron triaged this task as Medium priority.Sep 24 2019, 2:50 PM
herron added a subscriber: herron.

IMO mitigation in the logging pipeline is where we should focus, as there are varying applications that could produce malformed messages. Layering safeguards at both the "shipper" and within logstash makes sense to me. Mmutf8fix looks like a good near-term option for the shipper.

One other aspect to think about is splitting pipelines by input/output, so that a logstash pipeline crash wouldn't stop processing of all inputs/outputs. Newer versions provide this via https://www.elastic.co/guide/en/logstash/6.8/multiple-pipelines.html

Change 538642 merged by Cwhite:
[operations/puppet@production] profile: add mmutf8fix to kafka output actions

https://gerrit.wikimedia.org/r/538642