Page MenuHomePhabricator does not receive any mediawiki events
Open, HighPublic

Description is mostly empty.

Looking for events from hosts deployment-logstash2 or deployment-logstash03 , the logstash process there started failing around 2019-08-16 13:25

The first error I found:

[2019-08-16T13:26:00,667][ERROR][logstash.plugins.registry] Problems loading a plugin with

 {:type=>"filter", :name=>"truncate", :path=>"logstash/filters/truncate", :error_message=>"NameError", :error_class=>NameError, :error_backtrace=>["/usr/share/logstash/logstash-core/lib/logstash/plugins/registry.rb:226:in `namespace_lookup'",
"/usr/share/logstash/logstash-core/lib/logstash/plugins/registry.rb:162:in `legacy_lookup'",
"/usr/share/logstash/logstash-core/lib/logstash/plugins/registry.rb:138:in `lookup'",
"/usr/share/logstash/logstash-core/lib/logstash/plugins/registry.rb:180:in `lookup_pipeline_plugin'",
"/usr/share/logstash/logstash-core/lib/logstash/plugin.rb:140:in `lookup'",
"/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:103:in `plugin'",
"(eval):157:in `initialize'",
"org/jruby/ `eval'",
"/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:75:in `initialize'",
"/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:165:in `initialize'",
"/usr/share/logstash/logstash-core/lib/logstash/agent.rb:296:in `create_pipeline'",
"/usr/share/logstash/logstash-core/lib/logstash/agent.rb:95:in `register_pipeline'",
"/usr/share/logstash/logstash-core/lib/logstash/runner.rb:313:in `execute'",
"/usr/share/logstash/vendor/bundle/jruby/1.9/gems/clamp-0.6.5/lib/clamp/command.rb:67:in `run'",
"/usr/share/logstash/logstash-core/lib/logstash/runner.rb:204:in `run'",
"/usr/share/logstash/vendor/bundle/jruby/1.9/gems/clamp-0.6.5/lib/clamp/command.rb:132:in `run'",
"/usr/share/logstash/lib/bootstrap/environment.rb:71:in `(root)'"]}

For recent errors:

[2019-09-17T17:24:33,329][INFO ][logstash.inputs.udp      ] Starting UDP listener {:address=>""}
[2019-09-17T17:24:28,325][ERROR][logstash.inputs.udp      ] UDP listener died {
:exception=>#<Errno::EADDRINUSE: Address already in use - bind - Address already in use>,
"org/jruby/ext/socket/ `bind'",
"/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-udp-3.3.4/lib/logstash/inputs/udp.rb:116:in `udp_listener'",
"/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-udp-3.3.4/lib/logstash/inputs/udp.rb:68:in `run'",
"/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:470:in `inputworker'",
"/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:463:in `start_input'"]

rsyslog is already listening on

deployment-logstash2 is the instance serving

Puppet classes:

role::logstash::collector prometheus_nodes: hiera('prometheus_nodes', [])
role::kibana auth_type: None; vhost: None; auth_realm: undef; auth_file: undef; ldap_authurl: undef; ldap_binddn: undef; require_ssl: true; ldap_groups: []; serveradmin: None


elasticsearch::graylog_hosts: none
elasticsearch::version: 5
  auto_create_index: +logstash-*,-*
  - deployment-logstash2.deployment-prep.eqiad.wmflabs
  cluster_name: labs-logstash-eqiad
  expected_nodes: 1
  heap_memory: 5G
  minimum_master_nodes: 1
  recover_after_nodes: 1
  recover_after_time: 1m
  short_cluster_name: labs-logstash-eqiad
  - deployment-logstash2.deployment-prep.eqiad.wmflabs
profile::kafka::broker::kafka_cluster_name: logging-eqiad
profile::kafka::broker::ssl_enabled: true
profile::kafka::broker::ssl_password: qwerty
profile::logstash::collector::jmx_exporter_port: 7801
puppetmaster: deployment-puppetmaster03.deployment-prep.eqiad.wmflabs

Event Timeline

hashar created this task.Sep 17 2019, 5:26 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptSep 17 2019, 5:26 PM
hashar added subscribers: herron, Krenair.

So the logstash input for udp tries to bind on 11514 but rsyslogd is already listening there.

# netstat -ulnp|grep :11514
udp        0      0*                           9929/rsyslogd       
udp        0      0*                           9929/rsyslogd       
udp6       0      0 ::1:11514               :::*                                9929/rsyslogd       
udp6       0      0 ::1:11514               :::*                                9929/rsyslogd
grep -R 11514 /etc/rsyslog.d/
/etc/rsyslog.d/50-udp-json-logback-compat.conf:input(type="imudp" port="11514" address="localhost" ruleset="ruleset_udp_json_logback_compat")
/etc/rsyslog.d/50-udp-localhost-compat.conf:input(type="imudp" port="11514" address="localhost" ruleset="udp_localhost_to_kafka")

And the same port is used by logstash:

/etc/logstash/conf.d/10-input-udp-logback.conf:    port => 11514
/etc/logstash/conf.d/10-input-udp-logback.conf:    id => "input/udp/11514"
/etc/logstash/conf.d/10-input-udp-logback.conf:    tags => ["input-udp-logback-11514", "deprecated-input"]
/etc/logstash/conf.d/10-input-tcp-json_lines.conf:    port => 11514
/etc/logstash/conf.d/10-input-tcp-json_lines.conf:        id => "input/tcp/11514"
/etc/logstash/conf.d/10-input-tcp-json_lines.conf:    tags => ["input-tcp-json_lines-11514", "deprecated-input"]

Which are meant to relay to kafka? The config files lead me to 23e26e9ca39911305e8af033450f1964a3ade4f5 "logstash: add udp json logback localhost compatibility endpoint" and T213899.

So maybe @herron would know more as to why deployment-logstash03 (Horizon) is broken or maybe @Krenair who created it in May 2019.

Note that deployment-logstash2 is in a similar shape.

The details of where we got to with this logstash03 instance are in the thread from T218729#5153739, I'm slightly surprised this host is

In fact according to openstack-browser, still points at deployment-logstash2 and not deployment-logstash03 at all?

I had not checked which instances was used for One sure thing they are both broken in the same way due to rsyslog / logstash udp input conflicting on port 11514 :]

awight added a subscriber: awight.Sep 18 2019, 9:04 AM
hashar updated the task description. (Show Details)Tue, Sep 24, 3:25 PM

Mentioned in SAL (#wikimedia-releng) [2019-09-24T15:27:36Z] <hashar> deployment-logstash2: removing class profile::elastalert , does not exist and breaks puppet # T233134

On deployment-logstash2 I have killed rsyslogd to let logstash listens on UDP port 11514.

rsyslog listens on that port due to config files:


Which I guess should replace the logstash input which is configured by:


I have no idea how those files are provisioned, sounds to me one of the role applied to the instance should be removed, or maybe the logstash one does not purge old config files.

That is all I get for now. Leaving the instance with logstash listening on 11514.

hashar triaged this task as High priority.Tue, Sep 24, 3:43 PM

My two cents: since deployment-logstash03 has been setup in T218729: Migrate deployment-prep away from Debian Jessie to Debian Stretch/Buster as the 02 stretch replacement, my suggestion would be to work to bring up 03 and ditch 02 in the process

Change 540189 had a related patch set uploaded (by Herron; owner: Herron):
[operations/puppet@production] elasticsearch: add toggle for rsyslog udp logback compat include

Change 540189 abandoned by Herron:
elasticsearch: add toggle for rsyslog udp logback compat include

implemented in a preferred way with I278486fb96878a7a45a608d785625d7db74e9953

herron closed this task as Resolved.Wed, Oct 2, 3:49 PM
herron claimed this task.

This occurred in prod as well after host reboots, and a fix has been deployed in puppet. The fix moves include ::profile::rsyslog::udp_json_logback_compat from profile::elasticsearch to profile::elasticsearch::cirrus which prevents the rsyslog udp 11514 listener from being deployed to logstash collectors where it conflicts with logstash. I think we're in good shape here now, setting this to resolved.

hashar reopened this task as Open.Wed, Oct 2, 5:33 PM

Reopening since: