
[toolforge.infra] Provide centralized logging (logstash) for Toolforge
Open, Medium, Public

Description

One log to rule them all.

It would be good to have logstash for at least the tools-ops logs, which include:

  • basic system logging (dmesg / syslog)
  • all the mails that now end up in my inbox ;-)
  • infra logging (puppet, apt, diamond)
  • mail (exim on tools-mailrelay)
  • SGE (what kind of logs do we have there?)
  • bigbrother actions
  • Redis?
  • ssh (also useful to help people with issues logging in)

Most important in practice are the logs that relate to warnings from shinken-wm, which include:

  • puppet staleness/failures
  • ssh

and the issues we get mails about, which are:

  • sge
  • apt
  • raid??
  • exim paniclog

Event Timeline

valhallasw raised the priority of this task from to Medium.
valhallasw updated the task description.
valhallasw added a project: Toolforge.
valhallasw added subscribers: coren, scfc, Aklapper, yuvipanda.

10:09 valhallasw`cloud: created toolsbeta-logstash to play around with logstash and figure out what we need for tools (phab:T97861)
10:25 valhallasw`cloud: set Hiera variable "elasticsearch::cluster_name": toolsbeta-logstash-eqiad
10:30 valhallasw`cloud: pulled new changes into puppetmaster to get https://github.com/wikimedia/operations-puppet/commit/4afd23d8e2905a84ef211ad92e8314173eb743ba in
10:37 valhallasw`cloud: that doesn't seem to be applied... setting has_ganglia: false manually in wikitech hiera
11:11 valhallasw`cloud: commenting out include ::elasticsearch::ganglia in role::logstash seems to work. I think we have to write our own tools logstash roles anyway in the end, as the role::logstash code contains e.g. mediawiki specific code

Unfortunately, logstash doesn't actually start and crashes with

Errno::EBADF: Bad file descriptor - Bad file descriptor
          close at org/jruby/RubyIO.java:2097
        connect at /opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:173
           each at org/jruby/RubyArray.java:1613
        connect at /opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:139
        connect at /opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:406
           call at org/jruby/RubyProc.java:271
          fetch at /opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/pool.rb:48
        connect at /opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:403
        execute at /opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:319
           get! at /opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:217
       register at /opt/logstash/lib/logstash/outputs/elasticsearch_http.rb:117
           each at org/jruby/RubyArray.java:1613
   outputworker at /opt/logstash/lib/logstash/pipeline.rb:220
  start_outputs at /opt/logstash/lib/logstash/pipeline.rb:152

This was because elasticsearch wasn't started. With elasticsearch running, logstash starts, but that doesn't actually give us an interface yet...
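
For anyone hitting the same EBADF trace: a minimal sketch of a pre-flight check that could be run before starting logstash, to confirm elasticsearch is accepting connections. It assumes elasticsearch listens on localhost at its default HTTP port 9200; the function name and defaults are hypothetical and not part of any puppet role mentioned here.

```python
import socket


def elasticsearch_is_reachable(host: str = "localhost",
                               port: int = 9200,
                               timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: elasticsearch's HTTP endpoint is on host:port (9200 is the
    elasticsearch default). This only checks that something is listening,
    not that the cluster is healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    if elasticsearch_is_reachable():
        print("elasticsearch is accepting connections; logstash can be started")
    else:
        print("elasticsearch is not reachable; start it before logstash")
```

This is only a TCP-level check, so it would catch the "elasticsearch wasn't started" case but not a cluster that is up and unhealthy.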

fgiunchedi renamed this task from Provide centralized logging (logstash) to Provide centralized logging (logstash) for Toolforge. Oct 1 2018, 1:13 PM
fgiunchedi removed a project: Cloud-Services.
fgiunchedi subscribed.

Unlinking from T198756 as Toolforge is out of scope for the current goals, though the design/implementation can be equally applied to Toolforge as well.

Random note: at this point, one of the only open-source multitenant solutions for this kind of thing seems to be https://grafana.com/docs/loki/latest/overview/

Re-opening this as it isn't really a duplicate. Instead, both this and the other one should be under another task.

lmata moved this task from Radar to Inbox on the observability board.
lmata subscribed.

Hello, is there something for us (o11y) here, or should we just stay in the loop for potential collaboration? Subscribing and putting this on our radar for now.

This is something to discuss and potentially collaborate on. I'll follow up with you.

dcaro renamed this task from Provide centralized logging (logstash) for Toolforge to [toolforge.infra] Provide centralized logging (logstash) for Toolforge. Feb 21 2024, 10:20 AM
dcaro reopened this task as Open.
dcaro added subscribers: yuvipanda, EBernhardson.