
Provide centralized logging (logstash) for Toolforge
Open, Medium, Public


One log to rule them all.

It would be good to have logstash for at least tools-ops logs, which include:

  • basic system logging (dmesg / syslog)
  • all the mails that now end up in my inbox ;-)
  • infra logging (puppet, apt, diamond)
  • mail (exim on tools-mailrelay)
  • SGE (what kind of logs do we have there?)
  • bigbrother actions
  • Redis?
  • ssh (also useful to help people with issues logging in)

Most important in practice would be the logs that relate to warnings from shinken-wm, which include:

  • puppet staleness/failures
  • ssh

and the issues we get mails about, which are:

  • sge
  • apt
  • raid??
  • exim paniclog

Event Timeline

valhallasw raised the priority of this task from to Medium.
valhallasw updated the task description.
valhallasw added a project: Toolforge.
valhallasw added subscribers: coren, scfc, Aklapper, yuvipanda.

10:09 valhallasw`cloud: created toolsbeta-logstash to play around with logstash and figure out what we need for tools (phab:T97861)
10:25 valhallasw`cloud: set Hiera variable "elasticsearch::cluster_name": toolsbeta-logstash-eqiad
10:30 valhallasw`cloud: pulled new changes into puppetmaster to get in
10:37 valhallasw`cloud: that doesn't seem to be applied... setting has_ganglia: false manually in wikitech hiera
11:11 valhallasw`cloud: commenting out include ::elasticsearch::ganglia in role::logstash seems to work. I think we have to write our own tools logstash roles anyway in the end, as the role::logstash code contains e.g. mediawiki specific code

Unfortunately, logstash doesn't actually start; it crashes with:

Errno::EBADF: Bad file descriptor - Bad file descriptor
          close at org/jruby/
        connect at /opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:173
           each at org/jruby/
        connect at /opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:139
        connect at /opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:406
           call at org/jruby/
          fetch at /opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/pool.rb:48
        connect at /opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:403
        execute at /opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:319
           get! at /opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:217
       register at /opt/logstash/lib/logstash/outputs/elasticsearch_http.rb:117
           each at org/jruby/
   outputworker at /opt/logstash/lib/logstash/pipeline.rb:220
  start_outputs at /opt/logstash/lib/logstash/pipeline.rb:152

That turned out to be because elasticsearch wasn't started. OK, with elasticsearch running, logstash starts, but that doesn't actually give us an interface yet...
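
Since the crash above came from logstash's elasticsearch_http output trying to connect before elasticsearch was up, a small pre-flight check could make the ordering explicit. The following is only a sketch, not something from the puppet roles: `wait_for_port` is a hypothetical helper, and the host and port to pass in (for example localhost and 9200, elasticsearch's default HTTP port) are assumptions about the setup.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to (host, port) succeeds.

    Hypothetical helper: retries until `timeout` seconds have passed,
    then gives up and returns False.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused, timed out, or host unreachable: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling `wait_for_port("localhost", 9200)` before starting logstash would return False instead of letting logstash die with the EBADF trace. It only checks that the port accepts connections, not that the cluster is healthy.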

fgiunchedi renamed this task from Provide centralized logging (logstash) to Provide centralized logging (logstash) for Toolforge. Oct 1 2018, 1:13 PM
fgiunchedi removed a project: Cloud-Services.
fgiunchedi added a subscriber: fgiunchedi.

Unlinking from T198756 as Toolforge is out of scope for the current goals, though the design/implementation can be equally applied to Toolforge as well.

Random note that, at this point, one of the only multitenant solutions for this kind of thing that is open source seems to be

Re-opening this as it isn't really a duplicate. Instead, both this and the other one should be under another task.

lmata moved this task from Radar to Inbox on the observability board.
lmata added a subscriber: lmata.

Hello, is there something for us (o11y) here, or should we just stay in the loop for potential collaboration? Subscribing and moving to radar for now.

This is something to discuss and potentially collaborate on. I'll follow up with you.