
Systemd unit for logstash
Closed, Resolved (Public)

Description

The upstream package we use does not include a systemd unit file. The automatically generated unit file does not restart the service when it crashes.

$ systemctl cat logstash.service
# /run/systemd/generator.late/logstash.service
# Automatically generated by systemd-sysv-generator

[Unit]
SourcePath=/etc/init.d/logstash
Description=LSB: Starts Logstash as a daemon.
Before=runlevel2.target runlevel3.target runlevel4.target runlevel5.target shutd
After=remote-fs.target systemd-journald-dev-log.socket
Conflicts=shutdown.target

[Service]
Type=forking
Restart=no
TimeoutSec=5min
IgnoreSIGPIPE=no
KillMode=process
GuessMainPID=no
RemainAfterExit=yes
SysVStartPriority=3
ExecStart=/etc/init.d/logstash start
ExecStop=/etc/init.d/logstash stop

Event Timeline

bd808 triaged this task as High priority. Feb 22 2016, 2:48 AM

Mentioned in SAL [2016-02-25T00:57:55Z] <bd808> Started crashed Logstash process on logstash1002 (systemd doesn't restart automatically due to T127677)

Maybe some nice soul from SRE can give me tips on how to make a custom unit file to put in our Puppet config?

I wrote a systemd unit based on the current init script. It's totally untested, though!
https://phabricator.wikimedia.org/P2671

Some notes:

  • In the sysvinit script, stdout and stderr were redirected to log files. That is intentionally not supported by the systemd directives: https://lists.freedesktop.org/archives/systemd-devel/2012-March/004703.html Instead, the output is sent to the journal here (readable with journalctl), which is quite convenient. But we can also retain the old behaviour by amending the ExecStart line.
  • The sysv init script nices the logstash daemon to 19. We can do the same with "Nice=19", but I left it out since it seemed like cargo cult. Is that really needed?
  • systemd has some nice features to restrict the logstash daemon (e.g. in case the process gets compromised). We can add these once the basic service is running fine.
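
For illustration only, the notes above could translate into a unit along these lines. This is a rough sketch, not the content of P2671: the file location, the user and group, the ExecStart path and arguments, and the RestartSec value are all assumptions that would need to be checked against the installed package.

```ini
# Hypothetical /etc/systemd/system/logstash.service
# Paths, user/group and command-line arguments below are assumptions.
[Unit]
Description=Logstash
After=network.target

[Service]
Type=simple
User=logstash
Group=logstash
# Run in the foreground so systemd tracks the main process itself.
ExecStart=/opt/logstash/bin/logstash agent -f /etc/logstash/conf.d
# Restart after a crash or non-zero exit; a clean stop is left alone.
Restart=on-failure
RestartSec=30
# stdout and stderr go to the journal; stated explicitly for clarity.
StandardOutput=journal
StandardError=journal
# Nice=19 would mirror the sysv init script; left out as noted above.

[Install]
WantedBy=multi-user.target
```

With a native unit like this, systemd supervises the process directly instead of going through /etc/init.d/logstash, which is what allows Restart= to take effect.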

I wrote a systemd unit based on the current init script. It's totally untested, though!
https://phabricator.wikimedia.org/P2671

Cool! Let's test it out in the beta cluster install. We can do that by turning your paste into a proper Puppet patch and cherry-picking that on deployment-puppetmaster.

Some notes:

As long as we can review the log output, I can live without the /var/log/logstash files. I think something like journalctl -f -u logstash would cover most of what I've done with the on-disk logs from Logstash.

  • The sysv init script nices the logstash daemon to 19. We can do the same with "Nice=19", but I left it out since it seemed like cargo cult. Is that really needed?

I bet that is in the default scripts for installs where Logstash is used as a log shipper on a host that is primarily doing some other work. For our use cases I don't see the point in renicing.

  • systemd has some nice features to restrict the logstash daemon (e.g. in case the process gets compromised). We can add these once the basic service is running fine.

Sounds like a good plan. Implement, inspect and adapt. :)
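
As a possible later step, the restrictions mentioned above could be layered on with a drop-in file. The sketch below is only an assumption about which directives might fit: the file name is hypothetical, and each setting would need testing against what Logstash actually reads and writes.

```ini
# Hypothetical drop-in: /etc/systemd/system/logstash.service.d/hardening.conf
# Which of these Logstash tolerates is untested; treat each line as a candidate.
[Service]
# The process and its children cannot gain new privileges (e.g. via setuid).
NoNewPrivileges=true
# Give the service its own private /tmp and /var/tmp.
PrivateTmp=true
# Mount /usr, /boot and /etc read-only for the service.
ProtectSystem=full
# Make /home, /root and /run/user inaccessible.
ProtectHome=true
# Hide physical devices, leaving only pseudo devices such as /dev/null.
PrivateDevices=true
```

Keeping these in a separate drop-in means they can be added or reverted without touching the basic unit.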

OK, I'll turn this into a proper Gerrit patch.

Two Logstash hosts (logstash1001 and logstash1002) crashed almost simultaneously today with errors like:

Error: Your application used more memory than the safety cap of 500M.
Specify -J-Xmx####m to increase it (#### = cap size in MB).
Specify -w for full OutOfMemoryError stack trace

MoritzMuehlenhoff renamed this task from 'Auto generated Logstash unit file has "Restart=no"' to 'Systemd unit for logstash'. Mar 24 2016, 11:21 AM

This has been running on logstash100[1-3] for a few hours now. I've also sent a pull request to github.com/elastic/logstash (it seems they have a CLA, though).