
Add monitoring for detecting when logstash services are down
Closed, Invalid (Public)

Description

See T141776: Parsoid's service-runner events not showing up in Kibana since 2016-07-28 21:00 UTC. I first noticed this on Saturday (30 July, noon CT), but thought it affected only Parsoid.

It looks like some monitoring of the logstash service would be useful, so that we catch logstash service downtime when it happens.

Event Timeline

ssastry renamed this task from Add monitoring for ensuring logstash services are operational to Add monitoring for detecting when logstash services are down. Aug 1 2016, 4:17 PM

We have monitoring for the service proper, but what seems to have happened several times is that the Java process gets hung up in a GC cycle and stops processing events.
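
A process-level check cannot see that failure mode, because the process is still running. One way to catch it is to alert on the age of the newest event that made it through the pipeline. The following is only a minimal sketch of that idea, not the check that is actually deployed: the function name, the five-minute threshold, and the way `last_event` is obtained (for example, the newest `@timestamp` in the index) are all assumptions for illustration. The return codes follow the usual Nagios/Icinga plugin convention.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

# Conventional Nagios/Icinga plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def check_event_freshness(
    last_event: Optional[datetime],
    now: datetime,
    max_age: timedelta = timedelta(minutes=5),  # hypothetical threshold
) -> Tuple[int, str]:
    """Classify pipeline health from the timestamp of the newest event.

    last_event is the timestamp of the most recent event seen downstream
    of logstash (how it is fetched is left out; that part is an assumption).
    """
    if last_event is None:
        return UNKNOWN, "UNKNOWN: no events found"
    age = now - last_event
    if age > max_age:
        # The process may still be up, but nothing is coming through.
        return CRITICAL, f"CRITICAL: newest event is {int(age.total_seconds())}s old"
    return OK, f"OK: newest event is {int(age.total_seconds())}s old"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Simulate a pipeline whose last event arrived 30 minutes ago.
    code, message = check_event_freshness(now - timedelta(minutes=30), now)
    print(code, message)
```

Because the check looks at output rather than at the process, it does not matter whether the stall comes from a GC cycle, a crash, or a blocked input.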

akosiaris triaged this task as Medium priority. Aug 11 2016, 2:33 PM
fgiunchedi subscribed.

I don't think we've seen a recurrence of this. Also, logstash now has monitoring for UDP packet loss, which I'm assuming would also show up if the logstash services are down.
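
The thread does not say how the UDP packet loss monitoring is implemented. As a rough illustration of why such a signal could also reveal a stalled logstash, here is a sketch that reads the kernel's UDP counters in the `/proc/net/snmp` format on Linux. The helper names are hypothetical. The assumption is that a receiver that stops reading its socket lets the receive buffer fill, so `InErrors` and `RcvbufErrors` grow, while a receiver that is not listening at all shows up under `NoPorts` instead.

```python
from typing import Dict

WATCHED = ("NoPorts", "InErrors", "RcvbufErrors")


def parse_udp_counters(snmp_text: str) -> Dict[str, int]:
    """Parse the 'Udp:' header/value line pair from /proc/net/snmp content."""
    header = None
    for line in snmp_text.splitlines():
        if not line.startswith("Udp:"):
            continue
        fields = line.split()[1:]
        if header is None:
            header = fields  # first Udp: line carries the column names
        else:
            return dict(zip(header, (int(v) for v in fields)))
    raise ValueError("no Udp counters found")


def udp_drops_between(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return how much each drop-related counter grew between two samples."""
    return {name: after[name] - before[name] for name in WATCHED}


if __name__ == "__main__":
    import time

    with open("/proc/net/snmp") as f:
        first = parse_udp_counters(f.read())
    time.sleep(10)  # hypothetical sampling interval
    with open("/proc/net/snmp") as f:
        second = parse_udp_counters(f.read())
    print(udp_drops_between(first, second))
```

These counters are host-wide rather than per-socket, so on a host with other UDP traffic a rise only hints at logstash being the cause.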