The last message on logstash was at 2019-06-27T01:09:29 and the next one at 2019-06-27T06:35:13. These times go by @timestamp from logstash; the dt field on the lagged messages is correct (i.e. around 01:10).
Last 5xx:
@timestamp     2019-06-27T01:09:41 dt     2019-06-27T01:09:29
Then a single message at
@timestamp     2019-06-27T02:45:58 dt     2019-06-27T01:09:32
Then resumed at
@timestamp     2019-06-27T06:35:13 dt     2019-06-27T01:10:22
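To put numbers on the lag, here is a minimal sketch that subtracts dt from @timestamp for the three messages above. The labels and the helper name lag_seconds are made up for illustration; the timestamps are copied from the messages.

```python
from datetime import datetime

# (label, @timestamp, dt) copied from the three messages above.
# The labels are illustrative only.
samples = [
    ("last 5xx", "2019-06-27T01:09:41", "2019-06-27T01:09:29"),
    ("single message", "2019-06-27T02:45:58", "2019-06-27T01:09:32"),
    ("resumed", "2019-06-27T06:35:13", "2019-06-27T01:10:22"),
]

FMT = "%Y-%m-%dT%H:%M:%S"


def lag_seconds(ingested: str, emitted: str) -> int:
    """Seconds between the event time (dt) and the logstash @timestamp."""
    delta = datetime.strptime(ingested, FMT) - datetime.strptime(emitted, FMT)
    return int(delta.total_seconds())


lags = {label: lag_seconds(ts, dt) for label, ts, dt in samples}

for label, lag in lags.items():
    print(f"{label}: {lag}s")
```

The delivery lag goes from 12 seconds on the last 5xx to roughly 1h36m on the single message and roughly 5h25m once delivery resumed.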
Other than the fact that rsyslog didn't recover on its own, there are also these points we'll need to address:
- rsyslog omfwd to central syslog failed, and omkafka delivery failed along with it; the two outputs should fail independently of each other
- rsyslog omfwd actions should be given explicit names so that we have metrics for them (e.g. action_failed / action_processed)
Starting with https://gerrit.wikimedia.org/r/c/operations/puppet/+/520012