
PoC alert/notification functionality with Elastic Stack
Open, Normal, Public

Description

As referred to in T123243 and T211700, there has been talk for some time of looking into https://github.com/Yelp/elastalert (or alternatives?) for alerting and correlation of logs (this is mentioned in the logging design doc as well). One of the ideas here is that this replaces the work done in T208611 (which will make @Volans very happy).

I'm going to try to workshop this out a bit in the logging cloud project and then possibly move demo functionality to deployment-prep depending on how things go.

Event Timeline

chasemp triaged this task as Normal priority. Jan 16 2019, 3:25 PM
chasemp created this task.
Restricted Application added a subscriber: Aklapper. Jan 16 2019, 3:25 PM
EBjune added a subscriber: Gehel. Jan 16 2019, 4:24 PM
chasemp added a project: Restricted Project. Apr 1 2019, 7:33 PM

Change 502773 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] WIP elastalert module

https://gerrit.wikimedia.org/r/502773

Change 503014 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] aptrepo: add component/elastalert

https://gerrit.wikimedia.org/r/503014

chasemp reassigned this task from chasemp to fgiunchedi. Apr 11 2019, 7:29 PM

Reassigning to reflect the reality of Filippo's awesomeness

Change 503014 merged by Filippo Giunchedi:
[operations/puppet@production] aptrepo: add component/elastalert

https://gerrit.wikimedia.org/r/503014

fgiunchedi moved this task from Backlog to Doing on the User-fgiunchedi board. Apr 23 2019, 12:28 PM

Change 505762 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] WIP elastalert: enable on logstash1007

https://gerrit.wikimedia.org/r/505762

Elastalert is running on deployment-logstash2 now. I had to fiddle with it a little because the instance is jessie (cf. T218729), but other than that it'll work like in production (i.e. as it would with https://gerrit.wikimedia.org/r/c/operations/puppet/+/505762 and https://gerrit.wikimedia.org/r/c/operations/puppet/+/502773 merged in production, as opposed to cherry-picked on the deployment-prep puppetmaster).

Rules are only on the host itself for experimentation purposes. For the first iteration we'll have the rules in private.git, and possibly in the future in a separate rules git repository (private in the sense of gerrit access) to enable self-service.

The service name is elastalert@security and config / rules live in /etc/elastalert/security. I left a badpass.yaml example file, feel free to change/tweak as needed! cc @Dsharpe and let me know how we can help!
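
For anyone picking this up, a rule file dropped into /etc/elastalert/security might look roughly like the sketch below. To be clear, this is not the contents of badpass.yaml: the rule name, index pattern, query string and thresholds are all made up for illustration. Only the option names (name, type, index, num_events, timeframe, filter, alert) are standard ElastAlert rule options.

```yaml
# Hypothetical example rule, NOT the actual badpass.yaml on the host.
# Fires when at least num_events matching documents are seen within timeframe.
name: example-failed-logins        # assumed name; must be unique per rule
type: frequency
index: logstash-*                  # assumed index pattern
num_events: 10                     # illustrative threshold
timeframe:
  minutes: 5
filter:
- query:
    query_string:
      query: 'message: "Failed password"'   # assumed query, adjust to the real field/message
alert:
- debug                            # only logs matches; swap in a real alerter once the rule looks sane
```

The debug alerter only logs matches rather than notifying anyone, which seems like a reasonable default while experimenting on the host.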

fgiunchedi moved this task from Doing to Radar on the User-fgiunchedi board. May 13 2019, 8:56 AM