Right now the only way to monitor the health of a service is to manually write a new check and add it to Icinga via Puppet, and we need to do that for each endpoint of every service. We want a general, simple way of automating this process.
Ideally:
- Developers should be able to programmatically expose what they want to be monitored, and what the expected responses will be
- Said resources should be monitored explicitly and independently, without the need for any ops intervention or Puppet changes
The way to do this is to have all services automatically expose a special endpoint, like /_monitor, with a simple JSON description of all relevant endpoints: which call we should make to monitor each one, what payload to use, and what response to expect.
A possible format for the JSON could be:
{
    "/path/to/endpoint": {
        "method": "GET",
        "request": {
            "headers": {
                "header1": "value1",
                "headerN": "valueN"
            },
            "body": {}
        },
        "response": {
            "status": 200,
            "headers": {
                "header1": "value1",
                ...
            },
            "body": "/foo.*bar$/"
        }
    },
    ...
}

All of the fields are optional: by default the monitoring system will check the URL with a GET request and expect a 200 response code.
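
To make the defaulting rule concrete, here is a minimal Python sketch of how a monitoring system might fill in the optional fields and compare an actual response against the expected one. It is not an existing implementation: the function names `normalize_check` and `find_mismatches` are hypothetical, and it assumes that a "body" string wrapped in slashes (as in "/foo.*bar$/") is a regular expression, while any other string must match the body exactly.

```python
import re

# Defaults taken from the proposal: a GET request, expecting HTTP 200.
DEFAULT_METHOD = "GET"
DEFAULT_STATUS = 200


def normalize_check(spec):
    """Fill in defaults for one endpoint entry of the /_monitor document."""
    request = spec.get("request", {})
    response = spec.get("response", {})
    return {
        "method": spec.get("method", DEFAULT_METHOD),
        "request": {
            "headers": request.get("headers", {}),
            "body": request.get("body"),
        },
        "response": {
            "status": response.get("status", DEFAULT_STATUS),
            "headers": response.get("headers", {}),
            "body": response.get("body"),
        },
    }


def find_mismatches(expected, status, headers, body):
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    if status != expected["status"]:
        problems.append("status %r != %r" % (status, expected["status"]))

    # HTTP header names are case-insensitive, so compare them lowercased.
    actual_headers = {name.lower(): value for name, value in headers.items()}
    for name, value in expected["headers"].items():
        if actual_headers.get(name.lower()) != value:
            problems.append("header %s != %r" % (name, value))

    pattern = expected["body"]
    if pattern is not None:
        # Assumption: "/.../" marks a regular expression, anything else
        # is compared literally against the whole body.
        if len(pattern) > 1 and pattern.startswith("/") and pattern.endswith("/"):
            if re.search(pattern[1:-1], body) is None:
                problems.append("body does not match %s" % pattern)
        elif body != pattern:
            problems.append("body differs from expected text")
    return problems
```

With this sketch, an empty entry such as `"/path/to/endpoint": {}` normalizes to a GET request that expects a 200 status, and `find_mismatches` only looks at headers and body when the service declares them.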