This is an umbrella task to track what was discussed in the design documents, with the subsequent implementation plan:
- pt-heartbeat + scaffolding
  - Create the Prometheus HTTP exporter scaffolding
  - Implement the custom pt-heartbeat monitoring
  - Create the related alert rule(s)
- seconds_behind_master + threads (replication/io)
  - Add parsing of the `SHOW SLAVE STATUS;` output
  - Create the related prometheus-node-exporter alert rule(s)
- memory pressure
  - *Implement custom memory monitoring if needed*
  - Create the related alert rule(s)
- disk pressure
  - *Implement custom disk monitoring if needed*
  - Create the related prometheus-node-exporter alert rule(s)
- read-only status
  - *Implement custom query if needed*
  - Create the related mysqld-exporter alert rule(s)
- process monitoring
  - *Implement custom system probe if needed*
  - Create the related systemd unit alert rule(s)
- MariaDB errors
  - Decide on feature parity → this will cost time, as it requires custom designs and falls outside the "canonical" Prometheus use case
  - This task will take a bit longer than the others, as it requires a bit more testing
  - Implement a proof of concept of error message passing
  - Productionize the POC
  - Create the related alert rule(s)
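
As a rough illustration of the `SHOW SLAVE STATUS` parsing step, here is a minimal sketch that turns the vertical (`\G`) client output into lines in the Prometheus text exposition format. The metric names (`mariadb_replication_thread_running`, `mariadb_replication_seconds_behind_master`), the function names and the sample input are assumptions made for this example, not something decided in the design documents. The real exporter may well query the server directly instead of parsing text.

```python
from typing import Dict, List

# Hypothetical sample in the vertical (\G) layout printed by the mysql client.
SAMPLE = """\
*************************** 1. row ***************************
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
        Seconds_Behind_Master: NULL
"""


def parse_slave_status(text: str) -> Dict[str, str]:
    """Map each 'Field: value' line to a dict entry; other lines are skipped."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        # Split on the first colon only, since values may contain colons too.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def to_metrics(fields: Dict[str, str]) -> List[str]:
    """Render the replication fields as Prometheus text exposition lines."""
    lines: List[str] = []
    for thread in ("IO", "SQL"):
        running = 1 if fields.get(f"Slave_{thread}_Running") == "Yes" else 0
        lines.append(
            f'mariadb_replication_thread_running{{thread="{thread.lower()}"}} {running}'
        )
    # Seconds_Behind_Master is NULL while the SQL thread is stopped; in that
    # case no sample is emitted, so an alert rule can test for its absence.
    lag = fields.get("Seconds_Behind_Master")
    if lag is not None and lag.isdigit():
        lines.append(f"mariadb_replication_seconds_behind_master {int(lag)}")
    return lines


if __name__ == "__main__":
    print("\n".join(to_metrics(parse_slave_status(SAMPLE))))
```

Leaving the lag sample out when the value is `NULL`, rather than exporting a placeholder such as `-1`, is one possible design choice: it keeps a stopped SQL thread from looking like zero lag, at the cost of needing a separate rule on the thread state.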