MariaDB monitoring transition out of icinga
Closed, Duplicate (Public)

Description

This umbrella task tracks what was discussed in the design documents, along with the implementation plan that follows:

  1. pt-heartbeat + scaffolding
    • Create the Prometheus HTTP exporter scaffolding
    • Implement the custom pt-heartbeat monitoring
    • Create the related alert rule(s)
  2. seconds_behind_master + threads (replication/io)
    • Add parsing of the SHOW SLAVE STATUS; output
    • Create the related prometheus-node-exporter alert rule(s)
  3. memory pressure
    • *Implement custom memory monitoring if needed*
    • Create the related alert rule(s)
  4. disk pressure
    • *Implement custom disk monitoring if needed*
    • Create the related prometheus-node-exporter alert rule(s)
  5. read only status
    • *Implement custom query if needed*
    • Create the related mysqld-exporter alert rule(s)
  6. process monitoring
    • *Implement custom system probe if needed*
    • Create the related systemd unit alert rule(s)
  7. MariaDB errors
    • Decide on feature parity → this will take time, as it requires custom designs and falls outside the "canonical" Prometheus use case
    • This task will take longer than the others, as it requires more testing
      • Implement a proof of concept of error message passing
      • Productionize the POC
    • Create the related alert rule(s)
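
As a rough illustration of the parsing mentioned in step 2, here is a minimal sketch. It assumes the exporter receives the vertical (\G) output of SHOW SLAVE STATUS as plain text and exposes the two replication threads plus Seconds_Behind_Master in the Prometheus text format. The function names, the metric names, and the choice to export NaN for a NULL lag are all hypothetical and are not taken from the design documents.

```python
# Sketch only: metric names and helper names below are illustrative assumptions.
METRIC_PREFIX = "mariadb_replication"  # hypothetical prefix


def parse_slave_status(raw: str) -> dict[str, str]:
    """Parse the vertical output of SHOW SLAVE STATUS into a field -> value dict."""
    status = {}
    for line in raw.splitlines():
        # Skip blank lines and the "*** 1. row ***" separator.
        if ":" not in line or line.lstrip().startswith("*"):
            continue
        # Split on the first colon only, so values containing ':' stay intact.
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status


def render_metrics(status: dict[str, str]) -> str:
    """Render thread state and lag as Prometheus text-format sample lines."""
    lines = []
    for thread in ("IO", "SQL"):
        running = 1 if status.get(f"Slave_{thread}_Running") == "Yes" else 0
        lines.append(
            f'{METRIC_PREFIX}_thread_running{{thread="{thread.lower()}"}} {running}'
        )
    lag = status.get("Seconds_Behind_Master", "NULL")
    # Seconds_Behind_Master is NULL when replication is not running; this sketch
    # exports NaN so that an alert rule can tell that apart from zero lag.
    lag_value = lag if lag.isdigit() else "NaN"
    lines.append(f"{METRIC_PREFIX}_seconds_behind_master {lag_value}")
    return "\n".join(lines) + "\n"


# Trimmed sample input with only the fields used above.
SAMPLE = """\
*************************** 1. row ***************************
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
        Seconds_Behind_Master: NULL
"""

if __name__ == "__main__":
    print(render_metrics(parse_slave_status(SAMPLE)), end="")
```

Keeping parsing and rendering separate means the same parsed dictionary could later feed other checks in the plan, such as the thread alerts, without querying the server again.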