Monitor database hosts' disk performance
Closed, Resolved (Public)

Description

Via the new ganglia::diskstat.
Disk performance is obviously important for databases (isn't it?), and I remember Tim saying he missed a check he had added a while ago.


Version: unspecified
Severity: enhancement

Details

Reference
bz55406

Related Objects

Status | Subtype | Assigned | Task
Resolved | | None |
Resolved | | jcrespo |
Resolved | | hashar |

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 22 2014, 2:12 AM)
bzimport set Reference to bz55406.
bzimport added a subscriber: Unknown Object (MLST).
jcrespo claimed this task.
jcrespo added subscribers: LSobanski, jcrespo.

While there are no disk-specific alerts (other than RAID health checks, which should probably be enough), nearly complete I/O metrics are gathered, both on the MySQL Grafana dashboards: https://grafana.wikimedia.org/d/000000273/mysql?orgId=1 (disk latency, throughput in bytes, and IOPS) and on the host stats: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=db1083&var-datasource=thanos&var-cluster=mysql
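For readers unfamiliar with how these three numbers are usually derived, here is a rough sketch. It assumes the dashboards are fed by cumulative per-device counters of the kind node_exporter exposes (for example node_disk_reads_completed_total or node_disk_read_time_seconds_total); that is an assumption, not something confirmed in this task. The `DiskSample` class and `disk_rates` function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class DiskSample:
    """Cumulative counters for one device at one scrape (hypothetical container)."""
    timestamp: float           # seconds
    reads_completed: float     # cf. node_disk_reads_completed_total
    writes_completed: float    # cf. node_disk_writes_completed_total
    read_bytes: float          # cf. node_disk_read_bytes_total
    written_bytes: float       # cf. node_disk_written_bytes_total
    read_time_seconds: float   # cf. node_disk_read_time_seconds_total
    write_time_seconds: float  # cf. node_disk_write_time_seconds_total


def disk_rates(prev: DiskSample, curr: DiskSample) -> dict:
    """Derive IOPS, throughput and average latency from two samples.

    Counter resets (e.g. after a reboot) are not handled in this sketch.
    """
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")

    ops = (curr.reads_completed - prev.reads_completed) + (
        curr.writes_completed - prev.writes_completed
    )
    nbytes = (curr.read_bytes - prev.read_bytes) + (
        curr.written_bytes - prev.written_bytes
    )
    io_time = (curr.read_time_seconds - prev.read_time_seconds) + (
        curr.write_time_seconds - prev.write_time_seconds
    )

    return {
        "iops": ops / elapsed,
        "throughput_bytes_per_s": nbytes / elapsed,
        # Average time per completed operation; 0.0 when the disk was idle.
        "avg_latency_s": io_time / ops if ops else 0.0,
    }
```

Latency here is the time spent on I/O divided by the number of completed operations over the interval, which tends to say more about real-world problems than a utilization percentage does.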

It is true that the host disk stats are not great (disk utilization is not very useful for real-world problems), but that is just a presentation issue that I am not pushing to improve.

Given the vagueness of the ticket, I would consider this "Done" as of the point when Prometheus monitoring was implemented.