The Prometheus CPU graph is misleading for a machine running guests. CPU guest is summed with CPU user, although user already contains guest. This causes the guest time to be counted twice.
An example is an OpenStack compute node (but Ganeti hosts probably have the same issue):
https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=ganeti1005
The proc(5) documentation for /proc/stat does not mention that guest is included in user, but the documentation for /proc/[pid]/stat does:
*cutime* ... This includes guest time, cguest_time (time spent running a virtual CPU).
From Linux:
    void account_guest_time(struct task_struct *p, u64 cputime)
    {
        u64 *cpustat = kcpustat_this_cpu->cpustat;

        /* Add guest time to process. */
        p->utime += cputime;
        account_group_user_time(p, cputime);
        p->gtime += cputime;

        /* Add guest time to cpustat. */
        if (task_nice(p) > 0) {
            cpustat[CPUTIME_NICE] += cputime;
            cpustat[CPUTIME_GUEST_NICE] += cputime;
        } else {
            cpustat[CPUTIME_USER] += cputime;
            cpustat[CPUTIME_GUEST] += cputime;
        }
    }
I.e. guest time is added to both user and guest, and for niced tasks guest time is added to both nice and guest_nice.
Not sure what needs to change in the CPU graph queries. Maybe user and nice can be adjusted as:
- user = user - guest
- nice = nice - guest_nice
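
The adjustment above can be sketched in Python against the aggregate "cpu" line of /proc/stat. This is only an illustration of the arithmetic, not the actual dashboard query: the helper names parse_cpu_line and exclude_guest are hypothetical, the sample values are made up, and the line is assumed to carry all ten fields listed in proc(5) (including guest and guest_nice).

```python
# Field order of the aggregate "cpu" line in /proc/stat, per proc(5).
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice")


def parse_cpu_line(line):
    """Parse a 'cpu ...' line from /proc/stat into a dict of tick counts."""
    parts = line.split()
    if not parts or not parts[0].startswith("cpu"):
        raise ValueError("not a cpu line: %r" % line)
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    if len(values) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(values)))
    return dict(zip(FIELDS, values))


def exclude_guest(times):
    """Return a copy where user and nice no longer contain guest time."""
    adjusted = dict(times)
    adjusted["user"] = times["user"] - times["guest"]
    adjusted["nice"] = times["nice"] - times["guest_nice"]
    return adjusted


# Made-up sample values, not taken from a real host.
sample = "cpu 1000 200 300 5000 50 0 10 0 600 100"
raw = parse_cpu_line(sample)
fixed = exclude_guest(raw)

print(sum(raw.values()))    # 7260: guest and guest_nice are counted twice
print(sum(fixed.values()))  # 6560: each tick is counted once
```

With the adjusted values, stacking all ten modes gives the same total as the first eight raw fields, so guest and guest_nice can be shown as their own series without inflating the graph.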