'LVS connections' graph on Load Balancers dashboard takes a rate of a gauge
Closed, Resolved, Public

Description

The 'LVS connections eqiad' and similar panels on that dashboard take a rate of a metric that is a gauge: just the number of open connections to LVS backends at any given instant. A rate of a gauge isn't really a meaningful number, and it certainly isn't the number that the panel title suggests it is :)

The recording rule is defined here: https://gerrit.wikimedia.org/g/operations/puppet/+/production/modules/role/files/prometheus/rules_ops.yml#304

It should probably be an avg2m or a max2m instead of a rate5m?
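
To illustrate why a rate over a gauge is misleading, here is a rough Python sketch. It is not Prometheus's actual implementation: `simplified_rate` is a hypothetical helper that only mimics the counter-reset handling of rate() (any drop is treated as a reset to zero) and skips the extrapolation to window boundaries, and the sample values are made up.

```python
from typing import Sequence


def simplified_rate(samples: Sequence[float], window_seconds: float) -> float:
    """Per-second increase, counter-style: any drop is treated as a reset to zero.

    Simplified on purpose; the real rate() also extrapolates to the window edges.
    """
    increase = 0.0
    for prev, cur in zip(samples, samples[1:]):
        # A counter never decreases, so a lower value is assumed to be a restart
        # from zero and the whole current value is counted as new increase.
        increase += (cur - prev) if cur >= prev else cur
    return increase / window_seconds


def max_over_window(samples: Sequence[float]) -> float:
    """Highest gauge value seen in the window (what a max-style rule reports)."""
    return max(samples)


def avg_over_window(samples: Sequence[float]) -> float:
    """Mean gauge value in the window (what an avg-style rule reports)."""
    return sum(samples) / len(samples)


# Hypothetical open-connection counts, one sample per minute.
connections = [100.0, 120.0, 90.0, 110.0, 95.0]
window = 240.0  # seconds between the first and last sample

print(simplified_rate(connections, window))  # 0.9375, not a connection count
print(max_over_window(connections))          # 120.0
print(avg_over_window(connections))          # 103.0
```

With roughly 100 connections open throughout, the rate-style result is under 1 and depends mostly on how often the gauge happens to dip, whereas the max and average stay in the range of the actual number of open connections.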

Details

Related Gerrit Patches:
operations/puppet : production - prometheus: use max5m for node_ipvs gauges

Event Timeline

CDanis created this task. Oct 28 2019, 3:36 PM
Restricted Application added a project: Operations. Oct 28 2019, 3:36 PM
Restricted Application added a subscriber: Aklapper.
CDanis triaged this task as Medium priority. Oct 28 2019, 3:39 PM
ema moved this task from Triage to LoadBalancer on the Traffic board. Oct 28 2019, 3:40 PM

Change 552810 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] prometheus: use max5m for node_ipvs gauges

https://gerrit.wikimedia.org/r/552810

fgiunchedi moved this task from Inbox to In progress on the observability board. Nov 25 2019, 2:00 PM

Change 552810 merged by Filippo Giunchedi:
[operations/puppet@production] prometheus: use max5m for node_ipvs gauges

https://gerrit.wikimedia.org/r/552810

fgiunchedi closed this task as Resolved. Dec 5 2019, 10:35 AM
fgiunchedi claimed this task.
fgiunchedi added a subscriber: fgiunchedi.

Fixed now, and the 'load balancers' dashboard has been adjusted.