⚓ T184942 Deprecate python varnish cachestats

Subject	Repo	Branch	Lines +/-
varnish::logging: remove statsd_host and mtail_progs	operations/puppet	production	+1 -11
varnish: remove cachestats.py	operations/puppet	production	+1 -156
varnish: remove varnishreqstats and varnishstatsd	operations/puppet	production	+0 -385
varnish: fix varnishreqstats systemd::service usage	operations/puppet	production	+0 -1
varnish: ensure varnishreqstats is absent	operations/puppet	production	+8 -7
varnish: remove varnishreqstats-based alerts	operations/puppet	production	+0 -92
varnish: remove varnishstatsd	operations/puppet	production	+10 -8
monitoring: update icinga links to varnish-aggregate-client-status-codes	operations/puppet	production	+7 -7
grafana: update varnish-aggregate-client-status-codes to prometheus version	operations/puppet	production	+340 -443
grafana: remove legacy varnish-aggregate-client-status-codes	operations/puppet	production	+1 -570
varnishmedia: post-removal cleanup	operations/puppet	production	+0 -16
varnishmedia: remove python daemon	operations/puppet	production	+7 -95
varnishrls: post-removal cleanup	operations/puppet	production	+0 -16
varnishrls: remove python daemon	operations/puppet	production	+8 -119
prometheus: varnish_thumbnails aggregation rule	operations/puppet	production	+3 -0
prometheus: Add varnishrls aggregation rules	operations/puppet	production	+3 -0
mtail: Use a temporary variable for $cache_control	operations/puppet	production	+4 -3
mtail: Update a /w/load.php test case from a current varnishncsa sample	operations/puppet	production	+3 -2
mtail: Fix varnishrls regex	operations/puppet	production	+9 -9
varnish: varnishxcache post-removal cleanup	operations/puppet	production	+0 -16
varnish: Remove varnishxcache python daemon	operations/puppet	production	+7 -102
varnishxcps: post-removal cleanup	operations/puppet	production	+0 -16
varnishxcps: remove python daemon	operations/puppet	production	+9 -139
mtail: Provide ttfb histogram for varnishbackend	operations/puppet	production	+54 -4
prometheus: varnish_x_cache rate for the last 2m	operations/puppet	production	+1 -1
mtail: Add varnish_resourceloader_resp in varnishrls	operations/puppet	production	+21 -4
Docker: use new operations-puppet image	integration/config	master	+2 -2
Upgrade mtail in operations-puppet	integration/config	master	+9 -0
prometheus: calculate varnish requests daily/weekly averages	operations/puppet	production	+4 -0
prometheus: aggregate varnish_x_cache metrics	operations/puppet	production	+4 -0

Status	Assigned	Task
Resolved	fgiunchedi	T177195 Reduce technical debt in metrics monitoring
Resolved	fgiunchedi	T177199 Add Prometheus client support for varnish/statsd metrics daemons
Resolved	fgiunchedi	T220104 TEC6: Metrics monitoring infrastructure (Q4 2018/19 goal)
Resolved	fgiunchedi	T220116 Migrate all metrics originated by PoPs from statsd to Prometheus
Resolved	fgiunchedi	T184942 Deprecate python varnish cachestats
Resolved	Krinkle	T190978 Update ResourceLoader dashboard to query varnishrls data from Prometheus instead

Change 422381 merged by Vgutierrez:
[operations/puppet@production] mtail: Add varnish_resourceloader_resp in varnishrls

https://gerrit.wikimedia.org/r/422381

Thanks @Vgutierrez !

Krinkle mentioned this in T190978: Update ResourceLoader dashboard to query varnishrls data from Prometheus instead. Mar 28 2018, 11:17 PM

Change 422910 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] prometheus: varnish_x_cache rate for the last 2m

https://gerrit.wikimedia.org/r/422910

Change 422910 abandoned by Vgutierrez:
prometheus: varnish_x_cache rate for the last 2m

Reason:
not needed.. dashboard was using the wrong metric.

https://gerrit.wikimedia.org/r/422910

Change 422155 merged by Vgutierrez:
[operations/puppet@production] mtail: Provide ttfb histogram for varnishbackend

https://gerrit.wikimedia.org/r/422155

Change 421338 merged by Ema:
[operations/puppet@production] varnishxcps: remove python daemon

https://gerrit.wikimedia.org/r/421338

Change 423861 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] varnishxcps: remove nrpe::monitor_service

https://gerrit.wikimedia.org/r/423861

Change 423861 merged by Ema:
[operations/puppet@production] varnishxcps: post-removal cleanup

https://gerrit.wikimedia.org/r/423861

Change 424611 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] varnish: varnishxcache post-removal cleanup

https://gerrit.wikimedia.org/r/424611

Change 421925 merged by Vgutierrez:
[operations/puppet@production] varnish: Remove varnishxcache python daemon

https://gerrit.wikimedia.org/r/421925

Change 424611 merged by Vgutierrez:
[operations/puppet@production] varnish: varnishxcache post-removal cleanup

https://gerrit.wikimedia.org/r/424611

Change 429833 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] varnishmedia: remove python daemon

https://gerrit.wikimedia.org/r/429833

@Krinkle I've pushed https://gerrit.wikimedia.org/r/429833 to remove varnishmedia, my understanding is that there's only one dashboard currently using statsd data under media.thumbnail.varnish. We do have prometheus data that can be used to replace it. Thoughts?

Krinkle mentioned this in T193445: Update Media dashboard in Grafana to use Prometheus metrics. Apr 30 2018, 10:02 PM

Change 431528 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] prometheus: varnish_thumbnails aggregation rule

https://gerrit.wikimedia.org/r/431528

Change 431608 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/puppet@production] mtail: Add test case from current varnishncsa sample

https://gerrit.wikimedia.org/r/431608

@Vgutierrez @ema I'm working on using the Prometheus metrics for the ResourceLoader dashboards but running into an issue with the varnish_resourceloader_inm metrics. Its rate seems to be nearly the same as varnish_resourceloader_resp, which cannot be true (it tends to be around 25% of requests, based on statsd metrics as well as on manual samples from varnishlog I gathered).

Graphite	Prometheus

Change 431712 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] mtail: Fix varnishrls regex [WIP]

https://gerrit.wikimedia.org/r/431712

In T184942#4187502, @Krinkle wrote:

@Vgutierrez @ema I'm working on using the Prometheus metrics for the ResourceLoader dashboards but running into an issue with the varnish_resourceloader_inm metrics. Its rate seems to be nearly the same as varnish_resourceloader_resp, which cannot be true (it tends to be around 25% of requests, based on statsd metrics as well as on manual samples from varnishlog I gathered).

Graphite Prometheus

@Krinkle It looks like our regex to match inm was too weak and was capturing all the H2 and TLS info as the inm value. It should be fixed with https://gerrit.wikimedia.org/r/431712
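
A minimal sketch of the failure mode described above, written in Python rather than mtail. The log line layout, the field names, and both patterns are hypothetical and are not taken from the actual varnishrls program; they only illustrate how a greedy capture can swallow the fields that follow it, so that every request appears to carry an If-None-Match value.

```python
import re

# Hypothetical tab-separated log line: URL, If-None-Match, HTTP/2 flag, TLS info.
# The real varnishncsa format used by varnishrls may differ.
LINE = "url /w/load.php\tinm -\th2 1\ttls TLSv1.3"

# Too weak: ".*" is greedy and runs to the end of the line,
# so the H2 and TLS fields end up inside the captured value.
WEAK = re.compile(r"\tinm (?P<inm>.*)")

# Stricter: the captured value stops at the next tab.
STRICT = re.compile(r"\tinm (?P<inm>[^\t]*)")


def has_inm(pattern, line):
    """Return True if the captured If-None-Match value is neither empty nor '-'."""
    m = pattern.search(line)
    return bool(m) and m.group("inm") not in ("", "-")


weak_result = has_inm(WEAK, LINE)      # request wrongly counted as conditional
strict_result = has_inm(STRICT, LINE)  # request correctly counted as unconditional
print(weak_result, strict_result)
```

With the weak pattern a request without If-None-Match is still counted, which would make the inm rate track the total response rate.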

Change 431712 merged by Vgutierrez:
[operations/puppet@production] mtail: Fix varnishrls regex

https://gerrit.wikimedia.org/r/431712

Change 431608 abandoned by Krinkle:
mtail: Update a /w/load.php test case from a current varnishncsa sample

Reason:
Fixed by https://gerrit.wikimedia.org/r/#/c/431712/

https://gerrit.wikimedia.org/r/431608

Change 432090 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/puppet@production] prometheus: Add varnishrls aggregation rules

https://gerrit.wikimedia.org/r/432090

Change 432117 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/puppet@production] mtail: Use a temporary variable for $cache_control

https://gerrit.wikimedia.org/r/432117

Change 432117 abandoned by Krinkle:
mtail: Use a temporary variable for $cache_control

https://gerrit.wikimedia.org/r/432117

Change 432090 merged by Filippo Giunchedi:
[operations/puppet@production] prometheus: Add varnishrls aggregation rules

https://gerrit.wikimedia.org/r/432090

Change 431528 merged by Ema:
[operations/puppet@production] prometheus: varnish_thumbnails aggregation rule

https://gerrit.wikimedia.org/r/431528

Krinkle closed subtask T190978: Update ResourceLoader dashboard to query varnishrls data from Prometheus instead as Resolved. May 24 2018, 11:58 AM

@ema ResourceLoader dashboards in Grafana have been updated to use Prometheus for all Varnish metrics. The varnishrls daemon for Graphite may now be removed.

Change 435739 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] varnishrls: remove python daemon

https://gerrit.wikimedia.org/r/435739

Change 435739 merged by Ema:
[operations/puppet@production] varnishrls: remove python daemon

https://gerrit.wikimedia.org/r/435739

Change 435752 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] varnishrls: post-removal cleanup

https://gerrit.wikimedia.org/r/435752

Change 435752 merged by Ema:
[operations/puppet@production] varnishrls: post-removal cleanup

https://gerrit.wikimedia.org/r/435752

varnishrls removed, thanks @Krinkle.

Change 429833 merged by Ema:
[operations/puppet@production] varnishmedia: remove python daemon

https://gerrit.wikimedia.org/r/429833

Change 465383 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] varnishmedia: post-removal cleanup

https://gerrit.wikimedia.org/r/465383

Change 465383 merged by Ema:
[operations/puppet@production] varnishmedia: post-removal cleanup

https://gerrit.wikimedia.org/r/465383

ema updated the task description. Oct 9 2018, 10:39 AM

ema updated the task description.

Krinkle mentioned this in T205870: Fully migrate producers off statsd. Dec 9 2018, 2:26 AM

Krinkle unsubscribed. Dec 10 2018, 2:25 AM

Ran Timo's Grafana audit script to find dashboards still using varnish statsd metrics. Note that some hits can be false positives (i.e. the metric appears in the dashboard JSON but is hidden or not displayed).

Dashboard audit for varnishstatsd (i.e. key_prefix => "varnish.${::site}.backends")

$ nodejs 01-search-all-grafana.js 'varnish\..+\.backends' | grep Matched
Matched db/api-frontend-summary (API frontend summary)
Matched db/experimental-backend-5xx (Experimental - backend 5xx)
Matched db/maps-performances (Maps performances)
Matched db/media (Media)
Matched db/wdqs-paper-data (WDQS Paper data)
Matched db/wikidata-query-service-frontend (Wikidata Query Service Frontend)

And varnishreqstats (key_prefix => "varnish.${::site}.${cache_cluster}.frontend.request"):

$ nodejs 01-search-all-grafana.js 'varnish\..+\..+\.frontend' | tee varnishreqstats_dashboards.log | grep Matched
Matched db/experimental-backend-5xx (Experimental - backend 5xx)
Matched db/interactive-team-kpi (Interactive team KPI)
Matched db/interactive-team-kpi-backup (Interactive team KPI (backup))
Matched db/julien-maps-dashboard (Julien Maps Dashboard)
Matched db/maps-dashboard-draft (Maps Dashboard - draft)
Matched db/maps-kpi (Maps KPI)
Matched db/prometheus-varnish-http-requests (Prometheus Varnish HTTP Requests)
Matched db/prometheus-varnish-http-errors-datacenters (Prometheus Varnish: HTTP Errors (datacenters))
Matched db/service-maps-varnish (Service :: Maps - Varnish)
Matched db/varnish-http-requests (Varnish HTTP Requests)
Matched db/varnish-aggregate-client-status-codes (Varnish: Aggregate Client Status Codes)
Matched db/varnish-http-errors (Varnish: HTTP Errors)
Matched db/varnish-http-errors-datacenters (Varnish: HTTP Errors (datacenters))
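
A rough sketch of what such an audit boils down to, assuming dashboards are available as parsed JSON. The 01-search-all-grafana.js script itself is not shown in this task, so this is not its implementation; the dashboard titles and targets below are made up, and only the two regular expressions come from the commands above.

```python
import json
import re

# Patterns quoted in the audit commands above.
BACKENDS = re.compile(r"varnish\..+\.backends")
FRONTEND = re.compile(r"varnish\..+\..+\.frontend")


def matching_dashboards(dashboards, pattern):
    """Return titles of dashboards whose serialized JSON matches the pattern."""
    return [d["title"] for d in dashboards if pattern.search(json.dumps(d))]


# Hypothetical dashboards; real Grafana dashboard JSON has many more fields.
DASHBOARDS = [
    {"title": "Example A",
     "panels": [{"target": "varnish.eqiad.backends.be_example.GET.p99"}]},
    {"title": "Example B",
     "panels": [{"target": "varnish.eqiad.text.frontend.request.example"}]},
    {"title": "Example C",
     "panels": [{"target": "varnish_requests:rate5m"}]},
]

backend_hits = matching_dashboards(DASHBOARDS, BACKENDS)
frontend_hits = matching_dashboards(DASHBOARDS, FRONTEND)
print(backend_hits, frontend_hits)
```

Because the match runs over the whole serialized dashboard, a hidden panel or an unused template variable is enough to produce a hit, which is where the false positives come from.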

Phabricator_maintenance moved this task from Backlog to Acknowledged on the SRE board. Jan 26 2019, 9:42 PM

fgiunchedi mentioned this in T220116: Migrate all metrics originated by PoPs from statsd to Prometheus. Apr 4 2019, 2:59 PM

fgiunchedi added a parent task: T220116: Migrate all metrics originated by PoPs from statsd to Prometheus. Apr 4 2019, 3:14 PM

Latest dashboard audit:

'varnish\..+\.backends'

"Media"
"API frontend summary"
"Experimental - backend 5xx"
"Maps performances"
"WDQS Paper data"
"Wikidata Query Service Frontend"

'varnish\..+\..+\.frontend'

"Varnish: HTTP Errors (datacenters)" - added deprecation warning in favor of "Prometheus Varnish: HTTP Errors (datacenters)"
"Experimental - backend 5xx" - Scheduled for deletion
"Varnish: Aggregate Client Status Codes" - Recreated here, needs review

Change 519410 had a related patch set uploaded (by Cwhite; owner: Cwhite):
[operations/puppet@production] grafana: remove legacy varnish-aggregate-client-status-codes

https://gerrit.wikimedia.org/r/519410

The queries for varnishstatsd metrics I've been able to find during the audit:

(varnish.$dc.backends.be_*api_svc*.GET.sample_rate, 60)
alias(scale(varnish.$dc.backends.be_*api_svc*.POST.sample_rate, 60)
alias(scale(varnish.$dc.backends.be_*restbase_svc*.GET.sample_rate, 60)
alias(scale(varnish.$dc.backends.be_*restbase_svc*.POST.sample_rate, 60)
alias(scale(offset(asPercent(varnish.$dc.backends.be_*restbase_svc*.GET.sample_rate, 
varnish.$dc.backends.be_*restbase_svc*.GET.$percentile
varnish.$dc.backends.be_*api_svc*.GET.$percentile
maxSeries(varnish.*.backends.be_*restbase_svc*.GET.median)
maxSeries(varnish.*.backends.be_*api_svc*.GET.median)
varnish.$dc.backends.be_*restbase_svc*.POST.$percentile
varnish.$dc.backends.be_*api_svc*.POST.$percentile

varnish.eqiad.backends.*.5xx.sum
sumSeries(varnish.eqiad.backends.{be_appservers,be_api,be_restbase,be_rendering,be_appservers_debug

(varnish.*.backends.be_kartotherian_svc_*wmnet.*xx.rate, 3)
aliasByNode(averageSeries(varnish.*.backends.be_kartotherian_svc_codfw_wmnet.GET.p99), 3, 5)"
aliasByNode(averageSeries(varnish.*.backends.be_kartotherian_svc_eqiad_wmnet.GET.p99)
aliasByNode(averageSeries(varnish.*.backends.be_kartotherian_svc_codfw_wmnet.GET.p95), 3, 5)
aliasByNode(averageSeries(varnish.*.backends.be_kartotherian_svc_eqiad_wmnet.GET.p95)
"aliasByNode(averageSeries(varnish.*.backends.be_kartotherian_svc_codfw_wmnet.GET.p50), 3, 5)"
"aliasByNode(averageSeries(varnish.*.backends.be_kartotherian_svc_eqiad_wmnet.GET.p50'

'(varnish.*.backends.be_ms_fe.2xx.rate, '

'(varnish.*.backends.be_wdqs_svc*.5xx.count)"}]
integral(varnish.*.backends.be_wdqs_svc*.4xx.count)
integral(varnish.*.backends.be_wdqs_svc*.2xx.count)
integral(varnish.*.backends.be_wdqs_svc*.3xx.count'

(varnish.*.backends.be_wdqs*.[123]xx.rate
aliasByNode(exclude(varnish.*.backends.be_wdqs*.[45]xx.rate, \'wdqs100[12]\'), 3, 4)"
aliasByNode(exclude(sumSeriesWithWildcards(varnish.*.backends.be_wdqs*.*xx.rate, 4), \'be_wdqs100[12]\'), 3)"}
aliasByNode(varnish.*.backends.be_wdqs_svc*.GET.p99, 3, 4, 5)
aliasByNode(varnish.*.backends.be_wdqs_svc*.GET.p95, 3, 4, 5)
aliasByNode(varnish.*.backends.be_wdqs_svc*.GET.p50, 3, 4, 5'

In other words:

request rates, per backend and per method
request latency, per backend and per method
response rates, per backend and per status

Change 519664 had a related patch set uploaded (by Cwhite; owner: Cwhite):
[operations/puppet@production] grafana: update varnish-aggregate-client-status-codes to prometheus version

https://gerrit.wikimedia.org/r/519664

Change 519410 abandoned by Cwhite:
grafana: remove legacy varnish-aggregate-client-status-codes

Reason:
superseded by Ibb58806c2166a3200b4685e5a7cea6fb97f010f1

https://gerrit.wikimedia.org/r/519410

fgiunchedi updated the task description. Jul 1 2019, 9:14 AM

fgiunchedi updated the task description. Jul 1 2019, 9:16 AM

fgiunchedi moved this task from Radar to Doing on the User-fgiunchedi board. Jul 1 2019, 12:51 PM

Change 519664 merged by Cwhite:
[operations/puppet@production] grafana: update varnish-aggregate-client-status-codes to prometheus version

https://gerrit.wikimedia.org/r/519664

fgiunchedi updated the task description. Jul 2 2019, 8:23 AM

fgiunchedi added subscribers: Mathew.onipe, MSantos.

Change 520187 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] varnish: remove varnishstatsd

https://gerrit.wikimedia.org/r/520187

fgiunchedi updated the task description. Jul 3 2019, 7:57 AM

fgiunchedi added a subscriber: Smalyshev.

fgiunchedi updated the task description. Jul 3 2019, 8:25 AM

fgiunchedi updated the task description. Jul 3 2019, 8:28 AM

fgiunchedi added a subscriber: Krinkle.

@Krinkle @Pchelolo according to the dashboard version history you have both edited the dashboard. Would it be a problem if we drop "REST API Varnish hit rate (GETs, %)", at least until we have restbase req/s in Prometheus?

ye, sure. we don't really monitor this on a daily basis, so there's no need for a dashboard. the number can be calculated manually if needed

In T184942#5303464, @Pchelolo wrote:

@Krinkle @Pchelolo according to the dashboard version history you have both edited the dashboard. Would it be a problem if we drop "REST API Varnish hit rate (GETs, %)", at least until we have restbase req/s in Prometheus?

ye, sure. we don't really monitor this on a daily basis, so there's no need for a dashboard. the number can be calculated manually if needed

Sounds great, thanks! I've replaced the dashboard with the one with Prometheus metrics now

fgiunchedi updated the task description. Jul 3 2019, 1:22 PM

@MSantos @Mathew.onipe we're moving from graphite-based varnish metrics to prometheus-based varnish metrics, I see you were amongst the authors of https://grafana.wikimedia.org/d/000000305/maps-performances, could you take a look at the prometheus version at https://grafana.wikimedia.org/d/kcAMMw4Wk/maps-performances-filippo-t184942?orgId=1 and let us know if it looks good? If so I'll replace the former with the latter. (cc @Gehel, I know Matt is away ATM)

@fgiunchedi overall it looks good, just have one question. In the Varnish response time graph, do you know why eqiad p99 values are so different? The current board has values up to 20s and the new one 5s.

In T184942#5306825, @MSantos wrote:

@fgiunchedi overall it looks good, just have one question. In the Varnish response time graph, do you know why eqiad p99 values are so different? The current board has values up to 20s and the new one 5s.

I don't know offhand, although I'd be interested to know what percentiles kartotherian sees. Do you know if we have those available?

In T184942#5306856, @fgiunchedi wrote:

In T184942#5306825, @MSantos wrote:

@fgiunchedi overall it looks good, just have one question. In the Varnish response time graph, do you know why eqiad p99 values are so different? The current board has values up to 20s and the new one 5s.

I don't know offhand, although I'd be interested to know what percentiles kartotherian sees. Do you know if we have those available?

Unfortunately, I don't know. Maybe @Mathew.onipe or @Gehel know it better.

Change 521427 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] monitoring: update icinga links to varnish-aggregate-client-status-codes

https://gerrit.wikimedia.org/r/521427

Change 521427 merged by Ema:
[operations/puppet@production] monitoring: update icinga links to varnish-aggregate-client-status-codes

https://gerrit.wikimedia.org/r/521427

In T184942#5306856, @fgiunchedi wrote:

I don't know offhand, although I'd be interested to know what percentiles kartotherian sees. Do you know if we have those available?

As far as I know, we don't collect percentiles at the kartotherian level. As for the difference in numbers, my guess is that the buckets we use (10ms, 50ms, 100ms, 500ms, 1s, 5s, +Inf) don't have much precision for requests over 5s. And since the p99 of maps was mostly >5s, we're just losing precision and should not trust those values too much. I'm not sure how the math checks out.

It still shows that p99 on maps is way too high, but sadly, that's not really news.
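
One way to see how the math could check out is sketched below. It assumes the new dashboard derives p99 with Prometheus' histogram_quantile(), which interpolates linearly inside a bucket and, when the quantile lands in the +Inf bucket, returns the upper bound of the second-highest bucket. The bucket bounds are the ones listed above; the latency samples are invented for illustration.

```python
import math

# Bucket upper bounds in seconds, as listed in the comment above.
BOUNDS = [0.01, 0.05, 0.1, 0.5, 1.0, 5.0, math.inf]


def bucket_counts(samples):
    """Cumulative number of samples less than or equal to each upper bound."""
    return [sum(1 for s in samples if s <= b) for b in BOUNDS]


def histogram_quantile(q, cumulative):
    """Estimate quantile q from cumulative bucket counts, in the style of
    Prometheus' histogram_quantile()."""
    total = cumulative[-1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    for i, count in enumerate(cumulative):
        if count >= rank:
            break
    if math.isinf(BOUNDS[i]):
        # The quantile is in the +Inf bucket: report the highest finite bound.
        return BOUNDS[-2]
    lower = BOUNDS[i - 1] if i > 0 else 0.0
    prev = cumulative[i - 1] if i > 0 else 0
    if count == prev:
        return BOUNDS[i]
    return lower + (BOUNDS[i] - lower) * (rank - prev) / (count - prev)


# Invented sample: 95 fast requests and 5 very slow ones.
samples = [0.2] * 95 + [20.0] * 5
exact_p99 = sorted(samples)[math.ceil(0.99 * len(samples)) - 1]
estimated_p99 = histogram_quantile(0.99, bucket_counts(samples))
print(exact_p99, estimated_p99)
```

Under these assumptions any p99 above 5s is reported as exactly 5s, which would be consistent with the old dashboard showing values up to 20s and the new one topping out at 5s.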

fgiunchedi mentioned this in T227668: Per-backend ATS Prometheus metrics. Jul 10 2019, 1:56 PM

In T184942#5320841, @Gehel wrote:

In T184942#5306856, @fgiunchedi wrote:

I don't know offhand, although I'd be interested to know what percentiles kartotherian sees. Do you know if we have those available?

As far as I know, we don't collect percentiles at the kartotherian level. As for the difference in numbers, my guess is that the buckets we use (10ms, 50ms, 100ms, 500ms, 1s, 5s, +Inf) don't have much precision for requests over 5s. And since the p99 of maps was mostly >5s, we're just losing precision and should not trust those values too much. I'm not sure how the math checks out.

It still shows that p99 on maps is way too high, but sadly, that's not really news.

Thanks for taking a look @Gehel ! I agree the difference might be due to the bucketing.

In the process we've also discovered that those metrics stopped updating for both varnish+mtail and varnishstatsd because the upload cache has fully moved to ATS. Getting equivalent metrics is tracked in T227668: Per-backend ATS Prometheus metrics.

Since the maps performance dashboard with graphite metrics is broken anyway at the moment, I think it makes sense to go ahead and remove varnishstatsd: the rest of the dashboards are migrated and are using cache text backends.

In T184942#5323915, @fgiunchedi wrote:

Since the maps performance dashboard with graphite metrics is broken anyway at the moment, I think it makes sense to go ahead and remove varnishstatsd: the rest of the dashboards are migrated and are using cache text backends.

Agreed, that should not be blocking anything on the maps side.

Change 520187 merged by Filippo Giunchedi:
[operations/puppet@production] varnish: remove varnishstatsd

https://gerrit.wikimedia.org/r/520187

fgiunchedi updated the task description. Jul 15 2019, 8:10 AM

fgiunchedi updated the task description. Jul 17 2019, 9:34 AM

fgiunchedi updated the task description. Jul 17 2019, 9:42 AM

Change 523891 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] varnish: remove varnishreqstats-based alerts

https://gerrit.wikimedia.org/r/523891

Change 523892 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] varnish: ensure varnishreqstats is absent

https://gerrit.wikimedia.org/r/523892

Change 523891 merged by Filippo Giunchedi:
[operations/puppet@production] varnish: remove varnishreqstats-based alerts

https://gerrit.wikimedia.org/r/523891

Change 523892 merged by Filippo Giunchedi:
[operations/puppet@production] varnish: ensure varnishreqstats is absent

https://gerrit.wikimedia.org/r/523892

Change 525252 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] varnish: fix varnishreqstats systemd::service usage

https://gerrit.wikimedia.org/r/525252

Change 525252 merged by Filippo Giunchedi:
[operations/puppet@production] varnish: fix varnishreqstats systemd::service usage

https://gerrit.wikimedia.org/r/525252

Change 525259 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] varnish: remove varnishreqstats and varnishstatsd

https://gerrit.wikimedia.org/r/525259

Change 525259 merged by Filippo Giunchedi:
[operations/puppet@production] varnish: remove varnishreqstats and varnishstatsd

https://gerrit.wikimedia.org/r/525259

fgiunchedi updated the task description. Jul 29 2019, 8:46 AM

All varnish statsd daemons have been retired. The "maps performance" dashboard is still missing per-backend ATS metrics, which is tracked in T227668: Per-backend ATS Prometheus metrics.

fgiunchedi removed a subtask: T193445: Update Media dashboard in Grafana to use Prometheus metrics. Jul 30 2019, 4:29 PM

Change 737655 had a related patch set uploaded (by Ema; author: Ema):

[operations/puppet@production] varnish: remove cachestats.py

https://gerrit.wikimedia.org/r/737655

Change 737655 merged by Ema:

[operations/puppet@production] varnish: remove cachestats.py

https://gerrit.wikimedia.org/r/737655

Change 737670 had a related patch set uploaded (by Ema; author: Ema):

[operations/puppet@production] varnish::logging: remove statsd_host and mtail_progs

https://gerrit.wikimedia.org/r/737670

Change 737670 merged by Ema:

[operations/puppet@production] varnish::logging: remove statsd_host and mtail_progs

https://gerrit.wikimedia.org/r/737670

Deprecate python varnish cachestats
Closed, Resolved, Public

Description

varnishstatsd

varnishreqstats


Event Timeline

	F18055236: Screen Shot 2018-05-07 at 18.03.53.png
	May 7 2018, 5:04 PM

	F18055250: Screen Shot 2018-05-07 at 18.04.08.png
	May 7 2018, 5:04 PM

	fgiunchedi
	Jan 15 2018, 5:48 PM
