Audit legacy mediawiki stats used in production dashboards
Closed, Resolved (Public)

Description

The team has suggested a value-driven migration. To support it, this task tracks the legacy metrics that are currently in use, which we define as metrics used in Grafana dashboards.

Our objective is to generate a list of dashboards that need to be converted, which will guide the migration. Subsequent tasks will link these metrics to their positions in the queue, and this task will serve as the prioritized list for the conversion.

  • scripted audit of dashboards using graphite datasources, emit metrics used
    • establish initial set of tracking metrics
  • identify mechanism to send metrics from script to prometheus (e.g. pushgateway)
  • create graphite metric status dashboard

Top 10 metrics used in dashboards, from a one-time audit (full list: P54396)

26 MediaWiki.timing.editResponseTime
14 mw.performance.save
 8 MediaWiki.RevisionSlider.timing.init
 7 MediaWiki.Parsoid.html2wt.setup
 7 MediaWiki.Parsoid.html2wt.selser.serialize
 7 MediaWiki.Parsoid.html2wt.selser.domDiff
 7 MediaWiki.Parsoid.html2wt.init
 6 MediaWiki.wikibase.quality.constraints.type.php.success.entities
 6 MediaWiki.Parsoid.html2wt.total
 6 MediaWiki.Parsoid.html2wt.timePerInputKB

Event Timeline

herron renamed this task from "Audit & convert stats in use in production to statslib" to "Audit legacy mediawiki stats used in production". (Nov 9 2023, 2:46 PM)
herron renamed this task from "Audit legacy mediawiki stats used in production" to "Audit legacy mediawiki stats used in production dashboards".
herron triaged this task as Medium priority.

I spent some time today experimenting with https://github.com/grafana/cortex-tools, specifically cortextool analyse grafana, which looked promising. Unfortunately, it throws parse errors when it encounters a period in a metric name, which makes it unsuitable for graphite metrics.

So instead I've been working on a simple script that walks the dashboard API looking for dashboards with a graphite datasource and outputs the metrics used. However, instead of producing a one-time, manual report here, I'm thinking we should build some ongoing status reporting.
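For illustration, the dashboard walk described above might look roughly like the sketch below. This is not the script from this task. It assumes the dashboard JSON has already been fetched (for example via Grafana's /api/search and /api/dashboards/uid/<uid> endpoints) and that graphite panels can be recognised either by a datasource object with type "graphite" or by a datasource name containing "graphite". The helper names and the metric-matching regular expression are hypothetical.

```python
import re

# Heuristic (assumption): a graphite metric is a dotted path that starts with a
# letter, e.g. MediaWiki.timing.editResponseTime. Function names such as
# aliasByNode contain no dot and are therefore skipped.
METRIC_RE = re.compile(r"(?<![\w.])[A-Za-z_][\w-]*(?:\.[\w*$-]+)+")


def is_graphite(datasource):
    """Return True if a datasource reference looks like graphite."""
    if isinstance(datasource, dict):
        # Newer dashboards reference datasources as {"type": ..., "uid": ...}.
        return datasource.get("type") == "graphite"
    if isinstance(datasource, str):
        # Older dashboards use the datasource name; matching on the name is a guess.
        return "graphite" in datasource.lower()
    return False


def iter_panels(dashboard):
    """Yield every panel, including panels nested inside rows."""
    stack = list(dashboard.get("panels") or [])
    for row in dashboard.get("rows") or []:  # legacy row layout
        stack.extend(row.get("panels") or [])
    while stack:
        panel = stack.pop()
        yield panel
        stack.extend(panel.get("panels") or [])


def graphite_metrics(dashboard):
    """Return the metric paths referenced by graphite targets in one dashboard."""
    found = []
    for panel in iter_panels(dashboard):
        for target in panel.get("targets") or []:
            datasource = target.get("datasource") or panel.get("datasource")
            if is_graphite(datasource):
                found.extend(METRIC_RE.findall(target.get("target") or ""))
    return found
```

Tallying the returned names across all dashboards (for example with collections.Counter) would give a per-metric usage count similar in shape to the list in the task description.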

I'm thinking the next step here is to expand the script to output a few metrics that capture the ongoing state of graphite utilization to something like the Prometheus Pushgateway, and to build a status dashboard using these metrics. With T350825 we could possibly annotate panels with relevant commits as well. I'll expand the task description to include high-level steps for that.

Very draft metric list (to be expanded/refined/clarified):

  • Dashboards using graphite datasource
  • Annotations using graphite datasource
  • Panels using graphite datasource
  • Graphite metric count
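As a rough sketch of how the first three draft metrics could be computed and handed to a Pushgateway, the example below counts graphite usage across already-fetched dashboard JSON and renders the totals in the Prometheus text exposition format. The metric names, the job name, and the gateway URL are all hypothetical, only top-level panels are counted for brevity, and the graphite metric count is left out. The exporter that was eventually deployed may work differently.

```python
import urllib.request


def is_graphite(datasource):
    """Return True if a datasource reference looks like graphite (heuristic)."""
    if isinstance(datasource, dict):
        return datasource.get("type") == "graphite"
    if isinstance(datasource, str):
        return "graphite" in datasource.lower()
    return False


def panel_uses_graphite(panel):
    """A panel counts if it, or any of its targets, points at graphite."""
    if is_graphite(panel.get("datasource")):
        return True
    return any(is_graphite(t.get("datasource")) for t in panel.get("targets") or [])


def summarize(dashboards):
    """Count dashboards, panels and annotations that use graphite."""
    totals = {"dashboards": 0, "panels": 0, "annotations": 0}
    for dash in dashboards:
        panel_list = dash.get("panels") or []
        annotation_list = (dash.get("annotations") or {}).get("list") or []
        panels = sum(1 for p in panel_list if panel_uses_graphite(p))
        annotations = sum(
            1 for a in annotation_list if is_graphite(a.get("datasource"))
        )
        totals["panels"] += panels
        totals["annotations"] += annotations
        if panels or annotations:
            totals["dashboards"] += 1
    return totals


def render(totals, prefix="grafana_graphite"):
    """Render totals as gauges in the Prometheus text exposition format."""
    lines = []
    for key, value in sorted(totals.items()):
        name = f"{prefix}_{key}"  # hypothetical metric name
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def push(body, gateway="http://localhost:9091", job="grafana_graphite_audit"):
    """PUT the rendered metrics to a Pushgateway (URL and job are placeholders)."""
    request = urllib.request.Request(
        f"{gateway}/metrics/job/{job}", data=body.encode("utf-8"), method="PUT"
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Running summarize over every dashboard on a schedule and pushing the rendered text would give the status dashboard a small set of gauges to plot over time.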

Change 980048 had a related patch set uploaded (by Herron; author: Herron):

[operations/puppet@production] grafana: add dashboard graphite usage exporter

https://gerrit.wikimedia.org/r/980048

herron updated the task description.

Change 980048 merged by Herron:

[operations/puppet@production] grafana: add dashboard datasource usage (graphite) exporter

https://gerrit.wikimedia.org/r/980048

herron closed this task as Resolved. (Jan 17 2024, 8:47 PM)

A custom Grafana graphite datasource exporter has been deployed, along with a Grafana dashboard that uses its metrics to outline current graphite datasource utilization.

This lets us track how many dashboards and panels are still actively using the legacy graphite datasource (metrics are updated hourly).

The dashboard is located at https://grafana.wikimedia.org/d/K6DEOo5Ik/grafana-graphite-datasource-utilization

With that I think we're done here!