MetricsPlatform: Add performance instrumentation
Closed, Resolved, Public

Description

Background

The MetricsPlatform extension fetches instrument configs from an MPIC instance running on dse-k8s. At least during the initial deployment, we should actively monitor the performance of this system.

At any time we will want to know:

  • The number of MPIC API requests being made
  • The number of MPIC API requests succeeding
  • The number of MPIC API requests failing
  • The number of MPIC API requests timing out
  • The number of malformed API responses
  • The median, p75, p95, p99 round-trip-time (RTT) of the API requests

Acceptance criteria (AC)

  • All of the metrics above are instrumented
  • A Grafana dashboard is created

Requirements

Notes

  1. Read https://www.mediawiki.org/wiki/Manual:Stats and https://prometheus.io/docs/practices/naming/.

For the following three metrics:

  • The number of MPIC API requests succeeding
  • The number of MPIC API requests failing
  • The number of MPIC API requests timing out

we should use a single counter from the Stats library provided by MediaWiki Core, with one label value per outcome, e.g.

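use MediaWiki\MediaWikiServices;

// Map the request outcome to a single label value.
if ( $isSuccess ) {
  $label = 'success';
} elseif ( $isTimeout ) {
  $label = 'timeout';
} else {
  $label = 'failure';
}

// setLabel() takes a label name and a value; the name 'status' is an assumption.
MediaWikiServices::getInstance()->getStatsFactory()
  ->withComponent( 'MetricsPlatform' )
  ->getCounter( 'mpic_api_request_total' )
  ->setLabel( 'status', $label )
  ->increment();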
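
The list in the Background section also asks for the number of malformed API responses, which the example above does not distinguish. A minimal sketch of one way to derive a single outcome per request follows. The helper name `classifyMpicOutcome()`, its parameters, the 'malformed' value, and the assumption that a well-formed MPIC response is a JSON object or array are all hypothetical and not taken from the extension:

```php
<?php
/**
 * Hypothetical helper: map one MPIC API request outcome to a label value.
 * Function name, parameters, and label values are illustrative assumptions.
 */
function classifyMpicOutcome( bool $timedOut, int $httpStatus, ?string $body ): string {
	if ( $timedOut ) {
		return 'timeout';
	}
	if ( $httpStatus < 200 || $httpStatus >= 300 ) {
		return 'failure';
	}
	// Assumption: a 2xx response whose body does not decode to a
	// JSON object or array is treated as malformed.
	$decoded = json_decode( $body ?? '', true );
	if ( !is_array( $decoded ) ) {
		return 'malformed';
	}
	return 'success';
}
```

The returned string could then be passed as the label value of the counter, so that all four outcomes share one metric name.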
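
  • The median, p75, p95, p99 round-trip-time (RTT) of the API requests

use MediaWiki\MediaWikiServices;

$startTime = microtime( true );

// ... make the MPIC API request ...

// microtime( true ) returns seconds; observe() expects milliseconds,
// hence the multiplication by 1000.
MediaWikiServices::getInstance()->getStatsFactory()
  ->withComponent( 'MetricsPlatform' )
  ->getTiming( 'mpic_api_request_duration_seconds' )
  ->observe( ( microtime( true ) - $startTime ) * 1000 );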
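
If the request can throw, the duration should still be recorded. A minimal sketch of a wrapper that does this, assuming a hypothetical helper `timeRequest()` that receives the request and the recording step as callables (neither name exists in the extension), and using the monotonic `hrtime()` clock instead of `microtime()`:

```php
<?php
/**
 * Hypothetical helper: run $request and pass the elapsed time in
 * milliseconds to $observe, even if $request throws.
 */
function timeRequest( callable $request, callable $observe ) {
	// hrtime( true ) returns a monotonic timestamp in nanoseconds.
	$start = hrtime( true );
	try {
		return $request();
	} finally {
		// Convert nanoseconds to milliseconds before recording.
		$observe( ( hrtime( true ) - $start ) / 1e6 );
	}
}
```

In the extension, `$observe` could be a closure that forwards the value to the timing metric's `observe()` call shown above.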

Event Timeline

VirginiaPoundstone raised the priority of this task from Medium to High. (Jul 15 2024, 3:28 PM)

Change #1056061 had a related patch set uploaded (by Clare Ming; author: Clare Ming):

[mediawiki/extensions/MetricsPlatform@master] Add performance instrumentation to Metrics Platform

https://gerrit.wikimedia.org/r/1056061

Change #1056061 merged by jenkins-bot:

[mediawiki/extensions/MetricsPlatform@master] Add performance instrumentation to Metrics Platform

https://gerrit.wikimedia.org/r/1056061

cjming moved this task from Done to To Deploy on the Test Kitchen (Data products Sprint 18) board.

Riding the train this week. Once it's deployed everywhere, I'll create the dashboard.