Background
The MetricsPlatform extension fetches instrument configs from an MPIC instance running on dse-k8s. At least during the initial deployment, we should actively monitor the performance of this system.
At any time, we will want to know:
- The number of MPIC API requests being made
- The number of MPIC API requests succeeding
- The number of MPIC API requests failing
- The number of MPIC API requests timing out
- The number of malformed API responses
- The median, p75, p95, and p99 round-trip time (RTT) of the API requests
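To make the counts above unambiguous, each request could be mapped to exactly one outcome. The following is only a sketch: the function name, its parameters, and the rule that a well-formed response is a JSON object or array are assumptions for illustration, not part of the extension or of MPIC.

```php
<?php
declare( strict_types=1 );

/**
 * Map the result of one MPIC API request to a single outcome label.
 *
 * Hypothetical helper: the parameter names and the "body must decode to a
 * JSON object or array" rule are assumptions, not the extension's code.
 */
function classifyMpicOutcome( bool $timedOut, ?int $httpStatus, ?string $body ): string {
	if ( $timedOut ) {
		return 'timeout';
	}
	// No status at all, or a non-2xx status, counts as a failed request.
	if ( $httpStatus === null || $httpStatus < 200 || $httpStatus >= 300 ) {
		return 'failure';
	}
	// A 2xx response whose body does not decode to an array is malformed.
	$decoded = json_decode( $body ?? '', true );
	if ( !is_array( $decoded ) ) {
		return 'malformed';
	}
	return 'success';
}
```

With one label per request, the total number of requests is the sum over all labels, so it does not need a separate counter.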
AC
- The above are instrumented
- A Grafana dashboard is created
Requirements
- QA passed?
- Documentation
- https://www.mediawiki.org/wiki/Extension:MetricsPlatform should be updated
Notes
- The number of MPIC API requests succeeding
- The number of MPIC API requests failing
- The number of MPIC API requests timing out
We should use the Stats library provided by MediaWiki Core to do this, e.g.
  if ( $isSuccess ) {
      $label = 'success';
  } elseif ( $isTimeout ) {
      $label = 'timeout';
  } else {
      $label = 'failure';
  }

  // setLabel() takes a key and a value; the key 'status' is a placeholder.
  MediaWikiServices::getInstance()->getStatsFactory()
      ->withComponent( 'MetricsPlatform' )
      ->getCounter( 'mpic_api_request_total' )
      ->setLabel( 'status', $label )
      ->increment();
- The median, p75, p95, and p99 round-trip time (RTT) of the API requests
  $startTime = microtime( true );

  // ...

  // observe() expects a value in milliseconds, hence the * 1000.
  MediaWikiServices::getInstance()->getStatsFactory()
      ->withComponent( 'MetricsPlatform' )
      ->getTiming( 'mpic_api_request_duration_seconds' )
      ->observe( ( microtime( true ) - $startTime ) * 1000 );
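The counter and timing snippets above could be combined so that every request reports both from one place. This is a minimal sketch, assuming the request is wrapped in a callable that returns its outcome label. `instrumentRequest` and the `$report` callback are hypothetical names, and the reporter is injected so the example runs without MediaWiki.

```php
<?php
declare( strict_types=1 );

/**
 * Run $request, then report its outcome label and duration in milliseconds.
 *
 * @param callable $request Performs the API call and returns an outcome label
 *   such as 'success' or 'timeout' (hypothetical contract).
 * @param callable $report Receives ( string $label, float $durationMs ).
 * @return string The outcome label that was reported.
 */
function instrumentRequest( callable $request, callable $report ): string {
	// hrtime() is monotonic, so the measurement is not affected by clock changes.
	$start = hrtime( true );
	try {
		$label = $request();
	} catch ( Throwable $e ) {
		// Counted as a failure here; a real caller may also want to log or rethrow.
		$label = 'failure';
	}
	$durationMs = ( hrtime( true ) - $start ) / 1e6;
	$report( $label, $durationMs );
	return $label;
}

// Stand-in reporter that only collects values. In the extension, this callback
// would call getCounter() and getTiming() as in the snippets above.
$seen = [];
$result = instrumentRequest(
	static fn (): string => 'success',
	static function ( string $label, float $durationMs ) use ( &$seen ): void {
		$seen[] = [ $label, $durationMs ];
	}
);
```

Injecting the reporter keeps the timing logic testable without a StatsFactory instance.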