Page MenuHomePhabricator

Enable experiment enrollment in the MediaWiki Action API
Closed, ResolvedPublic2 Estimated Story Points

Description

This task covers:

This task does not cover enabling experiment enrollment in the REST API.

AC

  • The user is enrolled into experiments as early as possible during a request to the MediaWiki Action API endpoint
  • Experiment implementors can retrieve experiment enrollment information via the MetricsPlatform.XLab.ExperimentManager in MediaWiki Action API modules

Notes

  1. Currently, we initialize the MetricsPlatform.XLab.ExperimentManager service in the BeforeInitialize hook handler. We should do the same in an ApiBeforeMain hook handler. We can rely on both hook handlers not being executed during the same request

Open Questions

  1. Should we decorate the output of the MediaWiki Action API with enrollment details for the experiments that were retrieved during the request?

Yes

Event Timeline

phuedx triaged this task as High priority.Thu, Nov 13, 6:18 AM
phuedx updated the task description. (Show Details)
phuedx moved this task from Incoming to READY TO GROOM on the Test Kitchen board.

Change #1204815 had a related patch set uploaded (by Phuedx; author: Phuedx):

[mediawiki/extensions/MetricsPlatform@master] Enable experiment enrollment in the MediaWiki Action API

https://gerrit.wikimedia.org/r/1204815

Regarding the open question above:

Should we decorate the output of the MediaWiki Action API with enrollment details for the experiments that were retrieved during the request?

I was wondering if just transparency would be a valid reason to do it. Something like adding some header in the response with the experiment and variant where the request has been enrolled so that consumers know when their request are running as a part of an A/B test.

Something like adding some header in the response with the experiment and variant where the request has been enrolled so that consumers know when their request are running as a part of an A/B test.

I've added an APIAfterExecute hook handler which sets the X-Experiment-Enrollments header but I'm curious if Varnish will do anything about it (remembering that Varnish sets it on the way in).

Change #1204815 merged by jenkins-bot:

[mediawiki/extensions/MetricsPlatform@master] Enable experiment enrollment in the MediaWiki Action API

https://gerrit.wikimedia.org/r/1204815

Milimetric set the point value for this task to 2.Tue, Nov 25, 3:54 PM

The EventGate validation errors dashboard is showing a decent chunk of these sorts of errors:

"message": "'' should NOT have additional properties",
"$schema": "/error/2.1.0",
"tags": [
  "input-kafka-eventgate-analytics-validation-error-eqiad",
  "kafka",
  "es",
  "eventgate",
  "normalized_message_untrimmed"
],
"emitter_id": "eventgate-production",
"errored_schema_uri": "/mediawiki/api/request/1.0.0",
"dt": "2025-11-26T05:35:09.027Z",
"error_type": "ValidationError",
"normalized_message": "'' should NOT have additional properties",

Thanks @tchin for flagging.

This appears due to the call chain of ApiMain's logRequest where

wfDebugLog( 'api-request', '', 'private', $logCtx );

ultimately results in injecting the context into the message destined for EventBus (in addition to Logstash events as seen if expanding entries in this OpenSearch dashboard snapshot query).

The approximate call chain doing this injection is.

LoggingContext.php:65, MediaWiki\Logger\LoggingContext->get()
ContextProcessor.php:18, MediaWiki\Logger\Monolog\ContextProcessor->__invoke()
Logger.php:388, Monolog\Logger->addRecord()
Logger.php:611, Monolog\Logger->info()
GlobalFunctions.php:733, wfDebugLog()
ApiMain.php:2155, MediaWiki\Api\ApiMain->logRequest()
ApiMain.php:949, MediaWiki\Api\ApiMain->executeActionWithErrorHandling()
ApiMain.php:918, MediaWiki\Api\ApiMain->execute()
ApiEntryPoint.php:138, MediaWiki\Api\ApiEntryPoint->execute()

One option to allow us to support A/B testing on the Action API right now would be to not add the context.ab_tests data to the logging context until we have figured out a nice way to exclude it from being a part of the /mediawiki/api/request/1.0.0 stream. Another could be to rev the stream (although that requires patching core) so the field is supported, although the intent of context.ab_tests was really for operators looking in Logstash, not for downstream data processing. It looks like we're getting maybe 10K of these messages every 15 minutes right now.

Change #1211604 had a related patch set uploaded (by Phuedx; author: Phuedx):

[mediawiki/extensions/MetricsPlatform@master] Hooks: Only add global logging context for pageviews

https://gerrit.wikimedia.org/r/1211604

Thanks @dr0ptp4kt for the additional info. The patch above should fix this for now, while we work on the correct fix for the issue.

The approximate call chain doing this injection is. <snip />

The call chain is correct enough but there's some config-driven indirection here that should be captured:

The api-request Monolog channel is defined here. You'll notice that the eventbus logger is set to debug level.

Aside: The documentation for this config variable is incorrect. The eventbus logger is disabled by default.

Monolog loggers are initialized in mediawiki-config/wmf-config/logging.php on a per-channel basis. The eventbus logger for the api-request channel will be initialized here.

Change #1211611 had a related patch set uploaded (by Phuedx; author: Phuedx):

[mediawiki/extensions/MetricsPlatform@wmf/1.46.0-wmf.4] Hooks: Only add global logging context for pageviews

https://gerrit.wikimedia.org/r/1211611

Change #1211604 merged by jenkins-bot:

[mediawiki/extensions/MetricsPlatform@master] Hooks: Only add global logging context for pageviews

https://gerrit.wikimedia.org/r/1211604

Change #1211611 merged by jenkins-bot:

[mediawiki/extensions/MetricsPlatform@wmf/1.46.0-wmf.4] Hooks: Only add global logging context for pageviews

https://gerrit.wikimedia.org/r/1211611

Mentioned in SAL (#wikimedia-operations) [2025-11-26T09:34:02Z] <phuedx@deploy2002> Started scap sync-world: Backport for [[gerrit:1211611|Hooks: Only add global logging context for pageviews (T409965 T411074)]]

Mentioned in SAL (#wikimedia-operations) [2025-11-26T09:36:17Z] <phuedx@deploy2002> phuedx: Backport for [[gerrit:1211611|Hooks: Only add global logging context for pageviews (T409965 T411074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-11-26T09:47:27Z] <phuedx@deploy2002> Finished scap sync-world: Backport for [[gerrit:1211611|Hooks: Only add global logging context for pageviews (T409965 T411074)]] (duration: 13m 29s)