Send hCaptcha API response data to event platform
Closed, Resolved · Public

Description

Summary

Record part of the hCaptcha API response data in the Wikimedia Event Platform during account creation.

Background

  • The risk signal from the hCaptcha response data should be stored in the user's global session.
  • The normed risk signal can be retrieved in extensions/Campaigns/includes/CampaignsSecondaryAuthenticationProvider.php and included in the ServerSideAccountCreation event data.
  • The ServerSideAccountCreation event must opt out of sending IP and user-agent data.

User story

As staff, I want to send hCaptcha response data to the Event Platform during account creation to support analytics, while excluding IP and user-agent from this event stream.

Acceptance criteria

  • hCaptcha risk signal stored in global session
  • risk signal included in ServerSideAccountCreation event

Related ticket: T377341: hCaptcha: Log results returned from backend API

Event Timeline

Restricted Application added a subscriber: Aklapper.
phuedx added a subscriber: VirginiaPoundstone.

@VirginiaPoundstone: This seems like a good candidate for a tightly-scoped instrument using either the JS or PHP Metrics Platform Client. What do you think?

@phuedx could be. I need a little more context. Let's talk about it during grooming.

@acooper, what would be the producer of the captcha events? Is it code we write and own? If so, this should be easy. If not, we have ways, but it kind of depends on the structure of the events sent.

Yes, it is code we manage in ConfirmEdit extension. The idea is that in the PageSaveCompleteHook, we'd log an event that also includes the hCaptcha score (if any) for a given request. We'd want to save performer ID, revision ID and bucketed score (not the raw score). For account creations, we'd omit revision ID.
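The bucketing idea in the comment above could look roughly like the sketch below. It is written in Python purely for illustration (the real code would live in the PHP ConfirmEdit extension), and the bucket edges, labels, function names, and event field names are all hypothetical; the thread does not specify any of them. It assumes the raw score falls in the range 0 to 1.

```python
from typing import Optional

# Hypothetical bucket edges and labels; the thread does not define them.
BUCKET_EDGES = (0.25, 0.5, 0.75)
BUCKET_LABELS = ("0.00-0.25", "0.25-0.50", "0.50-0.75", "0.75-1.00")


def bucket_score(raw_score: float) -> str:
    """Map a raw score in [0, 1] to a coarse range label.

    Only the label is logged, so the raw score never reaches the event.
    """
    if not 0.0 <= raw_score <= 1.0:
        raise ValueError(f"score out of range: {raw_score}")
    for edge, label in zip(BUCKET_EDGES, BUCKET_LABELS):
        if raw_score < edge:
            return label
    return BUCKET_LABELS[-1]


def build_event(
    performer_id: int,
    raw_score: Optional[float],
    revision_id: Optional[int] = None,
) -> dict:
    """Assemble a hypothetical event payload.

    revision_id is left out for account creations, and the bucket is
    left out when no score was returned for the request.
    """
    event = {"performer_id": performer_id}
    if raw_score is not None:
        event["score_bucket"] = bucket_score(raw_score)
    if revision_id is not None:
        event["revision_id"] = revision_id
    return event
```

Calling `build_event(7, 0.6, revision_id=42)` would describe a page save, while `build_event(7, 0.6)` would describe an account creation with no revision ID.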

@Reedy @acooper In extensions/ConfirmEdit/includes/hCaptcha/HCaptcha.php, you should be able to set the normed score in the user's global session (e.g. SessionManager::getGlobalSession()->set( 'hCaptcha', $normedScore );), and then retrieve it in Extension:Campaigns (extensions/Campaigns/includes/CampaignsSecondaryAuthenticationProvider.php) to include it in the event data sent to ServerSideAccountCreation. As noted by @nettrom_WMF, we'll need to update the code logging to ServerSideAccountCreation to opt out of including IP and user-agent data. We want both IP and user agent data for analysis, but we can use MediaWiki-extensions-IPReputation's event data on account creation for that.
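The hand-off described above (one component stores the normed score in the session, another reads it back when building the event) can be sketched as follows. This is a language-neutral illustration in Python, not the MediaWiki code: `FakeSession` is a hypothetical stand-in for the global session object, and the function names are invented. The session key `hCaptcha` comes from the comment above, and the `hCaptchaScore` field name from the schema merge request title on this task.

```python
from typing import Any, Dict, Optional

SESSION_KEY = "hCaptcha"  # session key suggested in the comment above


class FakeSession:
    """Hypothetical stand-in for the user's global session."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value

    def get(self, key: str, default: Optional[Any] = None) -> Any:
        return self._data.get(key, default)


def record_score(session: FakeSession, normed_score: float) -> None:
    """Captcha side: remember the normed score for later in the request flow."""
    session.set(SESSION_KEY, normed_score)


def build_account_creation_event(
    session: FakeSession, base_event: Dict[str, Any]
) -> Dict[str, Any]:
    """Event side: copy the base event and add the score if one was stored."""
    event = dict(base_event)  # copy so the caller's dict is not mutated
    score = session.get(SESSION_KEY)
    if score is not None:
        event["hCaptchaScore"] = score
    return event
```

Leaving the field out when no score is stored keeps the event valid for account creations where hCaptcha was not shown or returned no score.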

Change #1127937 had a related patch set uploaded (by Reedy; author: Reedy):

[mediawiki/extensions/ConfirmEdit@master] HCaptcha: Save hCaptcha score in global session

https://gerrit.wikimedia.org/r/1127937

Change #1127940 had a related patch set uploaded (by Reedy; author: Reedy):

[mediawiki/extensions/Campaigns@master] CampaignsSecondaryAuthenticationProvider: Captcha hCaptcha-score and include in ServerSideAccountCreation event

https://gerrit.wikimedia.org/r/1127940

reedy opened https://gitlab.wikimedia.org/repos/data-engineering/schemas-event-secondary/-/merge_requests/49

Draft: Add hCaptchaScore and create version 1.4.0 of analytics/legacy/serversideaccountcreation

acooper renamed this task from Send captcha API response data to event logging to Send hCaptcha API response data to event platform. · Mar 21 2025, 2:35 PM
acooper changed the task status from Open to In Progress.
acooper triaged this task as Medium priority.

kharlan merged https://gitlab.wikimedia.org/repos/data-engineering/schemas-event-secondary/-/merge_requests/49

Add hCaptchaScore and create version 1.4.0 of analytics/legacy/serversideaccountcreation

Change #1127940 merged by jenkins-bot:

[mediawiki/extensions/Campaigns@master] CampaignsSecondaryAuthenticationProvider: Capture hCaptcha-score and include in ServerSideAccountCreation event

https://gerrit.wikimedia.org/r/1127940

Change #1127937 merged by jenkins-bot:

[mediawiki/extensions/ConfirmEdit@master] HCaptcha: Save hCaptcha score in global session

https://gerrit.wikimedia.org/r/1127937

This is pending some discussions with hCaptcha before we can re-enable.

I don't remember why we said we needed to opt out of including IP and user-agent data. I think we should leave this as is.

I think that's a remnant from our early conversations about what the associated risk tier would be and whether we were aiming for low risk to reduce delays from having to go through full L3SC review. Please do correct me if I'm wrong, but I seem to remember that we've been through L3SC review now and gotten signoff on the associated risk, and therefore we don't need to make any changes.

Change #1182824 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[operations/mediawiki-config@master] hCaptcha: Enable processing of the risk score

https://gerrit.wikimedia.org/r/1182824

kostajh changed the task status from Stalled to In Progress. · Aug 28 2025, 12:31 PM
kostajh updated Other Assignee, added: kostajh.
kostajh removed a subscriber: acooper.

Change #1182824 merged by jenkins-bot:

[operations/mediawiki-config@master] hCaptcha: Enable processing of the risk score

https://gerrit.wikimedia.org/r/1182824

Mentioned in SAL (#wikimedia-operations) [2025-08-28T12:35:09Z] <kharlan@deploy1003> Started scap sync-world: Backport for [[gerrit:1182824|hCaptcha: Enable processing of the risk score (T379179)]]

Mentioned in SAL (#wikimedia-operations) [2025-08-28T12:41:48Z] <kharlan@deploy1003> kharlan: Backport for [[gerrit:1182824|hCaptcha: Enable processing of the risk score (T379179)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-08-28T12:49:22Z] <kharlan@deploy1003> Finished scap sync-world: Backport for [[gerrit:1182824|hCaptcha: Enable processing of the risk score (T379179)]] (duration: 14m 13s)

Change #1182844 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/ConfirmEdit@master] hCaptcha: Set a default value for remoteip

https://gerrit.wikimedia.org/r/1182844

Change #1182866 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/ConfirmEdit@wmf/1.45.0-wmf.16] hCaptcha: Set a default value for remoteip

https://gerrit.wikimedia.org/r/1182866

Change #1182866 merged by jenkins-bot:

[mediawiki/extensions/ConfirmEdit@wmf/1.45.0-wmf.16] hCaptcha: Set a default value for remoteip

https://gerrit.wikimedia.org/r/1182866

Mentioned in SAL (#wikimedia-operations) [2025-08-28T15:30:53Z] <kharlan@deploy1003> Started scap sync-world: Backport for [[gerrit:1182866|hCaptcha: Set a default value for remoteip (T379179)]]

Mentioned in SAL (#wikimedia-operations) [2025-08-28T15:34:56Z] <kharlan@deploy1003> kharlan: Backport for [[gerrit:1182866|hCaptcha: Set a default value for remoteip (T379179)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Change #1182844 merged by jenkins-bot:

[mediawiki/extensions/ConfirmEdit@master] hCaptcha: Set a default value for remoteip

https://gerrit.wikimedia.org/r/1182844

Mentioned in SAL (#wikimedia-operations) [2025-08-28T15:41:17Z] <kharlan@deploy1003> Finished scap sync-world: Backport for [[gerrit:1182866|hCaptcha: Set a default value for remoteip (T379179)]] (duration: 10m 24s)

kostajh updated the task description.