
Attribution Research: Instrument pageviews
Open, Needs Triage, Public

Description

The authoritative document is the Instrumentation Spec; the relevant details will be summarized here.

Event Timeline

Thinking out loud, here is the rationale for the instrumentation:

  • page_load - sent only if logged out
    • attributes: page_id, page_namespace_id, client_platform, mediawiki_database; all will be processed before landing in HDFS
    • action_context = document.referrer, also processed before landing in HDFS
      • page_id will be joined to page popularity and turned into a boolean
      • page_namespace_id will be grouped to get a general idea of the type of page
      • client_platform will be used to tell mobile and desktop apart (apps are not sending instrumentation right now)
      • document.referrer will be parsed and classified server-side before landing in HDFS
  • edit_attempt - sent only if logged out
    • attributes: none? (we just want to know that this subject_id tried to edit at all)
  • named_registration and temporary_registration - sent when logged in, only once on the first page render after account creation
    • attributes: user_id, mediawiki_database; used to connect the user_id with the low-granularity details saved in reading events (above)
  • edit_by_temporary - sent only once, if a temp user saves an edit
    • attributes: user_id, mediawiki_database; used to connect the temporary user with the low-granularity details saved in reading events (above)
  • erase_subject - sent only once, if logged in
    • attributes: none - if this is sent and no registration event is associated with this user, then the user had an account before the experiment started and was logged out when we first saw them, so we remove their data from the final analysis.
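
To make the page_load processing above more concrete, here is a rough sketch of how the attributes might be coarsened before landing in HDFS. It is not the actual pipeline: the bucket names, the search engine and Wikimedia host lists, and the popular_page_ids set are all hypothetical placeholders. The only facts relied on are that MediaWiki namespace 0 is the main namespace and that positive odd namespace ids are talk pages.

```python
from urllib.parse import urlparse

# All bucket names and host lists below are hypothetical placeholders,
# not values taken from the Instrumentation Spec.
SEARCH_ENGINE_LABELS = {"google", "bing", "duckduckgo"}
WIKIMEDIA_SUFFIXES = (".wikipedia.org", ".wikimedia.org")


def classify_referrer(referrer):
    """Reduce a raw referrer URL to a coarse class."""
    if not referrer:
        return "none"
    host = urlparse(referrer).hostname or ""
    if not host:
        return "unknown"
    if host.endswith(WIKIMEDIA_SUFFIXES):
        return "internal"
    if SEARCH_ENGINE_LABELS & set(host.split(".")):
        return "search"
    return "external"


def group_namespace(namespace_id):
    """Group namespace ids: 0 is main, positive odd ids are talk pages."""
    if namespace_id == 0:
        return "content"
    if namespace_id > 0 and namespace_id % 2 == 1:
        return "talk"
    return "other"


def coarsen_page_load(event, popular_page_ids):
    """Keep only low-granularity attributes of a page_load event."""
    return {
        "is_popular_page": event["page_id"] in popular_page_ids,
        "namespace_group": group_namespace(event["page_namespace_id"]),
        "client_platform": event["client_platform"],
        "mediawiki_database": event["mediawiki_database"],
        "referrer_class": classify_referrer(event.get("action_context")),
    }
```

The point of the design is that neither the page_id nor the raw referrer survives processing; only the boolean and the coarse classes do.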

The user_ids will also be used to connect potential future editing careers with the high level attributes/buckets of readers as processed from the reading events.
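
A minimal sketch of how the erase_subject rule and the user_id linkage could look at analysis time. It assumes every event carries a subject_id and an action field naming the event, and that reader_buckets maps a subject_id to its already-processed reading attributes; those field names and shapes are assumptions for illustration, not the schema from the spec.

```python
REGISTRATION_ACTIONS = {"named_registration", "temporary_registration"}


def subjects_to_erase(events):
    """Subjects that sent erase_subject without any registration event."""
    erased, registered = set(), set()
    for event in events:
        if event["action"] == "erase_subject":
            erased.add(event["subject_id"])
        elif event["action"] in REGISTRATION_ACTIONS:
            registered.add(event["subject_id"])
    return erased - registered


def link_users_to_reader_buckets(events, reader_buckets):
    """Map (mediawiki_database, user_id) to the reader bucket of the same subject."""
    excluded = subjects_to_erase(events)
    linked = {}
    for event in events:
        subject = event["subject_id"]
        # Skip erased subjects and events that carry no user_id (e.g. page_load).
        if subject in excluded or "user_id" not in event:
            continue
        if subject in reader_buckets:
            key = (event["mediawiki_database"], event["user_id"])
            linked[key] = reader_buckets[subject]
    return linked
```

The resulting mapping is what would later be joined against editing activity, so that editing careers can be compared across reader buckets without keeping the detailed reading events.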

The following scenarios have been considered:

| has local account | has global account | is logged in | NOTES |
| --- | --- | --- | --- |
| Y | Y | Y | on first render after account creation, send named_registration or temporary_registration; if not first render after account creation, send erase_subject to discount from experiment |
| Y | Y | N | send page_load, send edit_attempt, send edit_by_temporary because we don't know about their accounts if they don't log in |
| Y | N | * | (not possible, global accounts are created automatically) |
| N | Y | Y | (not possible, if you're logged into a global account, you're logged in locally too) |
| N | Y | N | send page_load, send edit_attempt, send edit_by_temporary because we don't know about their accounts if they don't log in |
| N | N | Y | (not possible to login without an account) |
| N | N | N | send page_load, send edit_attempt, send edit_by_temporary, this is the simplest case |
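
The scenarios above collapse to a small decision function. The sketch below is only meant to check the table for consistency and is not the code from the patch; the function name and the first_render_after_creation and is_temporary parameters are hypothetical.

```python
def eligible_events(has_local, has_global, logged_in,
                    first_render_after_creation=False, is_temporary=False):
    """Return the events a client may send in a given account scenario."""
    if has_local and not has_global:
        # Global accounts are created automatically.
        raise ValueError("local account without a global account is not possible")
    if logged_in and not has_local:
        # Covers both "logged in globally but not locally" and "no account at all".
        raise ValueError("cannot be logged in without a local account")
    if logged_in:
        if first_render_after_creation:
            return ["temporary_registration" if is_temporary else "named_registration"]
        # The account predates the experiment: discount this subject.
        return ["erase_subject"]
    # Logged out: accounts are unknown, so all three rows behave the same.
    return ["page_load", "edit_attempt", "edit_by_temporary"]
```

Note that the three logged-out rows are indistinguishable to the client, which is why the account columns only matter for ruling out the impossible combinations.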

Change #1248580 had a related patch set uploaded (by Milimetric; author: Milimetric):

[mediawiki/extensions/WikimediaEvents@master] Implement instrumentation plan as defined in task

https://gerrit.wikimedia.org/r/1248580

Change #1250249 had a related patch set uploaded (by TChin; author: TChin):

[operations/mediawiki-config@master] Add stream config for attribution research

https://gerrit.wikimedia.org/r/1250249

Change #1250249 merged by jenkins-bot:

[operations/mediawiki-config@master] Add stream config for attribution research

https://gerrit.wikimedia.org/r/1250249

Change #1248580 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@master] ext.wikimediaEvents: Add instrumentation for attribution research

https://gerrit.wikimedia.org/r/1248580

Change #1255763 had a related patch set uploaded (by Milimetric; author: Milimetric):

[operations/mediawiki-config@master] testKitchen: Add custom stream name

https://gerrit.wikimedia.org/r/1255763

Change #1255769 had a related patch set uploaded (by Milimetric; author: Milimetric):

[mediawiki/extensions/WikimediaEvents@master] ext.wikimediaEvents: Fix attribution experiment sending

https://gerrit.wikimedia.org/r/1255769

Change #1256486 had a related patch set uploaded (by Milimetric; author: Milimetric):

[mediawiki/extensions/WikimediaEvents@master] Add tick instrumentation to attribution research

https://gerrit.wikimedia.org/r/1256486

Change #1255769 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@master] ext.wikimediaEvents: Fix attribution experiment sending

https://gerrit.wikimedia.org/r/1255769

Change #1256486 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@master] Add tick instrumentation to attribution research

https://gerrit.wikimedia.org/r/1256486

Change #1255763 merged by jenkins-bot:

[operations/mediawiki-config@master] testKitchen: Add custom stream name

https://gerrit.wikimedia.org/r/1255763

Mentioned in SAL (#wikimedia-operations) [2026-03-23T20:34:43Z] <dani@deploy2002> Started scap sync-world: Backport for [[gerrit:1254448|Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450|Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452|Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763|testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120|Enable wgCampaignEventsEnableEventGoals in

Mentioned in SAL (#wikimedia-operations) [2026-03-23T20:36:38Z] <dani@deploy2002> milimetric, daimona, dani: Backport for [[gerrit:1254448|Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450|Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452|Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763|testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120|Enable wgCampaignEventsEnableEventGoals i

Mentioned in SAL (#wikimedia-operations) [2026-03-23T20:42:10Z] <dani@deploy2002> Finished scap sync-world: Backport for [[gerrit:1254448|Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450|Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452|Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763|testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120|Enable wgCampaignEventsEnableEventGoals in