Improve data logging on Special:Diff and Special:MobileDiff
Closed, Resolved · Public

Description

To evaluate the changes we're planning to make to the mobile design of diffs, we need to first instrument some data logging on Special:Diff and Special:MobileDiff, so that we understand how user behaviour changes as a result. This task will track those data logging tasks.

We would like to track clicks to:

  • Actions
    • Undo
    • Thanks
    • Rollback
    • Change visibility
  • Revision navigation links
    • User page
    • Page history
    • Contributions
    • Add to watchlist
    • Previous/Next edit
    • Tags; both the Special:Tags button and individual hyperlinked tags
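As a rough sketch of how the targets listed above might be turned into event names: the identifiers, the `specialDiff` prefix, and the `eventNameFor` helper below are all hypothetical, since the task does not specify a naming scheme.

```typescript
// Hypothetical identifiers for the click targets listed above. The task does
// not define event names, so every string here is an assumption.
type TargetGroup = 'action' | 'navigation';

const TRACKED_TARGETS: Record<string, TargetGroup> = {
  undo: 'action',
  thanks: 'action',
  rollback: 'action',
  changeVisibility: 'action',
  userPage: 'navigation',
  pageHistory: 'navigation',
  contributions: 'navigation',
  addToWatchlist: 'navigation',
  previousEdit: 'navigation',
  nextEdit: 'navigation',
  specialTagsButton: 'navigation',
  individualTag: 'navigation',
};

// Build a dotted event name such as "specialDiff.action.undo".
// Returns null for anything that is not in the tracked list.
function eventNameFor(target: string): string | null {
  if (!Object.prototype.hasOwnProperty.call(TRACKED_TARGETS, target)) {
    return null;
  }
  return `specialDiff.${TRACKED_TARGETS[target]}.${target}`;
}
```

Keeping the list in one table means a click handler only needs to look up the target it was attached to, and untracked links are ignored rather than logged under an ad hoc name.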

Metrics Platform

For these events, we can use the Metrics Platform, which has a default schema tracking a variety of data that we're interested in.

Event Timeline

@mpopov shared some information about what is (and mostly isn't) being logged currently:

I don't think there's any usage tracking – even page views/visits for that would need to be extracted manually from webrequest data. I remember the Community Tech team adding VisualEditor-powered visual diffs to that, but I don't think they instrumented any analytics with that.
iOS has in-app page history & diff viewing with instrumentation but I don't think there's anything that data can tell you about special:diff usage on the web.

I searched for "diff" on datahub.wikimedia.org and codesearch.wmcloud.org/search/ and grafana.wikimedia.org, and that yielded:

but there's no page or revision info, so there's no way to see how diffing performs for different diffs.

This is a perfect use-case for Metrics Platform (see https://wikitech.wikimedia.org/wiki/Metrics_Platform/FAQ and T315091#8311847) and wouldn't require making a new schema or implementing a lot of basics (refer to https://wikitech.wikimedia.org/wiki/Metrics_Platform/Event_Schema for an overview of the pieces of data that Metrics Platform takes care of for you).

cc @EChetty @phuedx @cjming


On a separate note, you may want to consider what contextual data would be helpful to collect when the user performs those actions. Less things like "username or user ID of the user" or "user's edit count at the time of the interaction" (since that'd be taken care of by Metrics Platform) and more:

  • the version of the diff page (e.g. 'legacy', 'new-variant-a', 'new-variant-b')
  • the revision ID of the revision being acted on (thanked, undone, etc.)
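A minimal sketch of attaching that contextual data to a click event. Everything named here is an assumption for illustration: the `diff_version` and `target_rev_id` keys, the `specialDiff.click.` prefix, and the `Dispatch` type, which stands in for whichever submit function the Metrics Platform client actually exposes.

```typescript
// Variant names taken from the examples in the list above.
type DiffPageVersion = 'legacy' | 'new-variant-a' | 'new-variant-b';

// Assumption: custom data is a flat map of snake_case keys to primitives.
type CustomData = Record<string, string | number | boolean | null>;

// Hypothetical stand-in for the real client call that submits an event.
type Dispatch = (eventName: string, customData: CustomData) => void;

// Log one click, always tagging it with the diff page version and adding the
// target revision ID only when the caller supplies one.
function logDiffClick(
  dispatch: Dispatch,
  action: string,
  version: DiffPageVersion,
  targetRevId?: number
): void {
  const data: CustomData = {};
  data['diff_version'] = version;
  if (targetRevId !== undefined) {
    data['target_rev_id'] = targetRevId;
  }
  dispatch('specialDiff.click.' + action, data);
}
```

Passing the dispatcher in as a parameter keeps the sketch independent of any particular client library and makes it easy to check what would be sent.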

This is a perfect use-case for Metrics Platform (see https://wikitech.wikimedia.org/wiki/Metrics_Platform/FAQ and T315091#8311847) and wouldn't require making a new schema or implementing a lot of basics (refer to https://wikitech.wikimedia.org/wiki/Metrics_Platform/Event_Schema for an overview of the pieces of data that Metrics Platform takes care of for you).

I believe @eigyan investigated this as an option with @phuedx for T310852 and we decided against it. Is that accurate, Essex? Could you elaborate on why we decided against using the Metrics Platform?

Greetings @Samwalton9, yes, we did look into the Metrics Platform to help with testing/QA of the platform. During this time we discovered an issue with the config setup and had a pretty tight deadline, so we decided to move to the already established web_tracking instrument. Now that the Metrics Platform is ready, I don't see why it couldn't suit our current tracking needs.

Thanks for this summary, @eigyan. FWIW our experience with the config setup led me to create this page https://wikitech.wikimedia.org/wiki/Metrics_Platform/Creating_a_Stream_Configuration, which (hopefully) fully covers deploying a new stream – from Beta-Cluster-only to per-wiki tweaks.

On a separate note, you may want to consider what contextual data would be helpful to collect when the user performs those actions. Less things like "username or user ID of the user" or "user's edit count at the time of the interaction" (since that'd be taken care of by Metrics Platform) and more:

  • the version of the diff page (e.g. 'legacy', 'new-variant-a', 'new-variant-b')
  • the revision ID of the revision being acted on (thanked, undone, etc.)

Thanks for the links and advice @mpopov! Would page.revision_id be giving us the latter on Special:Diff, or would we need to add it additionally?

page.revision_id is the revision ID of the page at the time of the event, so it would be blank for Special pages. You would need to capture the rev ID of the target at click time and include that separately if you want that info. However, I don't think that you actually need that for your purposes. Of the two ideas I listed you only really need to include the diff page version so that you can later compare how certain interaction rates (e.g. how frequently editors thank) differ between the UI/UX variants.
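If the target revision were wanted after all, capturing it at click time might look like the sketch below. It assumes the action link carries the revision ID in a query parameter (as undo links do with `undo=`); the `revisionIdFromHref` helper and the sample URLs in the usage note are hypothetical.

```typescript
// Hypothetical helper: read a numeric revision ID from one query parameter of
// an action link's href. `param` is assumed to be a plain word such as "undo".
function revisionIdFromHref(href: string, param: string): number | null {
  // Match "?param=123" or "&param=123", ending at "&", "#", or end of string,
  // so that "undoafter=122" is not mistaken for "undo=...".
  const pattern = new RegExp('[?&]' + param + '=(\\d+)(?:&|#|$)');
  const match = pattern.exec(href);
  return match ? Number(match[1]) : null;
}
```

For example, `revisionIdFromHref('/w/index.php?title=Foo&action=edit&undoafter=122&undo=123', 'undo')` returns `123`, while a link without the parameter returns `null`, in which case the event would simply be sent without a revision ID.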

Ahh yep, that makes sense, thanks. And agree that we probably don't need that extra data.

Change 936748 had a related patch set uploaded (by Jsn.sherman; author: Jsn.sherman):

[operations/mediawiki-config@master] log additional events on Special:Diff|MobileDiff

https://gerrit.wikimedia.org/r/936748

Change 936748 merged by jenkins-bot:

[operations/mediawiki-config@master] log additional events on Special:Diff|MobileDiff

https://gerrit.wikimedia.org/r/936748

Mentioned in SAL (#wikimedia-operations) [2023-07-10T20:02:13Z] <samtar@deploy1002> Started scap: Backport for [[gerrit:936748|log additional events on Special:Diff|MobileDiff (T326212)]]

Mentioned in SAL (#wikimedia-operations) [2023-07-10T20:03:35Z] <samtar@deploy1002> samtar and jsn: Backport for [[gerrit:936748|log additional events on Special:Diff|MobileDiff (T326212)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-07-10T20:23:56Z] <samtar@deploy1002> Finished scap: Backport for [[gerrit:936748|log additional events on Special:Diff|MobileDiff (T326212)]] (duration: 21m 42s)

Change 937096 had a related patch set uploaded (by Jsn.sherman; author: Jsn.sherman):

[operations/mediawiki-config@master] log additional events on Special:Diff|MobileDiff

https://gerrit.wikimedia.org/r/937096

Change 937096 merged by jenkins-bot:

[operations/mediawiki-config@master] log additional events on Special:Diff|MobileDiff

https://gerrit.wikimedia.org/r/937096

Mentioned in SAL (#wikimedia-operations) [2023-07-12T20:22:31Z] <samtar@deploy1002> Started scap: Backport for [[gerrit:937096|log additional events on Special:Diff|MobileDiff (T326212)]]

Mentioned in SAL (#wikimedia-operations) [2023-07-12T20:24:04Z] <samtar@deploy1002> jsn and samtar: Backport for [[gerrit:937096|log additional events on Special:Diff|MobileDiff (T326212)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-07-12T20:49:13Z] <samtar@deploy1002> Finished scap: Backport for [[gerrit:937096|log additional events on Special:Diff|MobileDiff (T326212)]] (duration: 26m 41s)

This patch was followed up with https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/937527 to fix the configuration; upon deployment, the instrumentation appears to be working.