Fix the data collection for ReferencePreviews
Closed, Resolved (Public)

Description

As stated in https://phabricator.wikimedia.org/T347758#9416599, the data we gather is, to put it mildly, not ideal.

I see no good way to make sense of these numbers retrospectively, because the flaws in how they were collected seem too impactful. We might want to create something new (although the schemas seem fine). At the very least, we should have a clear cut-off once we change the logic that triggers the logging.

Acceptance criteria:

Fix the event logging by:

  • Making sure we don't have a race condition that depends on the order in which different scripts load
  • Taking conflicting gadgets into account when counting events for the disabled bucket
  • Making clear when the "new" logging started
  • Re-thinking where we want to apply sampling

Note that the general structure of the above schemas could stay the same.
Note:
We needed to create a new schema because the ones currently used are already deprecated.
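The second acceptance criterion can be sketched as a single bucket decision. This is only an illustration under assumptions: `PreviewState`, `Bucket`, and `resolveBucket` are hypothetical names and do not come from the Popups or Cite code.

```typescript
// Hypothetical inputs; these names are illustrative only.
interface PreviewState {
  // The user's preference has reference previews switched on.
  userPreferenceEnabled: boolean;
  // A gadget that conflicts with reference previews is active on this page.
  conflictingGadgetActive: boolean;
}

type Bucket = "enabled" | "disabled";

// A conflicting gadget suppresses the feature, so the event belongs in the
// "disabled" bucket even when the preference itself is on.
function resolveBucket(state: PreviewState): Bucket {
  return state.userPreferenceEnabled && !state.conflictingGadgetActive
    ? "enabled"
    : "disabled";
}
```

Deriving the bucket from the effective state, rather than from the preference alone, is what would keep gadget users out of the enabled counts.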

Event Timeline

The best option I can think of so far would be:

  • Keep tracking pageview baseline statistics in the Cite extension, but independent of whether ReferencePreviews is enabled or disabled.
  • Keep adding handlers to track ReferenceContentLink and Footnote click interactions in the Cite extension, but have them check, on click, another global that is set when ReferencePreviews is enabled. *
  • Keep tracking pageviews in the Popups extension so we can use these to get the differences in the baseline right.
  • Fix the tracking in the Popups extension so we can really distinguish events that have the feature disabled / enabled. *

*) Taking disables due to conflicting gadgets etc. into account.
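The click-handler idea above could look roughly like the following sketch. It assumes a shared state object standing in for the global; `SharedState`, `BaselineEvent`, `makeClickLogger`, and the action names are all hypothetical and not taken from either extension.

```typescript
// Hypothetical stand-in for the global that Popups would set once
// reference previews are actually visible.
interface SharedState {
  referencePreviewsVisible?: boolean;
}

// Hypothetical event shape for the baseline click tracking.
interface BaselineEvent {
  action: "clickedReferenceContentLink" | "clickedFootnote";
  previewsEnabled: boolean;
}

// The flag is read when the click happens, not when the handler is created,
// so it does not matter which script loaded first.
function makeClickLogger(
  shared: SharedState,
  action: BaselineEvent["action"],
  send: (event: BaselineEvent) => void
): () => void {
  return () => {
    send({ action, previewsEnabled: shared.referencePreviewsVisible === true });
  };
}
```

Because the lookup is deferred to click time, a handler registered before the flag is set still reports the correct state, which is the load-order independence the acceptance criteria ask for.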

At the very least, we should have a clear cut-off once we change the logic that triggers the logging.

This can be accomplished by bumping the event schema version in the producer. Our consumers can filter on this number.
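On the consumer side, the filter could be sketched as below. This assumes each event carries its schema version as a plain `major.minor.patch` string in a field called `schemaVersion`; that field name, `LoggedEvent`, and the helper functions are hypothetical, and the real events may expose the version differently.

```typescript
// Hypothetical event shape; the real field carrying the version may differ.
interface LoggedEvent {
  schemaVersion: string; // assumed "major.minor.patch"
  action: string;
}

function parseVersion(version: string): number[] {
  return version.split(".").map((part) => parseInt(part, 10));
}

// Compare numerically per component, so "1.10.0" sorts after "1.9.0".
function isAtLeast(version: string, minimum: string): boolean {
  const a = parseVersion(version);
  const b = parseVersion(minimum);
  for (let i = 0; i < 3; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) {
      return x > y;
    }
  }
  return true;
}

// Keep only events produced at or after the version bump.
function eventsSinceCutover(events: LoggedEvent[], minimum: string): LoggedEvent[] {
  return events.filter((event) => isAtLeast(event.schemaVersion, minimum));
}
```

Comparing components as numbers rather than as strings avoids misordering versions once a component reaches two digits.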

Keep tracking pageview baseline statistics in the Cite extension, but independent of whether ReferencePreviews is enabled or disabled.

If we ignore whether ReferencePreviews is enabled, then we can skip collection entirely and rely on standard pageview measurements such as pageview_hourly instead.

Keep tracking pageviews in the Popups extension so we can use these to get the differences in the baseline right.

That sounds good. Let's be sure to record this only after the race condition has settled and after the conflicting-gadget guard condition has been checked.

@Jdlrobson's work should also eliminate the race condition entirely, by allowing us to reimplement the Cite popups as a hook provided in the Cite extension, which only loads in the presence of both extensions.

Change 989142 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[schemas/event/secondary@master] Add schema to collect baseline events in Cite for reference previews

https://gerrit.wikimedia.org/r/989142

Change 989189 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[mediawiki/extensions/Popups@master] Set a global when reference previews are visible.

https://gerrit.wikimedia.org/r/989189

Change 989191 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[mediawiki/extensions/Cite@master] Fix event logging for the reference previews baseline

https://gerrit.wikimedia.org/r/989191

Change 989142 abandoned by WMDE-Fisch:

[schemas/event/secondary@master] Add schema to collect baseline events in Cite for reference previews

Reason:

Using monoschema instead.

https://gerrit.wikimedia.org/r/989142

Change 989204 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[operations/mediawiki-config@master] [beta] Allow Cite events for reference previews baseline stats

https://gerrit.wikimedia.org/r/989204

@Jdlrobson's work should also eliminate the race condition entirely, by allowing us to reimplement the Cite popups as a hook provided in the Cite extension, which only loads in the presence of both extensions.

Perfect timing! :) T326692#9454918 is now ready for WMDE review!

Change 989204 merged by jenkins-bot:

[operations/mediawiki-config@master] [beta] Allow Cite events for reference previews baseline stats

https://gerrit.wikimedia.org/r/989204

Change 991440 had a related patch set uploaded (by Jdlrobson; author: Jdlrobson):

[mediawiki/extensions/Popups@master] Add new @stable event.Popups.SettingChange event

https://gerrit.wikimedia.org/r/991440

Change 989189 merged by jenkins-bot:

[mediawiki/extensions/Popups@master] Set a global when reference previews are visible.

https://gerrit.wikimedia.org/r/989189

Change 989191 merged by jenkins-bot:

[mediawiki/extensions/Cite@master] Fix event logging for the reference previews baseline

https://gerrit.wikimedia.org/r/989191

Change 992395 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[mediawiki/extensions/Cite@master] Don't submit metadata when using the monoschema

https://gerrit.wikimedia.org/r/992395

Change 992411 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[operations/mediawiki-config@master] Allow Cite events for reference previews baseline stats

https://gerrit.wikimedia.org/r/992411

Change 992411 merged by jenkins-bot:

[operations/mediawiki-config@master] Allow Cite events for reference previews baseline stats

https://gerrit.wikimedia.org/r/992411

Mentioned in SAL (#wikimedia-operations) [2024-01-24T08:05:56Z] <logmsgbot> wmde-fisch@deploy2002 Started scap: Backport for [[gerrit:992411|Allow Cite events for reference previews baseline stats (T353798)]]

Mentioned in SAL (#wikimedia-operations) [2024-01-24T08:07:43Z] <logmsgbot> wmde-fisch@deploy2002 wmde-fisch: Backport for [[gerrit:992411|Allow Cite events for reference previews baseline stats (T353798)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-01-24T08:17:18Z] <logmsgbot> wmde-fisch@deploy2002 Started scap: Backport for [[gerrit:992411|Allow Cite events for reference previews baseline stats (T353798)]]

Mentioned in SAL (#wikimedia-operations) [2024-01-24T08:18:49Z] <logmsgbot> wmde-fisch@deploy2002 wmde-fisch: Backport for [[gerrit:992411|Allow Cite events for reference previews baseline stats (T353798)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-01-24T08:25:50Z] <logmsgbot> wmde-fisch@deploy2002 Finished scap: Backport for [[gerrit:992411|Allow Cite events for reference previews baseline stats (T353798)]] (duration: 08m 32s)

Change 992631 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[operations/mediawiki-config@master] Add mediawiki.reference_previews to wgEventLoggingStreamNames

https://gerrit.wikimedia.org/r/992631

Change 992631 merged by jenkins-bot:

[operations/mediawiki-config@master] Add mediawiki.reference_previews to wgEventLoggingStreamNames

https://gerrit.wikimedia.org/r/992631

Mentioned in SAL (#wikimedia-operations) [2024-01-24T14:04:00Z] <logmsgbot> lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:992631|Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798)]]

Mentioned in SAL (#wikimedia-operations) [2024-01-24T14:05:35Z] <logmsgbot> lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Backport for [[gerrit:992631|Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-01-24T14:14:45Z] <logmsgbot> lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:992631|Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798)]] (duration: 10m 52s)

Change 991440 merged by jenkins-bot:

[mediawiki/extensions/Popups@master] Add new @stable event.Popups.SettingChange event

https://gerrit.wikimedia.org/r/991440

WMDE-Fisch updated the task description.

Change #992395 abandoned by WMDE-Fisch:

[mediawiki/extensions/Cite@master] Don't submit metadata when using the monoschema

Reason:

Will remove this whole tracking in foreseeable future anyways.

https://gerrit.wikimedia.org/r/992395