Fix the data collection for ReferencePreviews
Closed, Resolved (Public)

Description

As stated in https://phabricator.wikimedia.org/T347758#9416599, the data we gather has several flaws:

The numbers in Schema:ReferencePreviewsBaseline and Schema:ReferencePreviewsCite could be wrong. Events that belong to the latter might be counted for the former as false positives due to race conditions, since the JS global is set in the Page-Previews extension.
The numbers in Schema:ReferencePreviewsCite might be way too high, because they do not reflect whether Reference Previews are actually visible. If conflicting gadgets are present, events should not count as "with reference previews enabled".
Slightly unrelated, because we will not compare absolute numbers: the Schema:ReferencePreviewsBaseline numbers have 1:1000 sampling applied, which might or might not make sense for both schemas.

I see no really good way to make sense of these numbers retrospectively, because the flaws in their collection seem too impactful. We might want to create something new (although the schemas seem fine). At least we should have a clear cut after we changed the logic that triggers the logging.

Acceptance criteria:

Fix the event logging by:

Making sure we don't have a race condition that depends on the order in which different scripts load
Taking conflicting gadgets into account when counting events for the disabled bucket
Making clear when the "new" logging started
Re-thinking where we want to apply sampling
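
On the sampling point: if the two event streams end up with different sampling rates, their raw counts cannot be compared until each is scaled by its own rate. The following is only a minimal sketch of that arithmetic. The names (`StreamCount`, `estimateTotal`) are hypothetical and the counts are made up for illustration; none of this is taken from the Cite or Popups code.

```typescript
// Hypothetical illustration: scale sampled event counts back to estimated totals.
interface StreamCount {
  name: string;
  observed: number;   // events actually logged
  sampleRate: number; // 1 = unsampled, 1000 = one event in a thousand is logged
}

// Estimate how many events happened, given how many were logged.
function estimateTotal(stream: StreamCount): number {
  return stream.observed * stream.sampleRate;
}

// Made-up numbers, for illustration only.
const baseline: StreamCount = { name: 'baseline', observed: 50, sampleRate: 1000 };
const cite: StreamCount = { name: 'cite', observed: 20000, sampleRate: 1 };

// Comparing raw counts mixes two different sampling rates.
const naiveRatio = cite.observed / baseline.observed;

// Comparing estimated totals puts both streams on the same scale.
const scaledRatio = estimateTotal(cite) / estimateTotal(baseline);
```

With these made-up inputs the naive ratio and the scaled ratio differ by a factor of 1000, which is the kind of distortion to rule out before comparing the two buckets.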

~~Note that the general structure of the above schemas could stay the same.~~
Note:
We needed to create a new schema because the ones currently used are already deprecated.

Details

Subject	Repo	Branch	Lines +/-
Don't submit metadata when using the monoschema	mediawiki/extensions/Cite	master	+9 -9
Add new @stable event.Popups.SettingChange event	mediawiki/extensions/Popups	master	+22 -12
Add mediawiki.reference_previews to wgEventLoggingStreamNames	operations/mediawiki-config	master	+1 -0
Allow Cite events for reference previews baseline stats	operations/mediawiki-config	master	+20 -19
Fix event logging for the reference previews baseline	mediawiki/extensions/Cite	master	+20 -24
Set a global when reference previews are visible.	mediawiki/extensions/Popups	master	+18 -10
[beta] Allow Cite events for reference previews baseline stats	operations/mediawiki-config	master	+19 -0
Add schema to collect baseline events in Cite for reference previews	schemas/event/secondary	master	+144 -0


Related Objects

Status	Assigned	Task
Open	None	T256004 Decide if we keep the wgPopupsReferencePreviews feature switch
Resolved	awight	T282999 Enable Reference Previews on all wikis using the Popups extension, on Nov 21
Resolved	None	T347758 Replace the Reference Previews Grafana board with a Superset board fitting our needs
Resolved	WMDE-Fisch	T353798 Fix the data collection for ReferencePreviews

Event Timeline

WMDE-Fisch created this task. Dec 20 2023, 12:08 PM

Restricted Application added a subscriber: Aklapper. Dec 20 2023, 12:08 PM

WMDE-Fisch updated the task description. Dec 20 2023, 12:33 PM

WMDE-Fisch updated the task description. Dec 20 2023, 12:36 PM

WMDE-Fisch added a project: WMDE-TechWish-Sprint-2023-12-06. Dec 20 2023, 3:11 PM

thiemowmde added a project: WMDE-TechWish-Sprint-2024-01-04. Jan 4 2024, 12:56 PM

thiemowmde triaged this task as High priority. Jan 4 2024, 1:18 PM

WMDE-Fisch claimed this task. Jan 8 2024, 10:22 AM

WMDE-Fisch moved this task from Sprint Backlog to Doing on the WMDE-TechWish-Sprint-2024-01-04 board.

The best option I can think of so far would be:

Keep tracking pageview baseline statistics in the Cite extension, but independently of whether ReferencePreviews is enabled or disabled.
Keep adding handlers in the Cite extension to track ReferenceContentLink and Footnote click interactions, but on click have them check another global that is set when ReferencePreviews are enabled. *
Keep tracking pageviews in the Popups extension so we can use these to get the differences in the baseline right.
Fix the tracking in the Popups extension so we can really distinguish events that have the feature disabled / enabled. *

*) Taking into account cases where the feature is disabled due to conflicting gadgets etc.
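
The two starred points can be sketched roughly as follows: the bucket is decided when the click happens rather than when the script loads, so the order in which the two extensions' scripts run should no longer matter. All names here (`PageState`, `referencePreviewsVisible`, `conflictingGadgetActive`, `bucketForClick`) are hypothetical stand-ins, not the actual globals or functions in Cite or Popups.

```typescript
// Hypothetical page state; in a real page this would be read from globals.
interface PageState {
  // Set by the previews code once it knows previews are actually shown.
  referencePreviewsVisible?: boolean;
  // True when a gadget that replaces or suppresses the previews is active.
  conflictingGadgetActive?: boolean;
}

type Bucket = 'previews_enabled' | 'previews_disabled';

// Read the state lazily, at event time, so load order cannot cause a wrong bucket.
function bucketForClick(getState: () => PageState): Bucket {
  const state = getState();
  const visible = state.referencePreviewsVisible === true
    && state.conflictingGadgetActive !== true;
  return visible ? 'previews_enabled' : 'previews_disabled';
}

// Simulate a late-loading script: the flag is only set after the handler exists.
const page: PageState = {};
const onClick = (): Bucket => bucketForClick(() => page);

const beforeLoad = onClick();          // flag not set yet
page.referencePreviewsVisible = true;
const afterLoad = onClick();           // flag now set
page.conflictingGadgetActive = true;
const withGadget = onClick();          // gadget overrides the flag
```

The handler is registered before the flag exists, yet each click still lands in the bucket that matches the page state at that moment, including the case where a conflicting gadget hides the previews.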

At least we should have a clear cut after we changed the logic that triggers the logging.

This can be accomplished by bumping the event schema version in the producer. Our consumers can filter on this number.
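
A rough sketch of such a consumer-side filter, assuming each event carries a `$schema` URI that ends in a semantic version. The field name is my understanding of the Event Platform convention and should be checked; the URIs, the cutoff version, and the helper names (`schemaVersion`, `atLeast`) are made up for illustration.

```typescript
// Hypothetical event shape: only the field used here is modelled.
interface LoggedEvent {
  $schema: string; // e.g. "/example/schema/1.1.0" (made-up URI)
}

// Parse the trailing "major.minor.patch" of a schema URI into numbers.
function schemaVersion(uri: string): number[] {
  const last = uri.split('/').pop() ?? '';
  return last.split('.').map((part) => Number(part));
}

// True when version a is greater than or equal to version b.
function atLeast(a: number[], b: number[]): boolean {
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) {
      return x > y;
    }
  }
  return true;
}

// Made-up cutoff: the first version emitted by the fixed logging code.
const cutoff = [1, 1, 0];

const events: LoggedEvent[] = [
  { $schema: '/example/schema/1.0.0' },
  { $schema: '/example/schema/1.1.0' },
  { $schema: '/example/schema/1.2.3' },
];

// Keep only events produced at or after the cutoff version.
const kept = events.filter((e) => atLeast(schemaVersion(e.$schema), cutoff));
```

Comparing the version numerically, part by part, avoids the pitfall of string comparison, where "1.10.0" would sort before "1.9.0".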

Keep tracking pageview baseline statistics in the Cite extension, but independently of whether ReferencePreviews is enabled or disabled.

If we ignore the ReferencePreview enablement, then we can skip collection entirely and rely on the standard pageviews measurements instead, such as pageview_hourly.

Keep tracking pageviews in the Popups extension so we can use these to get the differences in the baseline right.

That sounds good—let's be sure to record this after the race condition, and after the conflicting gadget guard condition.

@Jdlrobson's work should also eliminate the race condition entirely, by allowing us to reimplement the Cite popups as a hook provided in the Cite extension, which only loads in the presence of both extensions.

Change 989142 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[schemas/event/secondary@master] Add schema to collect baseline events in Cite for reference previews

https://gerrit.wikimedia.org/r/989142

gerritbot added a project: Patch-For-Review. Jan 9 2024, 1:33 PM

Change 989189 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[mediawiki/extensions/Popups@master] Set a global when reference previews are visible.

https://gerrit.wikimedia.org/r/989189

Change 989191 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[mediawiki/extensions/Cite@master] Fix event logging for the reference previews baseline

https://gerrit.wikimedia.org/r/989191

Change 989142 abandoned by WMDE-Fisch:

[schemas/event/secondary@master] Add schema to collect baseline events in Cite for reference previews

Reason:

Using monoschema instead.

https://gerrit.wikimedia.org/r/989142

Change 989204 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[operations/mediawiki-config@master] [beta] Allow Cite events for reference previews baseline stats

https://gerrit.wikimedia.org/r/989204

WMDE-Fisch moved this task from Doing to Tech Review on the WMDE-TechWish-Sprint-2024-01-04 board. Jan 9 2024, 5:06 PM

In T353798#9444594, @awight wrote:

At least we should have a clear cut after we changed the logic that triggers the logging.

This can be accomplished by bumping the event schema version in the producer. Our consumers can filter on this number.

Keep tracking pageview baseline statistics in the Cite extension, but independently of whether ReferencePreviews is enabled or disabled.

If we ignore the ReferencePreview enablement, then we can skip collection entirely and rely on the standard pageviews measurements instead, such as pageview_hourly.

Keep tracking pageviews in the Popups extension so we can use these to get the differences in the baseline right.

That sounds good—let's be sure to record this after the race condition, and after the conflicting gadget guard condition.

@Jdlrobson's work should also eliminate the race condition entirely, by allowing us to reimplement the Cite popups as a hook provided in the Cite extension, which only loads in the presence of both extensions.

Perfect timing! :) T326692#9454918 is now ready for WMDE review!

WMDE-Fisch added a project: WMDE-TechWish-Sprint-2024-01-17. Jan 17 2024, 9:43 AM

WMDE-Fisch moved this task from Sprint Backlog to Tech Review on the WMDE-TechWish-Sprint-2024-01-17 board.

Change 989204 merged by jenkins-bot:

[operations/mediawiki-config@master] [beta] Allow Cite events for reference previews baseline stats

https://gerrit.wikimedia.org/r/989204

Change 991440 had a related patch set uploaded (by Jdlrobson; author: Jdlrobson):

[mediawiki/extensions/Popups@master] Add new @stable event.Popups.SettingChange event

https://gerrit.wikimedia.org/r/991440

Change 989189 merged by jenkins-bot:

[mediawiki/extensions/Popups@master] Set a global when reference previews are visible.

https://gerrit.wikimedia.org/r/989189

Change 989191 merged by jenkins-bot:

[mediawiki/extensions/Cite@master] Fix event logging for the reference previews baseline

https://gerrit.wikimedia.org/r/989191

ReleaseTaggerBot added a project: MW-1.42-notes (1.42.0-wmf.15; 2024-01-23). Jan 18 2024, 2:00 PM

Change 992395 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[mediawiki/extensions/Cite@master] Don't submit metadata when using the monoschema

https://gerrit.wikimedia.org/r/992395

Change 992411 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[operations/mediawiki-config@master] Allow Cite events for reference previews baseline stats

https://gerrit.wikimedia.org/r/992411

Change 992411 merged by jenkins-bot:

[operations/mediawiki-config@master] Allow Cite events for reference previews baseline stats

https://gerrit.wikimedia.org/r/992411

Mentioned in SAL (#wikimedia-operations) [2024-01-24T08:05:56Z] <logmsgbot> wmde-fisch@deploy2002 Started scap: Backport for [[gerrit:992411|Allow Cite events for reference previews baseline stats (T353798)]]

Mentioned in SAL (#wikimedia-operations) [2024-01-24T08:07:43Z] <logmsgbot> wmde-fisch@deploy2002 wmde-fisch: Backport for [[gerrit:992411|Allow Cite events for reference previews baseline stats (T353798)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-01-24T08:17:18Z] <logmsgbot> wmde-fisch@deploy2002 Started scap: Backport for [[gerrit:992411|Allow Cite events for reference previews baseline stats (T353798)]]

Mentioned in SAL (#wikimedia-operations) [2024-01-24T08:18:49Z] <logmsgbot> wmde-fisch@deploy2002 wmde-fisch: Backport for [[gerrit:992411|Allow Cite events for reference previews baseline stats (T353798)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-01-24T08:25:50Z] <logmsgbot> wmde-fisch@deploy2002 Finished scap: Backport for [[gerrit:992411|Allow Cite events for reference previews baseline stats (T353798)]] (duration: 08m 32s)

Change 992631 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[operations/mediawiki-config@master] Add mediawiki.reference_previews to wgEventLoggingStreamNames

https://gerrit.wikimedia.org/r/992631

Change 992631 merged by jenkins-bot:

[operations/mediawiki-config@master] Add mediawiki.reference_previews to wgEventLoggingStreamNames

https://gerrit.wikimedia.org/r/992631

Mentioned in SAL (#wikimedia-operations) [2024-01-24T14:04:00Z] <logmsgbot> lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:992631|Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798)]]

Mentioned in SAL (#wikimedia-operations) [2024-01-24T14:05:35Z] <logmsgbot> lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Backport for [[gerrit:992631|Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-01-24T14:14:45Z] <logmsgbot> lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:992631|Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798)]] (duration: 10m 52s)