As stated in https://phabricator.wikimedia.org/T347758#9416599, the data we currently gather is not ideal.
- The numbers in Schema:ReferencePreviewsBaseline and Schema:ReferencePreviewsCite could be wrong. There might be false positives counted for the former that should belong to the latter due to race conditions if the JS global is set in the Page-Previews extension.
- The numbers in Schema:ReferencePreviewsCite might be way too high, because they do not reflect whether Reference Previews are really visible. If conflicting gadgets are present, events should not count as "with reference previews enabled".
- Slightly unrelated - because we'll not compare absolute numbers - but: Schema:ReferencePreviewsBaseline numbers have a 1:1000 sampling applied that might or might not make sense for both schemas.
I see no really good way to retrospectively make sense of these numbers, because the flaws in their collection seem too impactful. We might want to create something new (although the schemas seem fine). At the very least, we should have a clear cut-off after we change the logic that triggers the logging.
Acceptance criteria:
Fix the event logging by:
- Making sure we don't have a race condition that depends on the order in which different scripts load
- Taking conflicting gadgets into account when counting events for the disabled bucket
- Making clear when the "new" logging started
- Re-thinking where we want to apply sampling
Note that the general structure of the above schemas could stay the same.
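A minimal sketch of how the race-condition, gadget and sampling criteria could fit together, assuming the bucket is resolved at the moment an event is logged rather than when a script loads. All names here (`PageState`, `resolveBucket`, `isSampled`, `bucketForEvent`) are hypothetical and not taken from the Cite or Page-Previews code; how the two flags are actually read is left open.

```typescript
// Hypothetical snapshot of the page state; field names are assumptions.
interface PageState {
  referencePreviewsEnabled: boolean; // the feature is switched on for this user
  conflictingGadgetActive: boolean;  // a gadget that overrides reference previews is loaded
}

type Bucket = 'enabled' | 'disabled';

// Events only count as "enabled" when previews are on AND no conflicting gadget is present.
function resolveBucket(state: PageState): Bucket {
  return state.referencePreviewsEnabled && !state.conflictingGadgetActive
    ? 'enabled'
    : 'disabled';
}

// The same 1-in-`rate` sampling for both buckets, based on a non-negative
// integer token that stays fixed for the session (assumption).
function isSampled(sessionToken: number, rate: number): boolean {
  return sessionToken % rate === 0;
}

// The state is read through a callback when the event fires, so the result
// does not depend on which script happened to load first.
function bucketForEvent(
  readState: () => PageState,
  sessionToken: number,
  rate: number
): Bucket | null {
  if (!isSampled(sessionToken, rate)) {
    return null; // not sampled: log nothing
  }
  return resolveBucket(readState());
}
```

Passing a callback instead of a precomputed flag is the point of the sketch: a JS global that is set late by another extension is still seen correctly, because nothing is decided before the first event.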
Note:
We needed to create a new schema because the ones currently used are already deprecated.