Understand current referencing behavior as baseline for ReferencePreviews
Closed, Resolved | Public | 5 Estimated Story Points

Description

Motivation
This ticket is for creating a baseline we can compare ReferencePreviews data against. Maybe this data already exists :)

Acceptance Criteria

  • How often do people click on footnote indicators relative to the pages being opened? E.g., on average, there were 0.03 clicks on a footnote indicator per page opened where Reference Previews was NOT deployed.
  • How often do people click on a link in a reference when Reference Previews is not enabled? (We should compare that number with the sum of clicks on links in the reference pop-up and clicks on links in the references section with the beta feature enabled.)

Preliminary outcome

For the first day of data, with N = 13,570 pageviews, we have:

0.006 footnote clicks per pageview.
0.003 reference content clicks per pageview.

Event Timeline

Should probably use EventLogging as the backend. Sampling might be necessary to keep the volume manageable.

Tracking outbound links will slow down the user's navigation, because we have to wait until our metrics callback completes before the browser can leave the page. We should sample especially sparingly there.
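To make the sampling idea concrete, here is a minimal sketch of a client-side "one in N" sampler. The function names and the `oneInN` parameter are made up for illustration and are not part of EventLogging; the real extension has its own sampling helpers, which we would presumably use instead.

```javascript
// Hypothetical helper: decide whether this pageview is included in the sample.
// `random` is injectable so the decision can be tested deterministically.
function isInSample(oneInN, random = Math.random) {
  if (!Number.isInteger(oneInN) || oneInN < 1) {
    throw new RangeError('oneInN must be a positive integer');
  }
  // random() is in [0, 1), so this is true with probability 1 / oneInN.
  return random() * oneInN < 1;
}

// Hypothetical helper: scale a sampled event count back up to an estimated total.
function estimateTotal(sampledCount, oneInN) {
  return sampledCount * oneInN;
}
```

Making the decision once per pageview and reusing it for every later event on that page keeps click counts and pageview counts comparable.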

awight set the point value for this task to 5. (Sep 17 2019, 12:44 PM)

Look for precedents. Who do we ask? WMF Reading? Analytics?

We also need to track outbound clicks in T214493, so perhaps we split that feature out.

We're probably going to implement this in the Cite extension.

It turns out there is an industry-standard thing for this: https://developer.mozilla.org/en-US/docs/Web/API/Beacon_API

EventLogging supports beacons and Popups provides a nice wrapper. To illustrate:

Popups/src/getPageviewTracker.js:  const url = evLog.makeBeaconUrl( payload );
Popups/src/getPageviewTracker.js:  sendBeacon( url );

The trade-off is that our payload is delivered in the URL itself rather than in a request body, so our schema and its values will have to fit into fewer than 2,000 characters.
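A rough sketch of what the length constraint means in practice. This is not EventLogging's actual `makeBeaconUrl`; `buildBeaconUrl` and `MAX_URL_LENGTH` are hypothetical names, and the assumption that the payload is JSON-serialised and percent-encoded into the query string is mine.

```javascript
// Limit quoted in this discussion; the real limit may differ per browser and server.
const MAX_URL_LENGTH = 2000;

// Hypothetical helper: encode the payload into the URL, or return null if it won't fit.
function buildBeaconUrl(baseUrl, payload) {
  const url = baseUrl + '?' + encodeURIComponent(JSON.stringify(payload));
  // Percent-encoding expands quotes, braces and non-ASCII text,
  // so the check has to be made on the final URL, not on the raw JSON.
  return url.length < MAX_URL_LENGTH ? url : null;
}
```

Returning `null` for an oversized payload lets the caller drop the event explicitly instead of sending a truncated URL.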

Thiemo found this relevant deprecation task for the Popups instrumentation: T193051: Remove all page previews instrumentation code

There's a place in Schema:Popups for "reference" events, but it's unused. Here's a breakdown of actual values:

select event_previewType, count(*) from Popups_16364296 group by event_previewType;
+-------------------+-----------+
| event_previewType | count(*)  |
+-------------------+-----------+
| NULL              | 291380619 |
| generic           |     60583 |
| page              |  19836700 |
+-------------------+-----------+

EventLogging doesn't support a guaranteed-beacon mode, i.e. browsers without beacon support will execute <img> fallback logic, which does block the page unload when clicking on an outbound link. I would prefer to skip non-beacon browsers, but we should discuss this as a team.
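For discussion, skipping non-beacon browsers could look roughly like this. It is only a sketch: `sendIfBeaconSupported` is a hypothetical name, and `nav` stands in for the browser's `navigator` object so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: log only when the Beacon API is available.
function sendIfBeaconSupported(nav, url) {
  if (!nav || typeof nav.sendBeacon !== 'function') {
    // No Beacon API: skip logging rather than fall back to a blocking <img> request.
    return false;
  }
  // sendBeacon returns true when the browser has queued the request for delivery.
  return nav.sendBeacon(url);
}
```

In the browser this would be called as `sendIfBeaconSupported( navigator, url )`. The return value tells the caller whether an event was queued, which could feed a separate "no beacon" counter.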

Meanwhile, I ran a query to estimate how many non-beacon browsers we see in eventlogging data:

select
    count(*),
    http_method
from webrequest
where
    uri_path = '/beacon/event'
    and webrequest_source = 'text'
    and year = 2019
    and month = 9
    and day = 19
group by
    http_method;

682     HEAD
8154329 GET
109671810       POST

8154329 / (8154329 + 109671810)
= .069

Crudely conflating eventlogging request counts with unique users, this would mean that c. 7% of our users don't have beacon support. This is a big number, so we need to either run the fallback code for them or create a new metric to compensate, for example by logging "no beacon" on page load.

Here's a draft schema which should cover our needs for both the baseline and ReferencePreviews metrics:
https://meta.wikimedia.org/wiki/Schema:ReferencePreviews

I'll try a naive implementation in Cite.

Change 538261 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Cite@master] [WIP] Baseline reference interaction tracking

https://gerrit.wikimedia.org/r/538261

Waiting on code review, although we seem to be stuck in an "impossible problems of antiquity" loop. I'm really not sure how to respond to recent reviews.
https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Cite/+/538261/

Change 538261 merged by jenkins-bot:
[mediawiki/extensions/Cite@master] Baseline reference interaction tracking

https://gerrit.wikimedia.org/r/538261

This will be deployed with T214493 in this week's train, so we should monitor client metrics and check initial data for health.

We have a few hours of ReferencePreviewsBaseline in the hadoop event store, shaped like this:

select
  event.action as action,
  count(*)
from referencepreviewsbaseline
where
  year=2019
  and month=10
group by event.action;

action  _c1
pageview        194

That represents 194,000 pageviews, with few enough (roughly speaking, < 0.5%) reference interactions that none have been logged yet. All had referencePreviewsEnabled = false, unsurprisingly.

Everything looks good, let's wait for group2 deployment before increasing the sampling.

Looks like this sampling rate will be fine, we have enough data to start guessing:

clickedFootnote 83
clickedReferenceContentLink     44
pageview        13571

We even caught one person with reference previews enabled:

false   13697
true    1

I don't know how to estimate error margins, so I'll just present the naïve math with our small sample. I've removed the one sample with referencePreviewsEnabled = true (it's just a pageview).

83 / 13,570 = 0.006 footnote clicks per pageview.
44 / 13,570 = 0.003 reference content clicks per pageview.
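On the error margins: one rough way to get a feel for them is a normal-approximation 95% interval around each rate. This is only a sketch under a strong assumption, namely that each pageview independently contributes at most one click, which is not strictly true for our events. `rateWithMargin` is a hypothetical helper, not something from reportupdater or EventLogging.

```javascript
// Normal-approximation interval for `events` occurrences in `trials` pageviews.
// z = 1.96 corresponds to a two-sided 95% interval.
function rateWithMargin(events, trials, z = 1.96) {
  const rate = events / trials;
  const margin = z * Math.sqrt(rate * (1 - rate) / trials);
  return { rate, low: Math.max(0, rate - margin), high: rate + margin };
}

// Counts taken from the numbers above.
const footnoteClicks = rateWithMargin(83, 13570);
const contentClicks = rateWithMargin(44, 13570);
```

With a sample this small the intervals are wide relative to the rates themselves (very roughly 0.005 to 0.007 for footnote clicks), so the baseline should be read as an order of magnitude rather than a precise figure.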

Change 542419 had a related patch set uploaded (by Awight; owner: Awight):
[analytics/reportupdater-queries@master] New report for Reference Previews

https://gerrit.wikimedia.org/r/542419

We should slice this data by wiki, since reference usage probably varies between projects. If this is the case, then we would have to normalize or otherwise consider the site-specific baseline when measuring the impact of Reference Previews.

Slicing using reportupdater's explode_by does what we want, but doesn't come for free: the query has to be run for every wiki. My first reaction is that we should choose a small number of wikis to include in our analysis, for now.

I'm moping about this... We could also tack a "group by" onto the query, but I don't think the results can be easily mapped for reportupdater.

Well, there is indeed a huge variation between wikis, of at least one order of magnitude. I won't paste the summary yet; this feels like something that should be treated more carefully, and the math should be right before publishing.

For our goals, what this means is that we do need to analyze the impact of Reference Previews on each wiki independently.
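As an illustration of per-wiki analysis, the sketch below turns grouped counts into per-wiki rates. The row shape `{ wiki, action, count }` is an assumption about what a "group by wiki, action" query would return, and `ratesByWiki` is a hypothetical helper; the action names are the ones seen in the data above.

```javascript
// Hypothetical helper: compute click rates per pageview for each wiki.
function ratesByWiki(rows) {
  const totals = new Map();
  for (const { wiki, action, count } of rows) {
    const t = totals.get(wiki) ||
      { pageview: 0, clickedFootnote: 0, clickedReferenceContentLink: 0 };
    // Ignore actions this sketch doesn't know about.
    if (Object.prototype.hasOwnProperty.call(t, action)) {
      t[action] += count;
    }
    totals.set(wiki, t);
  }
  const rates = {};
  for (const [wiki, t] of totals) {
    if (t.pageview === 0) {
      continue; // a rate per pageview is undefined without pageviews
    }
    rates[wiki] = {
      footnoteClicksPerPageview: t.clickedFootnote / t.pageview,
      contentClicksPerPageview: t.clickedReferenceContentLink / t.pageview
    };
  }
  return rates;
}
```

Comparing each wiki against its own baseline, rather than against a global average, avoids attributing cross-wiki differences to Reference Previews.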

Change 542419 merged by Mforns:
[analytics/reportupdater-queries@master] New reports for Reference Previews

https://gerrit.wikimedia.org/r/542419

Removing task assignee due to inactivity, as this open task has been assigned for more than two years. See the email sent to the task assignee on February 06th 2022 (and T295729).

Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome.

If this task has been resolved in the meantime, or should not be worked on ("declined"), please update its task status via "Add Action… 🡒 Change Status".

Also see https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips how to best manage your individual work in Phabricator.

thiemowmde claimed this task.