== Background
With the deployment of Page Previews, we introduce a new form of reading Wikipedia content apart from the standard pageviews. We need to measure this for the same reasons as we do for pageviews. These include providing executives with accurate numbers on the overall level of usage of our content, and the editor community with accurate numbers on the readership of the individual articles and projects they are working on. In particular, based on the previous A/B tests, we expect that the deployment of previews on a wiki will cause the total pageviews to decrease for that wiki, but that "page interactions" – any intentional interaction with a page, i.e. page previews + pageviews – will increase. We would like a way to track this metric over time.
This task captures the frontend instrumentation work needed to ensure clients send all required information to the servers. The backend work for storing this data and aggregating it in a form suitable for analysis is captured in T186728.
== QA Steps
On the beta cluster with Page Previews enabled, visit an article in debug mode (add `?debug=true` to the URL). When a popup has been visible for more than 1 second, the `VirtualPageView` event should be triggered.
== Notes
1. In T184793#3953351, @Ottomata asks that we follow [[https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging/Schema_Guidelines | the EventLogging schema guidelines ]] when creating the Pageview schema (strawdog name) so that it's easier to get events into [[ https://wikitech.wikimedia.org/wiki/Analytics/Systems/Pivot | Pivot ]] and Superset.
== AC
[x] Page interactions are recorded by logging an EventLogging event.
[-] Per T184793#3952974, this instrumentation should be sampled **for testing purposes**. The default sampling rate should be 50%.
NOTE: We estimate that recording a page interaction when a preview has been open for > 1000 ms will correspond to an increase in webrequests per pageview of 0.13%, which corresponds to ~700-800 events/sec (or, roughly, 2x the peak rate from the Page Previews instrumentation). AIUI the Hive EventLogging backend can handle this event rate 💪 but the processors need to be monitored to see if more need to be added.
SIGN OFF NOTE: This was not done and should not be necessary. We are able to enable this per wiki, and having enabled this on Hungarian Wikipedia, we are able to do the same.
[x] As soon as a preview has been visible for > 1000 ms, a page interaction is recorded with the following information:
- Namespace, title and ID of the page that's being previewed
- Namespace, title and ID of the page that's currently being viewed
[x] If a preview can't be generated for a page, i.e. the "generic" preview is shown (defined as in [[https://meta.wikimedia.org/wiki/Schema:Popups|the Popups schema]]), then no page interaction is recorded.
[x] The instrumentation should be feature flagged and the flag should be disabled by default.
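As a rough illustration of how the criteria above fit together, here is a minimal, language-neutral sketch in Python (the real instrumentation lives in the frontend). The names `PageRef`, `build_page_interaction`, the `source`/`previewed` keys and the `"page"` preview type are hypothetical and are not taken from the actual schema. Only the 1000 ms threshold, the "generic" exclusion and the default-off flag come from the AC.

```python
from dataclasses import dataclass, asdict
from typing import Optional

VISIBLE_THRESHOLD_MS = 1000  # from the AC: preview visible for > 1000 ms


@dataclass(frozen=True)
class PageRef:
    """Namespace, title and ID of a page (field names are hypothetical)."""
    namespace: int
    title: str
    page_id: int


def build_page_interaction(
    source: PageRef,
    previewed: PageRef,
    preview_type: str,
    visible_ms: float,
    flag_enabled: bool = False,  # feature flag, disabled by default per the AC
) -> Optional[dict]:
    """Return the event payload, or None if no page interaction is recorded."""
    if not flag_enabled:
        return None
    if preview_type == "generic":
        # No preview could be generated for the page, so nothing is recorded.
        return None
    if visible_ms <= VISIBLE_THRESHOLD_MS:
        return None
    return {
        "source": asdict(source),        # page currently being viewed
        "previewed": asdict(previewed),  # page being previewed
    }
```

In practice the event would be sent by a timer as soon as the threshold is crossed; the pure function above only captures the decision and the payload shape.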
== Open Questions
[ ] The instrumentation should ignore DNT. Requests with the DNT header set are included in our webrequest_raw, [[https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest | webrequest]], and pageviews (aggregated) tables, and this instrumentation is to be considered analogous. This is being discussed in T187277.
---
== Closed Questions
1.
> When a preview has been visible for > 1000 ms
This is subtly different from `$totalInteractionTime > 1000` as `$totalInteractionTime` includes the >= 700 ms to actually show the preview. For reference, [[ https://grafana.wikimedia.org/dashboard/db/reading-web-page-previews?refresh=1m&panelId=6&fullscreen&orgId=1 | the current median time to show a preview is ~740 ms ]].
When do we want to record the page interaction?
See T184793#3896845. When `$totalInteractionTime - $perceivedWait > 1000`.
2. Which URL do we want to request?
We'll use the EventLogging pipeline (per https://lists.wikimedia.org/pipermail/analytics/2018-January/006136.html).
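To make the timing rule from closed question 1 concrete, here is a small sketch in Python of the difference between the two conditions. The function names are hypothetical; the variable names mirror `$totalInteractionTime` and `$perceivedWait`, and the 740 ms figure is the median time to show a preview quoted above.

```python
THRESHOLD_MS = 1000


def visible_time_ms(total_interaction_time_ms: float, perceived_wait_ms: float) -> float:
    """Time the preview was actually on screen, excluding the wait to show it."""
    return total_interaction_time_ms - perceived_wait_ms


def should_record(total_interaction_time_ms: float, perceived_wait_ms: float) -> bool:
    """True when $totalInteractionTime - $perceivedWait > 1000."""
    return visible_time_ms(total_interaction_time_ms, perceived_wait_ms) > THRESHOLD_MS


# A 1500 ms interaction with a 740 ms wait was only visible for 760 ms,
# so it passes the naive `$totalInteractionTime > 1000` check but is not recorded.
naive_check = 1500 > THRESHOLD_MS
recorded = should_record(1500, 740)
```

With these inputs `naive_check` is true while `recorded` is false, which is the subtle difference the question is about.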