Background
With the deployment of Page Previews, we introduce a new form of reading Wikipedia content apart from the standard pageviews. We need to measure this for the same reasons as we do for pageviews. These include providing executives with accurate numbers on the overall level of usage of our content, and the editor community with accurate numbers on the readership of the individual articles and projects they are working on. In particular, based on the previous A/B tests, we expect that the deployment of previews on a wiki will cause the total pageviews to decrease for that wiki, but that "page interactions" – any intentional interaction with a page, i.e. page previews + pageviews – will increase. We would like a way to track this metric over time.
This task captures the frontend instrumentation work needed to ensure clients send all required information to the servers. The backend work for storing this data and aggregating it in a form suitable for analysis is captured in T186728.
QA Steps
On the beta cluster with page previews enabled, visit an article in debug mode (add ?debug=true to the URL). When a popup has been visible for more than 1 second, the VirtualPageView event should be triggered (https://meta.wikimedia.org/wiki/Schema:VirtualPageView).
Notes
- In T184793#3953351, @Ottomata asks that we follow the EventLogging schema guidelines when creating the Pageview schema (strawdog name) so that it's easier to get events into Pivot and Superset.
AC
- Page interactions are recorded by logging an EventLogging event.
- Per T184793#3952974, this instrumentation should be sampled for testing purposes. The default sampling rate should be 50%.
- As soon as a preview has been visible for > 1000 ms, a page interaction is recorded with the following information:
- Namespace, title and ID of the page that's being previewed
- Namespace, title and ID of the page that's currently being viewed
- If a preview can't be generated for a page, i.e. the "generic" preview is shown (defined as in the Popups schema), then no page interaction is recorded.
- The instrumentation should be feature flagged and the flag should be disabled by default.
- The standard schema documentation has been added to the schema talk page and a purging strategy has been defined. In this case, no fields need to be whitelisted, as we're fine with all events being deleted after they have been aggregated per T186728 (see https://meta.wikimedia.org/wiki/Schema_talk:VirtualPageView).
- Events are sent consistently with respect to DNT (T190188).
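The gating rules in the criteria above can be sketched as a single pure function. This is only an illustration under stated assumptions: the type names, the `buildVirtualPageView` helper, the config shape and the event field names (apart from the `source_` prefix, which the testing criteria mention) are hypothetical and are not taken from the Popups codebase or the published schema. The sampling roll is passed in as a number in [0, 1) so the logic stays deterministic.

```typescript
// Hypothetical types and names, for illustration only.
interface PageRef {
  namespace: number;
  title: string;
  id: number;
}

type PreviewType = "page" | "generic";

interface InstrumentationConfig {
  enabled: boolean;     // feature flag
  samplingRate: number; // fraction of sessions sampled, between 0 and 1
}

// Flag disabled by default, default sampling rate of 50%.
const DEFAULT_CONFIG: InstrumentationConfig = { enabled: false, samplingRate: 0.5 };

// Assumed event shape: the page being viewed (source_*) and the page being previewed.
interface VirtualPageViewEvent {
  source_namespace: number;
  source_title: string;
  source_page_id: number;
  page_namespace: number;
  page_title: string;
  page_id: number;
}

const MIN_VISIBLE_MS = 1000;

// Returns the event to log, or null when no page interaction should be recorded.
function buildVirtualPageView(
  config: InstrumentationConfig,
  sessionRoll: number, // random value in [0, 1), drawn once per session
  previewType: PreviewType,
  visibleMs: number,
  source: PageRef,
  previewed: PageRef
): VirtualPageViewEvent | null {
  if (!config.enabled) return null;                   // feature flag is off
  if (sessionRoll >= config.samplingRate) return null; // session not in the sample
  if (previewType === "generic") return null;         // no real preview was generated
  if (visibleMs <= MIN_VISIBLE_MS) return null;       // must be visible for > 1000 ms
  return {
    source_namespace: source.namespace,
    source_title: source.title,
    source_page_id: source.id,
    page_namespace: previewed.namespace,
    page_title: previewed.title,
    page_id: previewed.id,
  };
}
```

Keeping the decision in one function with no side effects makes each criterion checkable in isolation; the actual call into EventLogging would sit outside it.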
Open questions
Closed Questions
When a preview has been visible for > 1000 ms
This is subtly different from $totalInteractionTime > 1000, as $totalInteractionTime includes the >= 700 ms it takes to actually show the preview. For reference, the current median time to show a preview is ~740 ms.
When do we want to record the page interaction?
See T184793#3896845. When $totalInteractionTime - $perceivedWait > 1000.
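As a worked illustration of that condition, the sketch below compares it with the naive $totalInteractionTime > 1000 check. The function and constant names are hypothetical; the 740 ms wait is the median quoted above, and the total interaction times are made-up inputs.

```typescript
const VISIBLE_THRESHOLD_MS = 1000;

// Time the preview was actually on screen: total interaction time
// minus the wait before the preview appeared.
function visibleTime(totalInteractionTime: number, perceivedWait: number): number {
  return totalInteractionTime - perceivedWait;
}

// The agreed condition: $totalInteractionTime - $perceivedWait > 1000.
function shouldRecordInteraction(totalInteractionTime: number, perceivedWait: number): boolean {
  return visibleTime(totalInteractionTime, perceivedWait) > VISIBLE_THRESHOLD_MS;
}

// The naive check, shown only for comparison.
function naiveCheck(totalInteractionTime: number): boolean {
  return totalInteractionTime > VISIBLE_THRESHOLD_MS;
}
```

With a 740 ms wait, a 1500 ms interaction leaves the preview visible for only 760 ms, so `shouldRecordInteraction(1500, 740)` is false even though `naiveCheck(1500)` is true. An 1800 ms interaction gives 1060 ms of visibility and is recorded.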
Which URL do we want to request?
We'll use the EventLogging pipeline (per https://lists.wikimedia.org/pipermail/analytics/2018-January/006136.html).
The instrumentation should ignore DNT. Requests with the DNT header set are included in our webrequest_raw, webrequest, and pageviews (aggregated) tables and this instrumentation is to be considered analogous. After discussion in T187277, this subtask is tracked in T190188.
Testing criteria
- Hover over any link. Ensure the hover lasts at least 1s. Check that a message was sent to EventLogging for the VirtualPageView schema. Check that the contents of the message reflect the current page (source_) and the page being hovered.
- Hover over any link. Ensure the hover lasts less than 1s. Check that a message was NOT sent to EventLogging for the VirtualPageView schema.
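The two scenarios above could also be exercised without real waiting by driving the timer from a fake clock. The sketch below is an assumption about one possible structure, not the extension's implementation: `FakeClock` and `PreviewTracker` are hypothetical names, and the `sent` array stands in for the call to EventLogging. It fires at the 1000 ms mark; whether the boundary is "at least" or "more than" 1000 ms would need to match the acceptance criteria.

```typescript
interface ScheduledTask {
  at: number;
  run: () => void;
  cancelled: boolean;
}

// Minimal manual clock: time only moves when advance() is called.
class FakeClock {
  private now = 0;
  private tasks: ScheduledTask[] = [];

  // Schedules a callback and returns a function that cancels it.
  schedule(delayMs: number, run: () => void): () => void {
    const task: ScheduledTask = { at: this.now + delayMs, run, cancelled: false };
    this.tasks.push(task);
    return () => {
      task.cancelled = true;
    };
  }

  // Moves time forward and runs every task that is due and not cancelled.
  advance(ms: number): void {
    this.now += ms;
    const due = this.tasks.filter((t) => !t.cancelled && t.at <= this.now);
    this.tasks = this.tasks.filter((t) => !t.cancelled && t.at > this.now);
    for (const t of due) t.run();
  }
}

// Records a page interaction once a preview has stayed open for 1000 ms.
class PreviewTracker {
  readonly sent: string[] = []; // stand-in for events sent to EventLogging
  private clock: FakeClock;
  private cancel: (() => void) | null = null;

  constructor(clock: FakeClock) {
    this.clock = clock;
  }

  previewShown(title: string): void {
    this.previewDismissed(); // only one preview is open at a time
    this.cancel = this.clock.schedule(1000, () => {
      this.sent.push(title);
      this.cancel = null;
    });
  }

  previewDismissed(): void {
    if (this.cancel !== null) {
      this.cancel();
      this.cancel = null;
    }
  }
}
```

A short hover then corresponds to `previewShown`, `advance(999)`, `previewDismissed`, after which `sent` stays empty; a long hover corresponds to `previewShown` followed by `advance(1000)`, after which `sent` holds the previewed title.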