Add a new lifecycleActiveLength field to the ReadingDepth schema that measures how long a page was in the "active" state as defined by the Page Lifecycle API (https://developers.google.com/web/updates/2018/07/page-lifecycle-api#overview_of_page_lifecycle_states_and_events).
This should probably be sent with pageUnloaded events (in addition to totalLength and visibleLength, the latter of which will still be based on the older, more limited Page Visibility API) on browsers that support the Page Lifecycle API; in Chrome, that is version 68 and newer. Alternatively, a new, separate pageTerminated event (tied to the corresponding state transition in the Lifecycle API) might be preferable. Developer input on this would be valuable.
Background
A limitation of the current ReadingDepth data is that the pageUnloaded event may not be fired if the page is terminated by the browser/system before being unloaded. On mobile, this actually happens in a majority of cases and biases our data, especially when comparing reading times on desktop and mobile (https://meta.wikimedia.org/wiki/Research_talk:Reading_time/Work_log/2018-11-02).
@Krinkle pointed out that the Page Lifecycle API should help with this. See also the spec at https://wicg.github.io/page-lifecycle/spec.html
AC
- The pageUnloaded EL event is logged when the pagehide event fires
- When the page becomes hidden (see the Page Visibility API), the state of the ReadingDepth instrumentation is persisted to session storage
- If a ReadingDepth event is logged during the lifecycle of the page, session storage is cleared
- When the page loads, any state persisted is reconstructed into a ReadingDepth pageUnloaded event and the event is logged
- We time how long the page is in the ACTIVE state
- The time is sent as part of a ReadingDepth pageUnloaded event
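The persist/clear/replay criteria above could be sketched roughly as follows. This is a minimal illustration, not the actual ReadingDepth code: the storage key, the createPendingEventStore helper, the logEvent callback (a stand-in for the EventLogging call), and the use of an action field to mark the event type are all assumptions made for the example.

```javascript
// Hypothetical storage key; not taken from the ReadingDepth source.
const STORAGE_KEY = 'readingDepthPendingEvent';

/**
 * storage:  any object with getItem/setItem/removeItem,
 *           e.g. window.sessionStorage in a browser.
 * logEvent: callback that sends the event (assumed stand-in for the
 *           EventLogging call).
 */
function createPendingEventStore( storage, logEvent ) {
	return {
		// Call when the page becomes hidden.
		persist( state ) {
			storage.setItem( STORAGE_KEY, JSON.stringify( state ) );
		},

		// Call whenever a ReadingDepth event is logged during the page's life.
		clear() {
			storage.removeItem( STORAGE_KEY );
		},

		// Call on page load: replay whatever a discarded page left behind.
		// Returns true if an event was logged.
		flush() {
			const raw = storage.getItem( STORAGE_KEY );
			if ( raw === null ) {
				return false;
			}
			storage.removeItem( STORAGE_KEY );

			let state;
			try {
				state = JSON.parse( raw );
			} catch ( e ) {
				// Corrupt entry: drop it rather than log bad data.
				return false;
			}

			// Assumption: the event type is carried in an "action" field.
			logEvent( Object.assign( {}, state, { action: 'pageUnloaded' } ) );
			return true;
		}
	};
}
```

In a browser, persist() would be called from a visibilitychange listener when the page becomes hidden, clear() after any ReadingDepth event is logged, and flush() once at page load. Passing the storage object and the logger in as parameters keeps the sketch testable outside a browser.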
Notes
- Logging the pageUnloaded EL event when the pagehide event fires won't address the holes in the data that we're seeing (see Background). It's a minor performance improvement that allows pages with the ReadingDepth instrumentation enabled to be kept in the browser's back/forward cache
- The HIDDEN -> freeze -> FROZEN -> DISCARDED state transition (see https://developers.google.com/web/updates/images/2018/07/page-lifecycle-api-state-event-flow.png) appears to be causing the holes in the data that we're seeing. Since we can't detect when a page is discarded, only when it has been discarded and then loaded at some future time, it makes sense to persist state until we can reliably send it.
- The pattern that we use to time how long the page is visible is defined in https://github.com/wikimedia/mediawiki-extensions-WikimediaEvents/blob/f39438dfb9163cf72ca82dd9fb345b575dfcf373/modules/all/ext.wikimediaEvents.readingDepth.js. We can reuse this pattern, using document.addEventListener( 'focus', ... ) and document.addEventListener( 'blur', ... ) where we currently use document.addEventListener( 'visibilitychange', ... )
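
As a rough illustration of timing the ACTIVE state with focus and blur handlers, a start/stop accumulator might look like the sketch below. It is not copied from ext.wikimediaEvents.readingDepth.js; the createActiveTimer name, its parameters, and the injected clock are assumptions made for the example.

```javascript
/**
 * now:             function returning the current time in milliseconds
 *                  (e.g. () => performance.now() in a browser).
 * initiallyActive: whether the page has focus when timing starts
 *                  (e.g. document.hasFocus() in a browser).
 */
function createActiveTimer( now, initiallyActive ) {
	// Timestamp at which the current active period began, or null if inactive.
	let activeSince = initiallyActive ? now() : null;
	// Total length of all completed active periods.
	let accumulated = 0;

	return {
		// Register as the 'focus' handler.
		onFocus() {
			if ( activeSince === null ) {
				activeSince = now();
			}
		},

		// Register as the 'blur' handler.
		onBlur() {
			if ( activeSince !== null ) {
				accumulated += now() - activeSince;
				activeSince = null;
			}
		},

		// Value to report as lifecycleActiveLength; includes any active
		// period that is still in progress.
		getActiveLength() {
			const running = activeSince === null ? 0 : now() - activeSince;
			return accumulated + running;
		}
	};
}
```

The guards in onFocus and onBlur make repeated focus or blur events harmless, so the accumulated total only grows while the page is actually active.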