Estimate how much of the Hovercards-caused pageview decrease comes from reduced usage of the back button
Closed, Resolved (Public)

Description

In the Hovercards A/B test on Hungarian Wikipedia, we found that the number of pages viewed per browser session went down with Hovercards enabled (T139319#2440480, T131366). This result was expected because the information in a card will often be sufficient for a reader, causing them not to click on a link they would have clicked otherwise to get this information.

But there might be an additional reason for the decrease: A reader who would have opened the linked article in the absence of a hovercard might also then have gone back to the article containing the link, which registers as another pageview in the Popups schema and in the webrequest log. This task is about comparing these "back button click" views (estimated as subsequent views of the same page during the same session) between hovercards on and off.

(Based on a question posed by @dr0ptp4kt, relayed by @JKatzWMF)

Event Timeline

Below is a query comparing views by anonymous users during the five weeks from July 29 to September 1. Defining unique views as the number of different pages loaded during one session, and making the simplifying assumption that all multiple views of the same page during the same session occur via the use of the back button, one first finds that the back button is indeed used less often when Hovercards are enabled:

Back button clicks occur 0.60 times per session (corresponding to 22% of pageviews) in the standard view, vs. 0.46 times (corresponding to 18% of pageviews) with Hovercards on.

This also means that, as expected, the decrease when Hovercards are switched on is smaller for unique views (-5.5%, from 2.18 to 2.06 per session in this sample) than for all pageviews (-9.4%, from 2.79 to 2.52 per session).
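The figures above can be reproduced from the per-session averages in the query output further down, under the stated assumption that back button clicks = all views - unique views. A minimal Python sketch of that arithmetic (the function and variable names are illustrative and not part of the original analysis; only the numbers come from the result table):

```python
# Per-session averages copied from the huwiki result table
# (popupEnabled = 0 is "off", popupEnabled = 1 is "on").
RESULTS = {
    "off": {"unique_views": 2.1842, "all_views": 2.7854},
    "on": {"unique_views": 2.0635, "all_views": 2.5239},
}


def back_button_clicks(row):
    """Repeat views of a page within a session, taken as back button clicks."""
    return row["all_views"] - row["unique_views"]


def relative_change(before, after):
    """Relative change from `before` to `after`, as a fraction."""
    return (after - before) / before


for condition, row in RESULTS.items():
    clicks = back_button_clicks(row)
    share = clicks / row["all_views"]
    print(f"Hovercards {condition}: {clicks:.2f} clicks/session, "
          f"{share:.0%} of pageviews")

for metric in ("unique_views", "all_views"):
    change = relative_change(RESULTS["off"][metric], RESULTS["on"][metric])
    print(f"{metric}: {change:+.1%}")
```

Because the inputs are already rounded to four decimals, the last digit of a derived percentage could differ slightly from one computed on the raw counts.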

SELECT unique_views_by_condition.popupEnabled AS popupEnabled, 
unique_views/sessions AS unique_views_per_session,
all_page_views/sessions AS all_views_per_session,
pageTokens/sessions AS pageTokens_per_session,
sessions
FROM
  (SELECT popupEnabled, COUNT(*) AS unique_views FROM (
    SELECT event_popupEnabled AS popupEnabled, 
    event_sessionToken, event_pageIdSource
    FROM log.Popups_15777589 
    WHERE wiki ='huwiki'
    AND event_isAnon = 1
    AND LEFT(timestamp, 8) >= '20160729'
    AND LEFT(timestamp, 8) < '20160902'
    AND event_action = 'pageLoaded' 
    AND NOT event_hovercardsSuppressedByGadget # only relevant for isAnon = 0
    GROUP BY event_popupEnabled,
    event_sessionToken, event_pageIdSource) 
    AS unique_views_list
  GROUP BY popupEnabled) AS unique_views_by_condition
JOIN (
  SELECT event_popupEnabled AS popupEnabled, 
  COUNT(*) AS all_page_views,
  COUNT(DISTINCT event_pageToken) AS pageTokens,
  COUNT(DISTINCT event_sessionToken) AS sessions
  FROM log.Popups_15777589 
  WHERE wiki ='huwiki'
  AND event_isAnon = 1
  AND LEFT(timestamp, 8) >= '20160729'
  AND LEFT(timestamp, 8) < '20160902'
  AND event_action = 'pageLoaded' 
  AND NOT event_hovercardsSuppressedByGadget # only relevant for isAnon = 0
  GROUP BY event_popupEnabled) AS distincts_by_condition
ON unique_views_by_condition.popupEnabled = distincts_by_condition.popupEnabled
GROUP BY popupEnabled;
+--------------+--------------------------+-----------------------+------------------------+----------+
| popupEnabled | unique_views_per_session | all_views_per_session | pageTokens_per_session | sessions |
+--------------+--------------------------+-----------------------+------------------------+----------+
|            0 |                   2.1842 |                2.7854 |                 2.7832 |   335214 |
|            1 |                   2.0635 |                2.5239 |                 2.5214 |   333446 |
+--------------+--------------------------+-----------------------+------------------------+----------+
2 rows in set (1 min 42.78 sec)

(As a data validity check, I counted normal pageviews in two different ways, as the number of pageLoaded events and as the number of distinct pageTokens. As expected, these numbers are almost the same, with the remaining difference likely explainable by the general EventLogging duplicate issue T142667.)
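The size of that remaining difference can be read off the result table. A small Python sketch, assuming the excess of pageLoaded events over distinct pageTokens consists entirely of duplicated events (the names are illustrative):

```python
# Per-session averages from the huwiki result table, keyed by popupEnabled.
ALL_VIEWS = {0: 2.7854, 1: 2.5239}    # pageLoaded events per session
PAGE_TOKENS = {0: 2.7832, 1: 2.5214}  # distinct pageTokens per session


def excess_share(all_views, page_tokens):
    """Fraction of pageLoaded events not matched by a distinct pageToken."""
    return (all_views - page_tokens) / all_views


for enabled in (0, 1):
    share = excess_share(ALL_VIEWS[enabled], PAGE_TOKENS[enabled])
    print(f"popupEnabled={enabled}: {share:.2%} excess events")
```

In both conditions the excess is on the order of a tenth of a percent of events, far smaller than the differences between the two conditions.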

MBinder_WMF renamed this task from "Estimate how much of the Hovercards-caused pageview decrease comes from reduced usage of the back button" to "[Spike: 2hrs] Estimate how much of the Hovercards-caused pageview decrease comes from reduced usage of the back button". Sep 6 2016, 5:12 PM
MBinder_WMF triaged this task as High priority.

Hi @MBinder_WMF (and @ovasileva), I'm not totally clear about what the added sprint tags and the title change mean in this context. This is a Reading-analysis task assigned to me, and so far we usually haven't been adding this kind of data analysis ticket to the web team's sprints unless it required some sort of engineering input, which this one doesn't. Happy to talk about changing this setup, of course, if you think it would be useful to integrate things more in this way. But in that case I would need some more information about how to interpret the tags in this context.

(Also, this task is mostly done already; after posting the main result on Tuesday I only left it open in order to remind myself to also publish the corresponding result for logged-in users, and to add confidence intervals and/or suitable significance tests.)

@Tbayer Thanks for flagging. Tasks get added to the sprint and estimated (or timeboxed, for spikes) when the team needs to work on them. I believe this one was added because it was determined that the team's attention was formally required, but as you've pointed out, that might be incorrect.

(It's also possible the team needs their own task, so as not to mess with your team's process).

@ovasileva What do you think?

Reading over it now, I think it's purely an analysis task - removing the timebox.

ovasileva renamed this task from "[Spike: 2hrs] Estimate how much of the Hovercards-caused pageview decrease comes from reduced usage of the back button" to "Estimate how much of the Hovercards-caused pageview decrease comes from reduced usage of the back button". Sep 12 2016, 12:15 PM
ovasileva lowered the priority of this task from High to Medium. Jun 7 2017, 5:20 PM

Here is the updated version of T144603#2611324 for the new enwiki and dewiki A/B tests, which ran in October-November 2017 and December 2017-February 2018. As before, unique views are defined as the number of different pages loaded during one session, under the simplifying assumption that all repeat views of the same page during the same session occur via the back button (i.e. back button clicks = pageviews - unique views).
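To make the definition concrete, here is a minimal Python sketch on a made-up event list. The session tokens and page IDs are invented for illustration; the actual analysis runs the SQL queries further down on the Popups event data.

```python
# Hypothetical pageLoaded events as (sessionToken, pageIdSource) pairs.
events = [
    ("s1", 10), ("s1", 20), ("s1", 10),  # page 10 loaded twice in session s1
    ("s2", 10),
    ("s3", 30), ("s3", 31), ("s3", 30), ("s3", 32), ("s3", 30),
]


def per_session_metrics(events):
    """Return (all views, unique views, back button clicks) per session."""
    n_sessions = len({session for session, _ in events})
    all_views = len(events)
    # One unique view per distinct (session, page) pair, mirroring the
    # GROUP BY on sessionToken and pageIdSource in the queries.
    unique_views = len(set(events))
    return (
        all_views / n_sessions,
        unique_views / n_sessions,
        (all_views - unique_views) / n_sessions,
    )


all_v, unique_v, clicks = per_session_metrics(events)
print(f"all views/session: {all_v}, unique views/session: {unique_v}, "
      f"back button clicks/session: {clicks}")
```

Note that any repeat load of a page counts as a back button click here, whether the reader used the back button, reloaded the page, or followed a link back to it; this is the simplifying assumption stated above.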

The result supports the hypothesis that the back button is used less often when Hovercards are enabled. However, the effect is much smaller on dewiki than on enwiki:

October 23-November 12, 2017:

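+--------+--------------+--------------------------+-----------------------+------------------------+----------+------------------+
| wiki   | popupenabled | unique_views_per_session | all_views_per_session | pagetokens_per_session | sessions | backbuttonclicks |
+--------+--------------+--------------------------+-----------------------+------------------------+----------+------------------+
| dewiki | False        |                 1.705365 |              2.051985 |               2.048519 |  3144231 |         0.346621 |
| dewiki | True         |                 1.658291 |              1.998690 |               1.995312 |  3147173 |         0.340399 |
| enwiki | False        |                 1.641594 |              2.030880 |               2.026974 | 10591697 |         0.389286 |
| enwiki | True         |                 1.595519 |              1.955336 |               1.951575 | 10582654 |         0.359817 |
+--------+--------------+--------------------------+-----------------------+------------------------+----------+------------------+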

Decrease when previews are on:

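+--------+--------------------------+-----------------------+------------------+
| wiki   | unique_views_per_session | all_views_per_session | backbuttonclicks |
+--------+--------------------------+-----------------------+------------------+
| dewiki |                    -2.8% |                 -2.6% |            -1.8% |
| enwiki |                    -2.8% |                 -3.7% |            -7.6% |
+--------+--------------------------+-----------------------+------------------+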

December 21, 2017-February 14, 2018:

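+--------+--------------+--------------------------+-----------------------+------------------------+----------+------------------+
| wiki   | popupenabled | unique_views_per_session | all_views_per_session | pagetokens_per_session | sessions | backbuttonclicks |
+--------+--------------+--------------------------+-----------------------+------------------------+----------+------------------+
| dewiki | False        |                 1.743197 |              2.116942 |               2.113168 |  8218222 |         0.373744 |
| dewiki | True         |                 1.691320 |              2.053654 |               2.049905 |  8212863 |         0.362335 |
| enwiki | False        |                 1.707182 |              2.135927 |               2.131925 | 26621860 |         0.428745 |
| enwiki | True         |                 1.645741 |              2.035943 |               2.032078 | 26605511 |         0.390202 |
+--------+--------------+--------------------------+-----------------------+------------------------+----------+------------------+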

Decrease when previews are on:

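+--------+--------------------------+-----------------------+------------------+
| wiki   | unique_views_per_session | all_views_per_session | backbuttonclicks |
+--------+--------------------------+-----------------------+------------------+
| dewiki |                    -3.0% |                 -3.0% |            -3.1% |
| enwiki |                    -3.6% |                 -4.7% |            -9.0% |
+--------+--------------------------+-----------------------+------------------+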

Note: As observed earlier at T182314#3974663, pageviews per session were higher - on both wikis - during the second test (and so were unique views and back button clicks), which is likely due to seasonal differences and a slight truncation effect (the second test lasted longer).

(Like before, as a data validity check, I counted normal pageviews in two different ways, as the number of pageLoaded events and as the number of distinct pageTokens. As expected, these numbers are almost the same.)

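Data via the following two queries (first test, then second test); the backbuttonclicks column in the tables above is all_views_per_session minus unique_views_per_session: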

SELECT unique_views_by_condition.wiki AS wiki,
uv_popupEnabled AS popupEnabled, 
unique_views/sessions AS unique_views_per_session,
all_page_views/sessions AS all_views_per_session,
pageTokens/sessions AS pageTokens_per_session,
sessions
FROM
  (SELECT wiki, uv_popupEnabled, COUNT(*) AS unique_views FROM (
    SELECT wiki, event.popupEnabled AS uv_popupEnabled, 
    event.sessionToken, event.pageIdSource
    FROM tbayer.Popups
    WHERE (wiki ='enwiki' OR wiki ='dewiki')
    AND event.isAnon = 1
    AND year = 2017
    AND ( (month = 10 AND day >= 23) OR (month = 11 AND day <= 12) )  
    AND event.action = 'pageLoaded' 
    GROUP BY wiki, event.popupEnabled,
    event.sessionToken, event.pageIdSource) 
    AS unique_views_list
  GROUP BY wiki, uv_popupEnabled) AS unique_views_by_condition
JOIN (
  SELECT wiki AS wiki, event.popupEnabled AS d_popupEnabled, 
  COUNT(*) AS all_page_views,
  COUNT(DISTINCT event.pageToken) AS pageTokens,
  COUNT(DISTINCT event.sessionToken) AS sessions
  FROM tbayer.Popups
  WHERE (wiki ='enwiki' OR wiki ='dewiki')
  AND event.isAnon = 1
  AND year = 2017
  AND ( (month = 10 AND day >= 23) OR (month = 11 AND day <= 12) )  
  AND event.action = 'pageLoaded' 
  GROUP BY wiki, event.popupEnabled) AS distincts_by_condition
ON unique_views_by_condition.wiki = distincts_by_condition.wiki AND
unique_views_by_condition.uv_popupEnabled = distincts_by_condition.d_popupEnabled

SELECT unique_views_by_condition.wiki AS wiki,
uv_popupEnabled AS popupEnabled, 
unique_views/sessions AS unique_views_per_session,
all_page_views/sessions AS all_views_per_session,
pageTokens/sessions AS pageTokens_per_session,
sessions
FROM
  (SELECT wiki, uv_popupEnabled, COUNT(*) AS unique_views FROM (
    SELECT wiki, event.popupEnabled AS uv_popupEnabled, 
    event.sessionToken, event.pageIdSource
    FROM event.Popups
    WHERE (wiki ='enwiki' OR wiki ='dewiki')
    AND event.isAnon = 1
    AND ( (year = 2017 AND month = 12 AND day >= 21) OR (year = 2018 AND (month = 1 OR (month = 2 AND day <= 14))) )
    AND event.action = 'pageLoaded' 
    GROUP BY wiki, event.popupEnabled,
    event.sessionToken, event.pageIdSource) 
    AS unique_views_list
  GROUP BY wiki, uv_popupEnabled) AS unique_views_by_condition
JOIN (
  SELECT wiki AS wiki, event.popupEnabled AS d_popupEnabled, 
  COUNT(*) AS all_page_views,
  COUNT(DISTINCT event.pageToken) AS pageTokens,
  COUNT(DISTINCT event.sessionToken) AS sessions
  FROM event.Popups
  WHERE (wiki ='enwiki' OR wiki ='dewiki')
  AND event.isAnon = 1
  AND ( (year = 2017 AND month = 12 AND day >= 21) OR (year = 2018 AND (month = 1 OR (month = 2 AND day <= 14))) )
  AND event.action = 'pageLoaded' 
  GROUP BY wiki, event.popupEnabled) AS distincts_by_condition
ON unique_views_by_condition.wiki = distincts_by_condition.wiki AND
unique_views_by_condition.uv_popupEnabled = distincts_by_condition.d_popupEnabled