
Analyze page separation (Zebra #9) A/B test
Closed, Resolved, Public

Assigned To
Authored By
ovasileva
Apr 25 2023, 7:22 PM

Description

Background

This task will reflect the analysis of the A/B test for the page layout design

Hypothesis
The new layout does not negatively affect core Vector 2022 metrics

Analysis summary
Compare the following for the control and test groups:

  • Pageviews
  • Edits
  • ToC usage
  • Scrolling
  • Page tools usage

Reports

KPI metrics

Curiosity Questions

Deep dive

Event Timeline

ovasileva triaged this task as Medium priority.
ovasileva moved this task from Incoming to Analyst Consultation on the Web-Team-Backlog board.

@olga, Here is a summary of the initial results and key findings of the AB test analysis for pageviews and edits for review. Please let me know if you have any questions.

Methodology

We reviewed AB test data recorded from 02 June 2023 through 19 June 2023 for this analysis. Data was limited to pageviews and edits completed by desktop logged-in users who were selected in the AB test.

We compared the numbers of pageviews and edits between control and treatment groups, and ran hierarchical generalized linear modeling on session based data to determine whether the difference is statistically significant. We also reviewed the number of pageviews and edit rate in each wiki and editor experience level. As users could switch edit count groups during the experiment if their accumulated contributions reached a threshold, we categorized the sessions based on their edit count bucket at the beginning of the experiment.

Note that we excluded data collected from May 29 2023 to June 1 2023 in this analysis due to a bug which caused the edit button to be missing on the sticky header.
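The bucketing rule described in the methodology can be sketched as follows. This is a minimal illustration rather than the analysis code: the bucket labels are taken from the results in this thread, while the function names and the `edit_count_at_start` lookup are hypothetical.

```python
def edit_count_bucket(edit_count: int) -> str:
    """Map a cumulative edit count to the experience buckets used in this analysis."""
    if edit_count < 0:
        raise ValueError("edit_count must be non-negative")
    if edit_count == 0:
        return "0 edits"
    if edit_count <= 4:
        return "1-4 edits"
    if edit_count <= 99:
        return "5-99 edits"
    if edit_count <= 999:
        return "100-999 edits"
    return "1000+ edits"


def bucket_sessions(sessions, edit_count_at_start):
    """Assign each session the bucket its user had when the experiment began.

    sessions: iterable of (session_id, user_id) pairs (hypothetical shape).
    edit_count_at_start: dict of user_id -> edit count at the start of the experiment.
    Edits made during the experiment are ignored on purpose, so a user who
    crosses a threshold mid-experiment stays in the original bucket.
    """
    return {
        session_id: edit_count_bucket(edit_count_at_start[user_id])
        for session_id, user_id in sessions
    }


# Toy data, for illustration only.
example = bucket_sessions(
    sessions=[("s1", "u1"), ("s2", "u1"), ("s3", "u2")],
    edit_count_at_start={"u1": 4, "u2": 1200},
)
print(example)
```

Fixing the bucket at the start of the experiment keeps every session of a given user in one category, which avoids counting the same user in two experience levels.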

Pageviews
  • We observed a 4.2% increase in pageviews per session among users who were shown the new page separation, across all participating Wikipedias and all editor experience levels.
  • Using hierarchical generalized linear modeling on the session-based data, we estimated that a 3.00% increase in the treatment group was attributable to the new page separation.
  • When broken down by editor experience level, the treatment group showed an increase in pageviews per session in four categories: "0 edits", "1-4 edits", "100-999 edits", and "1000+ edits". However, there was a decrease in the "5-99 edits" category.
    • In addition we noticed that more experienced users were more active in reading.

image.png (980×1 px, 113 KB)

  • We observed that the treatment group generally consisted of more active editors and fewer read-only users.

image.png (918×1 px, 116 KB)

  • When broken down by wiki, different trends were observed. On 3 wikis, enwiki, frwiki and thwiki, the treatment group has more pageviews per session than the control group overall. On 7 other wikis, the treatment group has fewer pageviews per session than the control group overall.

image.png (972×1 px, 138 KB)

  • When broken down by wiki and editor experience levels, no consistent pattern was observed across specific bucket groups. It is quite common for the treatment group to have more pageviews per session in some bucket groups while having fewer pageviews per session in other bucket groups.
    • Note that on srwiki, the treatment group has significantly fewer pageviews per session than the control group in the "1000+ edits" group.

image.png (1×1 px, 195 KB)

Edits

"Edits" is defined as the number of edit attempts initiated for the whole page by logged-in desktop users.
"Edit rate" is defined as the ratio of the total number of page edit attempts to the total number of pageviews from logged-in desktop users.

  • We observed an overall 5.5% decrease in edits per session (control 0.3451; treatment 0.3260, shown in the table below) among users who were shown the new page separation, across all participating Wikipedias and all editor experience levels.
  • We observed a 9.7% decrease in edit rate (control 0.0196; treatment 0.0177, shown in the table below) among users who were shown the new page separation, across all participating Wikipedias and all editor experience levels.
| test_group | total edit attempts | total pageviews | total sessions | overall edit per session | overall edit per pageview |
| control | 196715 | 10056089 | 569972 | 0.3451 | 0.0196 |
| treatment | 175011 | 9868007 | 536839 | 0.3260 | 0.0177 |
  • Using hierarchical generalized linear modeling on the session-based data, we estimated that a 5.25% decrease in edits per session in the treatment group was attributable to the new page separation.
  • When broken down by wiki, different trends were observed.
    • On 4 wikis (enwiki, hewiki, thwiki and viwiki), the treatment group exhibited a higher number of edits per session than the control group. However, on 6 other wikis, the treatment group had a lower number of edits per session than the control group overall.
    • Regarding the edit rate, 4 wikis (hewiki, srwiki, thwiki and viwiki) showed an increase in the treatment group while 6 other wikis exhibited a decrease in the treatment group.

image.png (982×1 px, 169 KB)
image.png (960×1 px, 167 KB)

  • When broken down by the editor experience levels, we noticed experienced users edited less in the treatment group.
    • We observed that the treatment group had fewer edits per session across the three most experienced editor categories. However, in the “0 edits” group and “1-4 edits” group, the treatment group demonstrates slightly more edits per session.
    • In terms of edit rate, it was generally lower in the treatment group across 4 edit count bucket groups, except for the “1-4 edits” group.

image.png (964×1 px, 124 KB)

image.png (982×1 px, 121 KB)
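As a cross-check, the per-session and per-pageview figures above can be recomputed from the group totals. The sketch below uses only the totals reported in the table; the class and function names are illustrative and not part of the analysis notebooks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupTotals:
    """Totals for one test group, copied from the table in this comment."""
    edit_attempts: int
    pageviews: int
    sessions: int

    @property
    def pageviews_per_session(self) -> float:
        return self.pageviews / self.sessions

    @property
    def edits_per_session(self) -> float:
        return self.edit_attempts / self.sessions

    @property
    def edit_rate(self) -> float:
        # Edit attempts divided by pageviews, as defined above.
        return self.edit_attempts / self.pageviews


def relative_change(control: float, treatment: float) -> float:
    """Change of the treatment value relative to the control value."""
    return (treatment - control) / control


control = GroupTotals(edit_attempts=196715, pageviews=10056089, sessions=569972)
treatment = GroupTotals(edit_attempts=175011, pageviews=9868007, sessions=536839)

for name in ("pageviews_per_session", "edits_per_session", "edit_rate"):
    c, t = getattr(control, name), getattr(treatment, name)
    print(f"{name}: control={c:.4f} treatment={t:.4f} change={relative_change(c, t):+.2%}")
```

Because this works from the unrounded totals, the edit rate change may come out slightly different from a figure derived from the four-decimal rates shown in the table.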

Remaining to-dos

To analyze three curiosity questions

  • Curiosity Question1: Does it impact the usage of the ToC, as the new layout separates the ToC and content into different boxes?
  • Curiosity Question2: Does it impact the usage of Page tools? (Number of pins or unpins for the page tools)
  • Curiosity Question3: Does it impact the scrolling to top behavior? (Nice to have)

To prepare reports and notebook for publishing

@jwang could we look at desktop click tracking? How did clicks to edit buttons compare across the groups in the experiment? My worry about analysis of edit attempts is that there are a myriad of factors that might impact those numbers outside Zebra.

Hi @Jdlrobson , I will cross check with desktop click tracking. Is the click on edit button logged as event.name='ca-edit' in DesktopWebUIActionsTracking ?

Edit Button Events Summary as of July 17, 2023

| User Action | Image Reference | event.name that fires |
| 1 Click on the Sticky Header Edit Source Button | Screenshot 2023-07-17 at 5.17.24 PM.png (60×68 px, 4 KB) | ca-edit and wikitext-edit-sticky-header |
| 1 Click on the Sticky Header VE Edit Button | Screenshot 2023-07-17 at 5.22.14 PM.png (98×106 px, 8 KB) | ca-ve-edit and ve-edit-sticky-header |
| 1 Click on the Edit Source in Toolbar Section | Screenshot 2023-07-17 at 5.25.26 PM.png (168×206 px, 14 KB) | ca-edit |
| 1 Click on the Edit Source link in h2 > .mw-headline and h3 > .mw-headline | Screenshot 2023-07-17 at 5.27.36 PM.png (66×400 px, 11 KB) | NO CLICK EVENTS |
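Since a single click on a sticky header edit button fires two events, counting every edit-related event would count those clicks twice. A minimal sketch of one way to avoid that, assuming the event names in the summary above; the helper name and the toy event log are hypothetical:

```python
# Events that fire once for every edit button click, wherever the button is.
EDIT_BUTTON_EVENTS = {"ca-edit", "ca-ve-edit"}

# Extra events that fire in addition when the click came from the sticky header.
STICKY_HEADER_EVENTS = {"wikitext-edit-sticky-header", "ve-edit-sticky-header"}


def count_edit_clicks(event_names):
    """Count user clicks on edit buttons without double counting sticky header clicks."""
    return sum(1 for name in event_names if name in EDIT_BUTTON_EVENTS)


# Toy log: one sticky header "edit source" click, one sticky header VE click,
# and one toolbar "edit source" click (three user clicks, five events).
log = [
    "ca-edit", "wikitext-edit-sticky-header",
    "ca-ve-edit", "ve-edit-sticky-header",
    "ca-edit",
]

print(count_edit_clicks(log), "clicks from", len(log), "events")
```

Section edit links fire no click events according to the summary, so they are not captured by either approach.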

@KSarabia-WMF , thank you for the investigation.

@ovasileva, @Jdlrobson
I have cross checked the data from DesktopWebUIActionsTracking and editattemptstep. The trend of edit attempts from the sticky header, as collected by editattemptstep, unexpectedly experienced a low-level period that overlapped with our experiment period. On the other hand, the trend of edit button clicks from the sticky header, as collected by DesktopWebUIActionsTracking, aligns more closely with our expectations. It only exhibited a dip around June 1, 2023, caused by the issue of a missing editing button on the sticky header. (T337955) It recovered to its original level starting from June 2, 2023.

The daily number of edit attempts and the number of sessions initiated from the sticky header, as collected by editattemptstep, were low before June 9, 2023.

image.png (1×1 px, 243 KB)

image.png (1×1 px, 241 KB)

The daily number of edit button clicks and the number of sessions from the sticky header, as collected by DesktopWebUIActionsTracking, only had a dip around June 1, 2023, as a result of the missing edit button on the sticky header.

image.png (1×1 px, 343 KB)

image.png (1×1 px, 313 KB)

Query

-- Daily edit attempts initiated from the sticky header (editattemptstep)
SELECT
  TO_DATE(dt) AS event_date,
  t3.wiki,
  COUNT(DISTINCT t3.event.session_token) AS session_n,
  COUNT(1) AS edit_attempts
FROM event.editattemptstep AS t3
WHERE t3.wiki IN ('hewiki', 'enwiki', 'fawiki', 'frwiki', 'kowiki', 'ptwiki', 'srwiki',
                  'thwiki', 'trwiki', 'viwiki')
  AND t3.year = 2023 AND t3.month IN (4, 5, 6)
  AND t3.event.platform = 'desktop'
  -- only logged-in users
  AND t3.event.user_id != 0
  AND t3.event.action = 'init'
  AND t3.event.integration = 'page'
  AND t3.event.init_type = 'page'
  AND t3.event.init_mechanism IN ('click-sticky-header', 'click-new-sticky-header')
GROUP BY TO_DATE(dt), t3.wiki


-- Daily edit button clicks by event name (desktopwebuiactionstracking)
SELECT
  TO_DATE(dt) AS event_date,
  t3.wiki,
  t3.event.name AS event_name,
  COUNT(DISTINCT t3.event.token) AS session_n,
  COUNT(1) AS edit_clicks
FROM event.desktopwebuiactionstracking AS t3
WHERE t3.wiki IN ('hewiki', 'enwiki', 'fawiki', 'frwiki', 'kowiki', 'ptwiki', 'srwiki',
                  'thwiki', 'trwiki', 'viwiki')
  AND t3.year = 2023 AND t3.month IN (4, 5, 6)
  -- only logged-in users
  AND NOT t3.event.isanon
  AND t3.event.action = 'click'
  AND t3.event.name IN ('ca-edit', 'ca-ve-edit', 'wikitext-edit-sticky-header', 've-edit-sticky-header')
  AND t3.event.skinversion = 2
GROUP BY TO_DATE(dt), t3.wiki, t3.event.name

Next step

  • Re-analyze the edits by measuring the number of clicks on the edit button.

Here is the update of edit analysis by measuring clicks on the edit button.

Edit (clicks on edit button)

"Edit" is defined as the number of edit button clicks from logged-in desktop users.
"Edit attempt rate" is defined as the ratio of the total number of edit button clicks to the total number of pageviews from logged-in desktop users.
The methodology is the same as described in T335379#9016641.

  • We observed an overall 0.87% increase in edits per session (control 0.5037; treatment 0.5081, shown in the table below) among users who were shown the new page separation, across all participating Wikipedias and all editor experience levels. However, we noticed that the treatment group had a lower proportion of sessions with 0 edits and fewer edits per session among the sessions that made edits.
  • We observed a 3.2% decrease in edit rate (control 0.0285; treatment 0.0276, shown in the table below) among users who were shown the new page separation, across all participating Wikipedias and all editor experience levels.
| test group | total edit attempts | total pageviews | total sessions | overall edit per session | overall edit per pageview |
| control | 287068 | 10056089 | 569972 | 0.5037 | 0.0285 |
| treatment | 272780 | 9868007 | 536839 | 0.5081 | 0.0276 |
  • Using zero-inflated hierarchical generalized linear modeling on the session-based data, we estimated that a 3.595% decrease in edits per session in the treatment group was attributable to the new page separation.
  • When broken down by wiki, different trends were observed.
    • On 3 wikis (enwiki, hewiki and viwiki), the treatment group exhibited a higher number of edits per session than the control group. On 2 wikis (ptwiki and frwiki), the difference is close to 0. On 5 other wikis, the treatment group had a lower number of edits per session than the control group overall.
    • Regarding the edit rate, 3 wikis (ptwiki, frwiki and enwiki) did not show a significant difference between the treatment group and the control group. 2 wikis (hewiki and viwiki) showed an increase in the treatment group, while 5 other wikis exhibited a decrease in the treatment group.
Edits per session
image.png (1×1 px, 203 KB)
image.png (1×1 px, 96 KB)
Edit rate
image.png (1×1 px, 213 KB)
image.png (1×1 px, 96 KB)
  • When broken down by the editor's experience levels, different trends were observed.
    • We observed that the treatment group had fewer edits per session in the “5-99 edits” and “100-999 edits” categories. However, in the “0 edits”, “1-4 edits” and “1000 or more edits” categories, the treatment group demonstrated slightly more edits per session.
    • In terms of edit rate, it was generally lower in the treatment group across 4 edit count bucket groups, except for the “1-4 edits” group. The increase in pageviews did not result in an increase in edits.

image.png (1×1 px, 151 KB)

image.png (1×1 px, 147 KB)
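The raw edits per session rose slightly even though sessions that did edit made fewer edits each, which can look contradictory. The sketch below shows how both can hold at once by splitting the overall mean into the share of editing sessions and the mean among those sessions. All numbers in it are hypothetical and chosen only to illustrate the effect; it is not the zero-inflated hierarchical model used in the analysis, which also accounts for wiki and edit count bucket.

```python
def overall_mean(share_zero: float, mean_nonzero: float) -> float:
    """Mean edits per session = share of sessions with edits * mean edits among them."""
    return (1.0 - share_zero) * mean_nonzero


# HYPOTHETICAL values, not taken from the experiment data.
control_mean = overall_mean(share_zero=0.80, mean_nonzero=2.5)
treatment_mean = overall_mean(share_zero=0.78, mean_nonzero=2.3)

print(f"control overall mean:   {control_mean:.3f}")
print(f"treatment overall mean: {treatment_mean:.3f}")
# The treatment mean is higher overall because more sessions edit at all,
# even though each editing session makes fewer edits on average.
```

A zero-inflated model treats the "any edits at all" part and the "how many edits" part separately, which is one reason its estimate can differ in sign from the raw per-session comparison.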

Query

query_edit_clicks <- "
WITH t_ab_no_dupli AS (
SELECT  web_session_id, wiki, meta.domain AS domain, count(distinct `group` ) AS groups,  min(meta.dt) AS session_dt 
FROM event.mediawiki_web_ab_test_enrollment
WHERE wiki IN ('hewiki', 'enwiki', 'fawiki', 'frwiki', 'kowiki', 'ptwiki', 'srwiki', 
'thwiki', 'trwiki', 'viwiki') 
AND year=2023 AND month=6
AND CONCAT(year, '-', LPAD(month,2,'0'),'-', LPAD(day,2,'0')) BETWEEN '2023-06-02' AND '2023-06-19'
GROUP BY  web_session_id, wiki, meta.domain
-- exclude session IDs that appear in both the control and treatment groups
HAVING groups < 2
),
t_ab AS(
SELECT 
 t1.web_session_id,
 t1.wiki,t1.meta.domain AS domain,
 t1.`group` AS test_group,
 min(t1.meta.dt) AS session_dt 
FROM event.mediawiki_web_ab_test_enrollment AS t1
INNER JOIN  t_ab_no_dupli AS t2 ON t1.wiki=t2.wiki 
AND t1.web_session_id=t2.web_session_id 
WHERE t1.wiki IN ('hewiki', 'enwiki', 'fawiki', 'frwiki', 'kowiki', 'ptwiki', 'srwiki', 
'thwiki', 'trwiki', 'viwiki')  
AND year=2023  and month=6
AND CONCAT(year, '-', LPAD(month,2,'0'),'-', LPAD(day,2,'0')) BETWEEN '2023-06-02' AND '2023-06-19'
AND NOT is_bot
AND NOT is_anon
AND skin='vector-2022'
GROUP BY  t1.web_session_id, t1.wiki,t1.meta.domain, t1.`group`
)
-- clicks on edit button from ab test group
SELECT  t3.event.token AS session_id, t3.event.pageToken AS page_token,
t3.wiki,   t4.test_group, 
count(1) AS edit_clicks
FROM event.desktopwebuiactionstracking AS t3
INNER JOIN t_ab AS t4 
ON  t3.wiki=t4.wiki AND t3.event.token = t4.web_session_id
WHERE t3.wiki IN ('hewiki', 'enwiki', 'fawiki', 'frwiki', 'kowiki', 'ptwiki', 'srwiki', 
'thwiki', 'trwiki', 'viwiki')
AND t3.year=2023 and t3.month=6
AND CONCAT(t3.year, '-', LPAD(t3.month,2,'0'),'-', LPAD(t3.day,2,'0')) BETWEEN '2023-06-02' AND '2023-06-19'
AND t4.session_dt <= t3.meta.dt
AND NOT event.isanon 
AND event.action='click' 
AND event.name IN ('ca-edit','ca-ve-edit')
AND event.skinversion=2 
GROUP BY  t3.event.token ,  t3.event.pageToken,  t3.wiki, t4.test_group 
"
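The first CTE in the query above drops sessions that were enrolled in more than one test group. The same rule can be sketched on in-memory records as follows; the function name and the toy enrollment rows are hypothetical, and only the rule itself comes from the query.

```python
from collections import defaultdict


def single_group_sessions(enrollments):
    """Return {(wiki, session_id): group} for sessions seen in exactly one group.

    enrollments: iterable of (wiki, web_session_id, group) tuples.
    Sessions that appear in both control and treatment are excluded,
    mirroring the HAVING groups < 2 filter in the query.
    """
    groups_seen = defaultdict(set)
    for wiki, session_id, group in enrollments:
        groups_seen[(wiki, session_id)].add(group)
    return {
        key: next(iter(groups))
        for key, groups in groups_seen.items()
        if len(groups) < 2
    }


# Toy enrollment events, for illustration only.
rows = [
    ("enwiki", "a1", "control"),
    ("enwiki", "a1", "control"),    # repeated enrollment, same group: kept
    ("frwiki", "b2", "treatment"),
    ("frwiki", "c3", "control"),
    ("frwiki", "c3", "treatment"),  # seen in both groups: dropped
]

print(single_group_sessions(rows))
```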

Here is the summary of the analysis results for three curiosity questions about ToC usage, page tools usage, and scrolling.

Methodology

We compared the metrics between the control and treatment groups using the data collected between June 2 and June 19, 2023. We also reviewed the metrics for each edit count bucket and each wiki.

Curiosity Question1: Does it impact the usage of TOC as the new layout separated ToC and content into different boxes?

ToC clicks per session is defined as the number of ToC clicks divided by the total number of unique sessions.
ToC click rate is defined as the ratio of the total number of ToC clicks to the total number of pageviews.

  • We observed an overall 14.1% decrease in ToC clicks per session (control 1.0707; treatment 0.9197, shown in the table below) among users who were shown the new page separation, across all participating Wikipedias and all editor experience levels.
  • We observed a 17.6% decrease in ToC click rate (control 0.0607; treatment 0.0500, shown in the table below) among users who were shown the new page separation, across all participating Wikipedias and all editor experience levels.
| test_group | all_toc_clicks | all_pvs | all_sessions | toc_click_per_session | toc_click_per_pv |
| control | 610246 | 10056089 | 569972 | 1.0707 | 0.0607 |
| treatment | 493722 | 9868007 | 536839 | 0.9197 | 0.0500 |
  • When broken down by wiki, 8 out of 10 wikis have exhibited a decrease.
ToC click rate
image.png (968×1 px, 168 KB)
image.png (978×1 px, 81 KB)
ToC clicks per session
image.png (976×1 px, 170 KB)
image.png (970×1 px, 79 KB)
  • When broken down by editor experience level, the decrease is consistent across 4 edit count buckets; the exception is the “0 edits” group.

image.png (976×1 px, 123 KB)

image.png (982×1 px, 122 KB)
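The ToC click rate fell by more than ToC clicks per session because pageviews per session rose in the treatment group at the same time. Since clicks per pageview equals clicks per session divided by pageviews per session, the two changes combine multiplicatively. The sketch below checks this with the totals from the table above; the variable and function names are illustrative.

```python
def relative_change(control: float, treatment: float) -> float:
    """Change of the treatment value relative to the control value."""
    return (treatment - control) / control


# Totals copied from the ToC table in this comment.
control = {"toc_clicks": 610246, "pageviews": 10056089, "sessions": 569972}
treatment = {"toc_clicks": 493722, "pageviews": 9868007, "sessions": 536839}


def per_session(group: dict, key: str) -> float:
    return group[key] / group["sessions"]


clicks_change = relative_change(
    per_session(control, "toc_clicks"), per_session(treatment, "toc_clicks")
)
pageviews_change = relative_change(
    per_session(control, "pageviews"), per_session(treatment, "pageviews")
)

# Click rate computed directly from clicks and pageviews.
rate_change_direct = relative_change(
    control["toc_clicks"] / control["pageviews"],
    treatment["toc_clicks"] / treatment["pageviews"],
)
# The same quantity rebuilt from the two per-session changes.
rate_change_decomposed = (1 + clicks_change) / (1 + pageviews_change) - 1

print(f"ToC clicks per session: {clicks_change:+.2%}")
print(f"Pageviews per session:  {pageviews_change:+.2%}")
print(f"ToC click rate:         {rate_change_direct:+.2%}")
```

The same relationship applies to the edit rate: a per-pageview metric drops further than its per-session counterpart whenever pageviews per session increase.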

Curiosity Question2: Does it impact the usage of Page tools? (Number of pins or unpins for the page tools)

Pins/unpins per session is defined as the number of pins and unpins of the page tools divided by the total number of unique sessions.
The pin and unpin actions depend on the existing interface setting. The page tools menu is pinned by default, so we expect more unpins than pins because more users initially land with the page tools pinned.

  • We observed that 99.93% of sessions did not pin or unpin the page tools.
  • We observed an 83.47% increase in page tool pins per session and a 190.98% increase in page tool unpins per session among users who were shown the new page separation, across all participating Wikipedias and all editor experience levels.
| test_group | all_pins | all_unpins | all_pvs | all_sessions | pins_per_session | unpins_per_session |
| control | 353 | 1415 | 10056089 | 569972 | 0.00061933 | 0.00248258 |
| treatment | 610 | 3878 | 9868007 | 536839 | 0.00113628 | 0.00722377 |
  • The number of pins and unpins per session increased in the treatment group across wikis.
| pins | unpins |
| image.png (976×1 px, 175 KB) | image.png (978×1 px, 169 KB) |
  • When broken down by editor experience level, the increase in the treatment group is consistent across all edit count buckets.
| pins | unpins |
| image.png (988×1 px, 128 KB) | image.png (964×1 px, 133 KB) |

Curiosity Question3: Does it impact the scrolling to top behavior? (Nice to have)

The analysis below measures the number of scrolls back to top divided by the number of scrolled pages.

  • We observed that 44.55% of sessions did not scroll to top.
  • We observed that the number of scrolls to top per scrolled page is about the same in the treatment group and the control group overall (1.73 vs. 1.74).
| test_group | all sessions | all pageviews | all viewed pages | total scrolls to top | all scrolled pages | scrolls_per_page |
| control | 569972 | 10056089 | 10054632 | 2328970 | 1338768 | 1.74 |
| treatment | 536839 | 9868007 | 9866735 | 2309449 | 1334575 | 1.73 |
  • We did not observe a significant difference in the number of scrolls to top between the control and treatment groups across wikis.

image.png (968×1 px, 128 KB)

  • We did not observe a significant difference in the number of scrolls to top between the control and treatment groups across edit count buckets.

image.png (980×1 px, 116 KB)

Remaining todo:
To prepare reports and notebooks for publishing

Could the decrease in TOC clicks be explained by an increase in un-pinning the table of contents? If the table of contents is hidden e.g. unpinned/collapsed that would explain a decrease in clicks (since users are seeing it less).

I'd also be interested in looking more closely at the completed edits - in particular bytes changed - what percentage of these edits are minor and which are significant/large in the treatment/control groups and are we seeing any changes? Are we able to do this sort of comparison?

@ovasileva , Here are the updates for the viewport size analysis

Summary

The majority of users use devices with a 1200px-2000px viewport.

image.png (1×1 px, 177 KB)

Pageviews

We observed that the treatment group exhibited an increase in pageviews per session on devices with large viewports: 1200px-2000px and >2000px.

image.png (1×1 px, 150 KB)

Across all wikis, pageviews increased in the ‘>2000px’ viewport size bucket.

image.png (1×1 px, 234 KB)

Edits (edit button clicks)

Edits per session decreased significantly on devices with smaller viewports.

image.png (1×1 px, 140 KB)

ToC clicks

ToC clicks on toc-heading.toc-pinned-enabled and toc-heading.toc-pinned-disabled increased significantly in the smaller viewport buckets: <320px, 320px-719px, 720px-999px.
ToC clicks on ui.sidebar-toc are mostly from the larger viewport size buckets.

image.png (1×1 px, 218 KB)

Page tool pins and unpins

Page tool pins and unpins increased across all viewport size buckets in the treatment group.

image.png (1×1 px, 132 KB)

image.png (1×1 px, 129 KB)

Broken down by browser family

The analysis below focuses on the top 10 browser families.

Distribution

The figure below shows the distribution of reader sessions across the top 10 browsers. The majority of readers use the Chrome browser.

image.png (1×1 px, 158 KB)

Pageviews

When broken down by browser family, the treatment group exhibited a decrease in pageviews per session in all mobile browsers. Pageviews per session also decreased on Edge and Opera.

image.png (1×1 px, 155 KB)

Edits

Edits per session decreased on Chrome Mobile, Edge, Mobile Safari, Opera, Safari and Samsung Internet.
Edits per session increased on Chrome Mobile iOS and Firefox.

image.png (1×1 px, 152 KB)

Note:
We noticed that 2775 sessions switched browsers during the AB test. However, according to https://gerrit.wikimedia.org/r/plugins/gitiles/schemas/event/secondary/+/refs/heads/master/jsonschema/analytics/legacy/desktopwebuiactionstracking/current.yaml, event.token should not survive across browsers.

token:
            description: >-
              Session token that survives across pages (mw.user.sessionId()),
              but not when browser restarts.
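A check for sessions that report more than one browser family can be sketched as below. This is an illustration under assumptions: the `(token, browser_family)` record shape and the function name are hypothetical, and the actual check behind the 2775 figure may have been done differently.

```python
from collections import defaultdict


def sessions_with_multiple_browsers(events):
    """Return the session tokens that were seen with more than one browser family.

    events: iterable of (token, browser_family) pairs, one per logged event.
    """
    families_by_token = defaultdict(set)
    for token, browser_family in events:
        families_by_token[token].add(browser_family)
    return {
        token
        for token, families in families_by_token.items()
        if len(families) > 1
    }


# Toy events, for illustration only.
events = [
    ("t1", "Chrome"),
    ("t1", "Chrome"),
    ("t2", "Firefox"),
    ("t2", "Edge"),    # same token, different browser family
    ("t3", "Safari"),
]

print(sessions_with_multiple_browsers(events))
```

Given the schema description above, tokens flagged this way are unexpected and may be worth excluding or investigating separately.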
jwang updated the task description.

@ovasileva, the analysis reports are finalized and published on GitHub (see the Description). This ticket is ready for sign-off.

Signing off, thank you @jwang!