
QA ToC Instrumentation
Closed, Resolved, Public

Description

Background

We have added the items required in the ToC instrumentation specification (https://docs.google.com/spreadsheets/d/1CAXGqV2JR_bE0ulo9EU3I74Jm0Ln9z1u2CjC6zvra7I/edit#gid=0) which will allow us to perform an A/B test on the new Table of Contents focused on answering the following questions:

  1. Is the new table of contents used more frequently than the previous table of contents?
  2. Does the new table of contents reduce the need to scroll back to the top of the page?
  3. Does the new table of contents decrease the time people spend scrolling/scrolling quickly (if possible)?
  4. How does the new table of contents affect the time spent on a page?

Acceptance criteria

Ensure all instrumentation is behaving as expected:

Done

  • Schema QA: mediawiki_web_ui_scroll T306557
  • Schema QA: DesktopWebUIActionsTracking T306558
  • Schema QA: mediawiki_web_ab_test_enrollment T306559
  • Schema QA: mediawiki_reading_depth T306653

Event Timeline

ovasileva triaged this task as High priority.

What is tested:

  • Ensure the DesktopWebUIClickTracking schema is logging screen resolution as expected

Bugs or potential issues
In the desktopwebuiactionstracking schema, the viewportsizebucket field has mixed formats for the screen size. For example, the same screen size was recorded as either 1200-2000 or 1200px-2000px. This shows up on all wikis and on all days since the instrument was enabled.

viewportsizebucket
NULL
>2000px
720-999
720px-999px
<320px
1000-1199
1000px-1199px
1200-2000
1200px-2000px
320-719

Query

SELECT DISTINCT event.viewportSizeBucket
FROM event.desktopwebuiactionstracking 
WHERE year = 2022 AND month = 4 AND day=17
AND event.name = 'ui.toc'

What is tested:

  • New events logged in the mediawiki_web_ui_scroll schema

Bugs or potential issues
In the event.mediawiki_web_ui_scroll schema, fr.wikipedia.org has the most scroll-to-toc events and sessions: 4.25 times as many sessions as second-place pt.wikipedia.org. This is not proportional to the frwiki : ptwiki pageviews ratio, which is 2.5 based on the 2021 pageviews in the wiki comparison.

domain            events  sessions  pages  first_event_ts
fr.wikipedia.org  41418   10578     14224  2022-04-07T08:16:54.576Z
pt.wikipedia.org  9431    2489      3257   2022-04-07T08:38:49.147Z
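
The mismatch can be checked with quick arithmetic. This is a minimal sketch using the counts from the result above; the 2.5 pageviews ratio is the figure quoted in the comment, and the names `counts`, `ratio` and `PAGEVIEWS_RATIO` are made up for illustration.

```python
# Counts copied from the query result above (April 2022, action = 'scroll-to-toc').
counts = {
    "fr.wikipedia.org": {"events": 41418, "sessions": 10578},
    "pt.wikipedia.org": {"events": 9431, "sessions": 2489},
}

# frwiki : ptwiki pageviews ratio as quoted in the comment (2021 pageviews).
PAGEVIEWS_RATIO = 2.5


def ratio(metric: str, numerator: str, denominator: str) -> float:
    """Return numerator / denominator for one metric, rounded to 2 decimals."""
    return round(counts[numerator][metric] / counts[denominator][metric], 2)


events_ratio = ratio("events", "fr.wikipedia.org", "pt.wikipedia.org")
sessions_ratio = ratio("sessions", "fr.wikipedia.org", "pt.wikipedia.org")

print(events_ratio)    # 4.39
print(sessions_ratio)  # 4.25
# How far the sessions ratio sits above the pageviews ratio.
print(round(sessions_ratio / PAGEVIEWS_RATIO, 2))  # 1.7
```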

Query

SELECT meta.domain,
       COUNT(1) AS events,
       COUNT(DISTINCT web_session_id) AS sessions,
       COUNT(DISTINCT page_id) AS pages,
       MIN(meta.dt) AS first_event_ts
FROM event.mediawiki_web_ui_scroll
WHERE year = 2022 AND month = 4
  AND action = 'scroll-to-toc'
GROUP BY meta.domain
ORDER BY events DESC
LIMIT 10000

There was a recent drop of events to this schema on https://grafana.wikimedia.org/d/000000566/overview?orgId=1&from=now-30d&to=now which needs explaining and could be contributing to the issue with events.

I've looked at events per session on frwiki and compared them to jawiki, ptwiki and dewiki (to see whether a single session might be responsible for a large number of events), and there don't seem to be any obvious outliers that would suggest that.

[Screenshots: events by session for frwiki, jawiki, ptwiki and dewiki]
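
The same check can be done without eyeballing charts by counting events per session and flagging sessions far above the typical count. This is a rough sketch on made-up session IDs; the function name and the threshold (10 times the median) are assumptions, not part of the instrumentation or of the analysis actually run here.

```python
from collections import Counter
from statistics import median


def outlier_sessions(session_ids, factor=10):
    """Return {session_id: count} for sessions with more than factor x median events."""
    per_session = Counter(session_ids)
    typical = median(per_session.values())
    return {sid: n for sid, n in per_session.items() if n > factor * typical}


# Hypothetical data: one element per logged event, keyed by web_session_id.
rows = ["a"] * 3 + ["b"] * 4 + ["c"] * 2 + ["d"] * 90
print(outlier_sessions(rows))  # {'d': 90}
```

An empty result for a wiki would match the observation above that no single session dominates.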

Looking at events per day on frwiki, they seem to be evenly distributed as well (other than the drop on April 6th), so no spike that would account for extra events either :(

[Screenshot: frwiki events per day]

How many editors are there on French compared to Portuguese? Note, we currently only track scroll to top for logged-in users (although in https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaEvents/+/773628/3/extension.json we introduced the possibility to log anon traffic).

The drop in events on the 7th is still a bit confusing to me and makes me worry there was an unexpected bug/change in behaviour in https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaEvents/+/773628.

Change 785919 had a related patch set uploaded (by Jdlrobson; author: Jdlrobson):

[operations/mediawiki-config@master] [Web scroll] Restore original sampling rate

https://gerrit.wikimedia.org/r/785919

Change 785919 merged by jenkins-bot:

[operations/mediawiki-config@master] [Web scroll] Restore original sampling rate

https://gerrit.wikimedia.org/r/785919

Okay, so I can explain this.

Previously, we disabled the web ui scroll schema if the user was anonymous and in the sample (1% for French Wikipedia, 10% for all other projects)

In Ic7a50b30b0275dc37f09f80f491ab713cbb16294 merged on March 29th, we changed this so that:

  • The sampling rate applied to logged in users (1% for French Wikipedia, 10% for all other projects)
  • A new sampling rate applied to anonymous users set to 0%

This new change would have gone live on 7th April so this explains the significant drop we see here on the graph:

[Screenshot: graph of event volume showing the drop]

Even though we're adding an event for scroll to table of contents, we're also throwing away all our anon traffic, and 90%+ of all logged-in users.
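
To make the effect of the new configuration concrete, here is a sketch of the sampling decision as described in this comment. It is not the WikimediaEvents code: the function and constant names are hypothetical, and only the rates (1% for French Wikipedia, 10% elsewhere, 0% for anonymous users) come from the description above.

```python
# Rates described above for the configuration that went live on 7th April.
LOGGED_IN_RATE = {"fr.wikipedia.org": 0.01}  # 1% on French Wikipedia
DEFAULT_LOGGED_IN_RATE = 0.10                # 10% on all other projects
ANON_RATE = 0.0                              # anonymous users are never sampled


def sampling_rate(domain: str, is_anon: bool) -> float:
    """Return the fraction of sessions that would log scroll events."""
    if is_anon:
        return ANON_RATE
    return LOGGED_IN_RATE.get(domain, DEFAULT_LOGGED_IN_RATE)


def dropped_share(domain: str, is_anon: bool) -> float:
    """Return the fraction of sessions whose events are thrown away."""
    return 1.0 - sampling_rate(domain, is_anon)


print(dropped_share("fr.wikipedia.org", is_anon=False))  # 0.99
print(dropped_share("pt.wikipedia.org", is_anon=False))  # 0.9
print(dropped_share("pt.wikipedia.org", is_anon=True))   # 1.0
```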

The patch https://gerrit.wikimedia.org/r/785919 will rectify the issue by restoring the original sampling levels. After it's backported, I predict we should see volume larger than the previous spike on 5th May, as we should be seeing all the original events plus additional events for the scroll to table of contents event. Given that toc events seem to be on a par with scrolling to top, we'll most likely see around 20 events a second (double the existing 10).

In the event.mediawiki_web_ui_scroll schema, fr.wikipedia.org has the most scroll-to-toc events and sessions: 4.25 times as many sessions as second-place pt.wikipedia.org. This is not proportional to the frwiki : ptwiki pageviews ratio, which is 2.5 based on the 2021 pageviews in the wiki comparison.

If I look at the scroll-to-top event [1] before the introduction of the new toc event, the number of scroll-to-top events on Portuguese is also around 25% of the volume on French, so it feels like this is a behavioural characteristic of ptwiki compared to frwiki. I'd love us to understand that a bit more, but I don't think this is a bug in the instrumentation.

domain            events   sessions  pages   first_event_ts
fr.wikipedia.org  1380838  785977    327784  2022-04-01T00:00:01.007Z
pt.wikipedia.org  345347   213577    106050  2022-04-01T00:00:00.871Z

[1]

SELECT meta.domain,
       COUNT(1) AS events,
       COUNT(DISTINCT web_session_id) AS sessions,
       COUNT(DISTINCT page_id) AS pages,
       MIN(meta.dt) AS first_event_ts
FROM event.mediawiki_web_ui_scroll
WHERE year = 2022 AND month = 4 AND day < 7
  AND action = 'scroll-to-top'
GROUP BY meta.domain
ORDER BY events DESC
LIMIT 10000

How many editors are there on French compared to Portuguese? Note, we currently only track scroll to top for logged-in users (although in https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaEvents/+/773628/3/extension.json we introduced the possibility to log anon traffic).

As we only track scrolls by logged-in users, edits by logged-in users could serve as a reference point instead of pageviews.
Edits Metrics

Ratio columns are frwiki : <other wiki>.

domain_name       logged_in_edits  total_non_bot_edits  logged_in_edits_ratio  total_non_bot_edits_ratio
fr.wikipedia.org  664414           753326               1.00                   1.00
pt.wikipedia.org  181000           181572               3.67                   4.15
fa.wikipedia.org  155959           156573               4.26                   4.81
he.wikipedia.org  152557           176318               4.36                   4.27
ko.wikipedia.org  113677           164669               5.84                   4.57
tr.wikipedia.org  110095           122729               6.03                   6.14
vi.wikipedia.org  100218           125267               6.63                   6.01

Scroll events

Ratio columns are frwiki : <other wiki>.

domain            events  sessions  pages  first_event_ts            events_ratio  sessions_ratio
fr.wikipedia.org  41418   10578     14224  2022-04-07T08:16:54.576Z  1.00          1.00
pt.wikipedia.org  9431    2489      3257   2022-04-07T08:38:49.147Z  4.39          4.25
fa.wikipedia.org  4907    1415      1579   2022-04-07T08:14:36.540Z  8.44          7.48
he.wikipedia.org  7973    1938      2732   2022-04-06T08:54:27.969Z  5.19          5.46
ko.wikipedia.org  2879    687       927    2022-04-07T08:17:57.554Z  14.39         15.40
tr.wikipedia.org  4360    1189      1577   2022-04-07T08:37:38.907Z  9.50          8.90
vi.wikipedia.org  4168    1088      1342   2022-04-07T08:22:29.726Z  9.94          9.72

For pt.wikipedia.org and he.wikipedia.org, the frwiki : <other wiki> ratio of scroll events is roughly in line with the ratio of edits by logged-in users, though this does not hold for the other wikis. We believe it's not an issue as long as the sample rate is consistent during the experiment on each wiki.
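
The comparison can be reproduced from the two ratio columns above. This is only a sketch: the 25% tolerance used to decide what counts as "roughly proportional" is an assumption picked for illustration, and the variable and function names are hypothetical.

```python
# frwiki : <other wiki> ratios copied from the edits and scroll tables above.
logged_in_edits_ratio = {"pt": 3.67, "fa": 4.26, "he": 4.36, "ko": 5.84, "tr": 6.03, "vi": 6.63}
scroll_events_ratio = {"pt": 4.39, "fa": 8.44, "he": 5.19, "ko": 14.39, "tr": 9.50, "vi": 9.94}


def roughly_proportional(tolerance=0.25):
    """Return wikis whose scroll ratio is within `tolerance` of their edits ratio."""
    return sorted(
        wiki
        for wiki, edits in logged_in_edits_ratio.items()
        if abs(scroll_events_ratio[wiki] / edits - 1) <= tolerance
    )


print(roughly_proportional())  # ['he', 'pt']
```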

Mentioned in SAL (#wikimedia-operations) [2022-04-25T21:25:13Z] <catrope@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:785919|[Web scroll] Restore original sampling rate (T305442)]] (duration: 01m 01s)

Thank you @Jdlrobson and @jwang for resolving this issue!

How about this issue?

What is tested:

  • Ensure the DesktopWebUIClickTracking schema is logging screen resolution as expected

Bugs or potential issues
In the desktopwebuiactionstracking schema, the viewportsizebucket field has mixed formats for the screen size. For example, the same screen size was recorded as either 1200-2000 or 1200px-2000px. This shows up on all wikis and on all days since the instrument was enabled.

viewportsizebucket
NULL
>2000px
720-999
720px-999px
<320px
1000-1199
1000px-1199px
1200-2000
1200px-2000px
320-719

Query

SELECT DISTINCT event.viewportSizeBucket
FROM event.desktopwebuiactionstracking 
WHERE year = 2022 AND month = 4 AND day=17
AND event.name = 'ui.toc'

Please assign the ticket to me after the engineer has resolved the issue on their side. I will continue to QA.

@jwang sorry, I should have been more explicit about that issue. There was a typo in the instrumentation as it was initially deployed (my bad).

The values without "px" were a typo. The corrected values, with "px" took effect on about April 12th. The wrong values might still persist for a while because of caching, but at a much lower rate.

Looking at the number of events received with the wrong values, it looks like they are slowly fading away, and the values with "px" are coming in steady since about April 12th.

[Screenshots: events per day for "720-999" (typo) and "720px-999px" (correct)]
[Screenshots: events per day for "320-719" (typo) and "320px-719px" (correct)]

The correct values for the schema are

<320px
320px-719px
720px-999px
1000px-1199px
1200px-2000px
>2000px

Both the 'px' and non-px values represent the same thing, but it may be easiest to just disregard the non-px events and use the ones with px.
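
If the pre-fix events turn out to be needed after all, the two forms could be merged during analysis instead of being dropped. Below is a minimal sketch of that alternative, assuming the six corrected values listed above are the complete set; `VALID_BUCKETS` and `normalize_bucket` are hypothetical names, not part of the schema or the instrumentation.

```python
import re

# The corrected viewportSizeBucket values listed above.
VALID_BUCKETS = {
    "<320px", "320px-719px", "720px-999px",
    "1000px-1199px", "1200px-2000px", ">2000px",
}


def normalize_bucket(value):
    """Map a logged bucket to its corrected 'px' form, or None if unrecognised."""
    if value is None:
        return None
    if value in VALID_BUCKETS:
        return value
    # Typo form without units, e.g. "720-999" -> "720px-999px".
    match = re.fullmatch(r"(\d+)-(\d+)", value)
    if match:
        candidate = f"{match.group(1)}px-{match.group(2)}px"
        if candidate in VALID_BUCKETS:
            return candidate
    return None


print(normalize_bucket("720-999"))      # 720px-999px
print(normalize_bucket("720px-999px"))  # 720px-999px
print(normalize_bucket(None))           # None
```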

@Jdrewniak Thank you very much for the info! I have documented it in the sub ticket T306558.

All individual tasks are now resolved. Closing this one as well.