Page MenuHomePhabricator

investigate instrumentation of section opens/scrolls
Closed, ResolvedPublic

Description

Is there a way to add myself to a bucket to verify event log instrumentation behavior? perhaps via url appendage?

As noted on this ticket, Tilman is getting some very unexpected results from his query:
https://phabricator.wikimedia.org/T128536#2086621

Related Objects

StatusAssignedTask
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
DeclinedNone
OpenNone
Resolveddr0ptp4kt
DuplicateJhernandez
Duplicatedr0ptp4kt
OpenNone
Resolved Jdlrobson
Resolvedovasileva
DeclinedNone
Resolved Tbayer
Resolved Jdlrobson

Event Timeline

JKatzWMF created this task.Mar 5 2016, 12:01 AM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptMar 5 2016, 12:01 AM

Yes, I too would be interested in ways to test the instrumentation. JonR mentioned earlier in https://phabricator.wikimedia.org/T121772#1889440 how to force events being logged (i.e. how to put oneself in the 1% sample). But note that the code for the instrumentation has already been undeployed since (T124220 ).

I was also wondering earlier how to examine the behavior (not necessarily the events sent) in the test group where the default view was changed to show all sections uncollapsed on initial page load.

(moving further sanity checks and debugging here, so as not to clog up the actual analysis at T128536 )

Here is a tally of event names by group (test vs. control) as a sanity check:

SELECT event_isTestA, event_eventName, COUNT(*) FROM log.MobileWebSectionUsage_15038458 WHERE timestamp LIKE '2016010700%' GROUP BY event_isTestA, event_eventName

event_isTestA   event_eventName COUNT(*)
NULL    entered 2
0       close-section   2703
0       entered 36432
0       open-section    23362
0       scrolled-into-view      126033
1       close-section   5677
1       entered 36305
1       open-section    1280
1       scrolled-into-view      76663

This is consistent with isTestA being set to FALSE (= 0) for the control group with default (collapsed) view and TRUE (=1) for the test group seeing the expanded view:

In the control group (collapsed), there are many more open actions than close actions, and more scroll actions too, presumably because the viewport contains more sections at once. In the test group we actually see quite a few close actions (it's a separate question whether these are accidental or made intentionally by users trying to remove irrelevant article parts from their view while reading). Their number is larger than the number of open actions, as it should be in that condition.

Tbayer added a comment.EditedMar 7 2016, 9:20 PM

PS: the same sanity check with the additional conditions as in the query we are trying to reproduce (e.g. excluding tablets) gives the same picture:

mysql:research@analytics-store.eqiad.wmnet [(none)]> SELECT event_isTestA, event_eventName, COUNT(*) FROM log.MobileWebSectionUsage_15038458 WHERE timestamp LIKE '2016010700%' AND event_isTablet = 0 AND event_namespace = 0 AND event_sessionId <> '' AND event_pageId <> 0 GROUP BY event_isTestA, event_eventName;
...
+---------------+--------------------+----------+
| event_isTestA | event_eventName    | COUNT(*) |
+---------------+--------------------+----------+
|          NULL | entered            |        2 |
|             0 | close-section      |     2510 |
|             0 | entered            |    30827 |
|             0 | open-section       |    23118 |
|             0 | scrolled-into-view |   114001 |
|             1 | close-section      |     5485 |
|             1 | entered            |    30615 |
|             1 | open-section       |     1179 |
|             1 | scrolled-into-view |    64674 |
+---------------+--------------------+----------+
9 rows in set (57 min 0.42 sec)

Update: JonK and I sat down to figure out why this query for the new version of the schema did not reproduce the earlier result about the frequency of section opening actions in the standard view. After quite some detail investigation (including looking at the event series for individual pageview sessions, which was instructive in other ways - will write more about that), it turns out to be the first of the two possible reasons anticipated ;) - one of the event name values was written wrong, event_eventName = 'opened' (which would be more consistent with the other values, but we chose event_eventName = 'open-section' in this schema). I have posted the corrected result in the main task.
One reason why this typo was not identified earlier points to another, older issue: the counting method assumes that there is only one "entered" event for each session and page, and thus the query seemed to show a small number of open events (69x one open, 2x two opens), which threw me off track when searching for the error. Whereas in reality no open events were counted due to the typo, and instead this counts pages being entered twice or thrice during one session. I'm not quite sure this counting behavior is intentional under the instrumentation (perhaps it is expected when the user uses the back button or the reload button, but then again it 71 out of 30754 pageviews would seem really rare for that). Regardless, because this overcounting appears to happen so rarely, I think it's reasonable to continue using this counting method.

So as mentioned, I have been looking at the event sequence for several individual pageview sessions, which has been instructive. One question about the instrumentation arose where JonK and I would like to double-check our assumptions:

When a page is first opened and there are sections within the viewport on initial view, do they trigger a "scrolled-into-view" event or not? My guess is not, but there are a few example sessions where there are a lot of "scrolled-into-view" events within the same second as the "entered" event, cf. below. This could of course just be a fast-fingered user. (Both are from the control group with event_isTestA = 0, i.e. the view where all sections are collapsed initially.)

The relevant code seems to be at https://gerrit.wikimedia.org/r/#/c/257820/6/resources/mobile.toggle/toggle.js @bmansurov or @Jdlrobson , could you give this a quick look and confirm or dispel the above assumption?

For illustration, two example sessions:

In https://en.m.wikipedia.org/wiki/Vallabhbhai_Patel :

SELECT timestamp, event_eventName, event_section FROM log.MobileWebSectionUsage_15038458 WHERE timestamp LIKE '[date redacted]000%' AND event_sessionId = '[redacted]' AND event_pageId = 59659 AND wiki = 'enwiki' ORDER BY timestamp;
timestamp       event_eventName event_section
[date redacted]000058  scrolled-into-view      10
[date redacted]000058  scrolled-into-view      9
[date redacted]000058  scrolled-into-view      7
[date redacted]000058  scrolled-into-view      12
[date redacted]000058  scrolled-into-view      2
[date redacted]000058  entered NULL
[date redacted]000058  scrolled-into-view      11
[date redacted]000058  scrolled-into-view      4
[date redacted]000058  scrolled-into-view      3
[date redacted]000058  scrolled-into-view      6
[date redacted]000058  scrolled-into-view      5
[date redacted]000058  scrolled-into-view      8
[date redacted]000058  scrolled-into-view      0
[date redacted]000058  scrolled-into-view      1

In https://en.m.wikipedia.org/wiki/Karl_Marx :

SELECT timestamp, event_eventName, event_section FROM log.MobileWebSectionUsage_15038458 WHERE timestamp LIKE '[date redacted]000%' AND event_sessionId = '[redacted]' AND event_pageId = 16743 AND wiki = 'enwiki' ORDER BY timestamp
timestamp       event_eventName event_section
[date redacted]000131  scrolled-into-view      11
[date redacted]000131  scrolled-into-view      9
[date redacted]000131  scrolled-into-view      8
[date redacted]000131  scrolled-into-view      13
[date redacted]000131  scrolled-into-view      12
[date redacted]000131  entered NULL
[date redacted]000131  scrolled-into-view      10
[date redacted]000132  scrolled-into-view      3
[date redacted]000132  scrolled-into-view      4
[date redacted]000132  scrolled-into-view      6
[date redacted]000132  scrolled-into-view      2
[date redacted]000132  scrolled-into-view      7
[date redacted]000132  scrolled-into-view      1
[date redacted]000132  scrolled-into-view      0
[date redacted]000132  scrolled-into-view      5

(I don't think the session ID and date are PII, but removing them just in case, since they are not relevant here.)

So looking at https://github.com/wikimedia/mediawiki-extensions-MobileFrontend/commit/e3894a5d15a2abb96064396a15376c25346f1277

it seems like logWhenHeadingIsScrolledIntoView is called for each heading on _enable so any headings that are visible in the first view will get logged.

That said I think this is rare. This will only happen on pages with no lead section. When there is a lead section, most of the time headings will be pushed below the fold. I'd suggest you can verify this by finding an article which has scrolled-into-view events occurring at the same time as the entered event and look at the article (with revision) to see whether it fits this mould.

Hope this helps.

Tbayer added a comment.EditedMar 15 2016, 7:55 PM

For the record, I tried to check this by looking at the events sequence for a page where sections headings are visible on initial load. They definitely exist (say
https://en.m.wikipedia.org/wiki/Telestar ) but tend to be stubs with far too few pageviews to be examined in an efficient manner with this schema, considering the 1% sampling rate and the fact that querying data for a single day tends to take several hours - I gave up finding a good example after hitting "random article" about 20-30 times.

While doing that it occurred to me that that quite a few pages currently display a TOC together with collapsed sections (for non-tablet display sizes), which would put much of this analysis here into question. Fortunately, per JonR this is a recent issue (T124356) that should not affect the data from December/January that is the focus here.

For the record (continuing to use this ticket to list data issues that came up during this investigation): Last week I noticed that the same query gave quite different results depending on whether it was run on s1-analytics-slave.eqiad.wmnet or analytics-store.eqiad.wmnet. (see also the IRC conversation on #wikimedia-analytics). It turns out that this issue affects other EventLogging tables too besides this one. I'm going to file a general bug soon, but for now I have been relying on the data on s1 (which, I've been told, is actually a backup master, despite the name).

For the record (continuing to use this ticket to list data issues that came up during this investigation): Last week I noticed that the same query gave quite different results depending on whether it was run on s1-analytics-slave.eqiad.wmnet or analytics-store.eqiad.wmnet. (see also the IRC conversation on #wikimedia-analytics). It turns out that this issue affects other EventLogging tables too besides this one. I'm going to file a general bug soon,

Now filed as T131236: Identical EventLogging queries give different results on db1047 and dbstore1002

Continuing to use this ticket to list data issues that came up during this investigation:

Earlier in T128536#2156793 I had observed that the session depth (pageviews per session) was surprisingly low, and that we should check the mechanics of the instrumentation about this. @dr0ptp4kt has since observed that the field in question, sessionId (implemented in T119146 by @bd808), might not quite behave as specified in the schema, and only persists during navigation within a single browser tab instead of a browser session. Unfortunately this bug means that the planned comparison of session lengths (no. 4 in T128536) becomes impossible.

@Tbayer, I think we were discussing stuff in the context of Hovercards and the need to determine sample bucketing using the session ID instead of a chaotic variable that changes between page transitions.

I believe the session ID in question survived tabs within the same browser session, but the eligibility for being sampled in wasn't derived from that session ID.

MobileFrontend at commit 97a6c5b5f0f3ba8ad8db8747fafeaead3b4d2c29 suggests to me that the sampling rate at the time was 1%, with the default sample bucketing using a chaotic independent variable. Someone more familiar with the code might be able to speak to whether my interpretation is correct, though.

$ grep samplingRate resources/mobile.loggingSchemas/SchemaMobileWebSectionUsage.js
		samplingRate: 0.01,

$ grep isInBucket resources/mobile.startup/Schema.js
		_isInBucket: undefined,
			} else if ( this._isInBucket === undefined ) {
				this._isInBucket = this.isSampled && Math.random() <= this.samplingRate;
			return this._isInBucket;

@dr0ptp4kt: Right, thanks for clarifying and digging out this code snippet! (My memory of our conversation was a bit hazy.)
To put it another way, it appears that different page views by the same reader in the same session were sometimes in the sampled group, sometimes not - explaining the implausibly low session depth observed in T128536#2156793 .

In any case, it is too late to be fixed here, but should be kept in mind as we instrument other web A/B tests that similarly use session IDs.

@Tbayer, yeah, that's my read. The seven day window in the query and the nature of random sampling affords the possibility that some users would have some number of events > 1, but very low like you observed.

Good to know, thanks! What is this bucketing based on? (session ID, user ID, IP, ....?)

This web page linked to from the web team page may be of interest:
https://www.mediawiki.org/wiki/Reading/Web/EventLogging_best_practices

Yes, it would be good to add more information about good sampling practices there. In the current form that page shouldn't yet be relied upon though - e.g. I have just removed one erroneous statement (see talk page); also I think the advice about sampling rates may be too generic; e.g. regarding tests targeting smaller wikis.

@Tbayer Any more questions about this? Not sure what is left to help you on this. IT should be resolved or updated if there are further questions.

Jhernandez changed the task status from Open to Stalled.May 25 2016, 4:45 PM
Jhernandez triaged this task as Medium priority.
Jhernandez moved this task from Incoming to Needs Prioritization on the Readers-Web-Backlog board.
Tbayer closed this task as Resolved.Jun 15 2016, 10:40 PM

@Tbayer Any more questions about this? Not sure what is left to help you on this. IT should be resolved or updated if there are further questions.

Thanks - I'm closing this now, as no further instrumentation issues came up after the behavior on page load was clarified and we identified the (too late to fix) sampling bug above. (Inbetween I also used this as a kind of tracking bug to record other issues that came up during the analysis, but those have since been filed and handled elsewhere, and this task should probably have stayed focused on instrumentation aspects anyway.)