Page MenuHomePhabricator

Code for InukaPageView instrumentation
Closed, ResolvedPublic

Description

Instrument MediaWiki (probably in WikimediaEvents extensions) to capture the mobile behavior baseline described in https://docs.google.com/document/d/1ulTCYUC3Uhkf7tUfA3y1OSH7ahCMn_EvThAh6ROCFhI/edit

Event Timeline

SBisson edited projects, added Inuka-Team (Kanban); removed Inuka-Team.

Change 551259 had a related patch set uploaded (by Sbisson; owner: Sbisson):
[mediawiki/extensions/WikimediaEvents@master] Instrumentation for InukaPageView

https://gerrit.wikimedia.org/r/551259

Krinkle added subscribers: Catrope, Krinkle.
@Krinkle wrote on Gerrit (Patch Set 13):

This seems to vary server-response for anon users by aspects (user-agent, Geo, personal ID) that do not vary the CDN cache keys, which we do not support as it would pollute the cache and not work at our scale.

@Catrope I imagine you're familiar with this already, which means either I misunderstood something, or there's a way you plan to work around this. If that is the case, consider this unblocked from MediaWiki perf perspective, and instead consider it blocked on CDN/Traffic perspective.

It is my understanding that we cannot allow currently regular anon page views to produce cookied/uncached responses even if we signal that correctly to the CDN layer, it as would break its ability to handle high traffic load for other users in ways that "Bad Things Might Happen". Take on further with Traffic to figure out.

My summary of the gerrit discussion:

The patch was written in a way that completely ignored the reality of the edge caching we have in production. It would have not worked and caused issues. Thanks @Krinkle for catching it and explaining how that works.

To implement event logging for anon users, the tracking script has to be sent to all users and the script itself has to figure out if a user should be included in the tracking or not. This is how other similar tracking (print, readingDepth, searchSatisfaction, etc) is implemented. This is problematic because it increases the payload and processing for a large number of users.

I have reworked my patch to work like other similar tracking. I would like to invite @Krinkle and @BBlack to voice any concern they have about adding an additional tracking script or the way to do it.

@SBisson I looked over the patch and the schema and I have a number of changes to suggest:

  • My intention for the time_on_page field was to know the amount of time the user had the page visible/focused, but currently it just gives the absolute amount of time from page load to page unload. If it's feasible to track visibility time, could we add that as page_visible_time (and maybe rename time_on_page to page_open_time)?
  • I just saw your comment about sections being open by default on mobile web; on Chrome for Android, I see them closed by default. Were you using the mobile website on a desktop browser? I think that when the screen size is tablet-size or higher, sections are opened by default, so they'd be closed by default for KaiOS devices. If that's the case, perhaps we can add a field like section_open_count which tracks the number of sections opened (ideally we'd track first time opens/unique section opens only, but it's fine if it's just a raw count with multiple opens of the same section included). This would also require a count of the number of sections on the page to contextualize.
  • The dt field that's recorded by default will give the time that the event is received by the EventLogging server. Since we'll be holding events until the page unload, we should have a client_dt or load_dt field that gives the actual time the page was loaded so we can do our sessionizing accurately.
  • The schema in its current form won't explicitly tell us if the user is on Special:Search. The fact that the page namespace will be -1 will be a strong signal, but is there any reason not to include an is_search_page variable?
In T238029#5679932, @Neil_P._Quinn_WMF wrote:

@SBisson I looked over the patch and the schema and I have a number of changes to suggest:

  • My intention for the time_on_page field was to know the amount of time the user had the page visible/focused, but currently it just gives the absolute amount of time from page load to page unload. If it's feasible to track visibility time, could we add that as page_visible_time (and maybe rename time_on_page to page_open_time)?

I've added page_open_time and page_visible_time.

  • I just saw your comment about sections being open by default on mobile web; on Chrome for Android, I see them closed by default. Were you using the mobile website on a desktop browser? I think that when the screen size is tablet-size or higher, sections are opened by default, so they'd be closed by default for KaiOS devices. If that's the case, perhaps we can add a field like section_open_count which tracks the number of sections opened (ideally we'd track first time opens/unique section opens only, but it's fine if it's just a raw count with multiple opens of the same section included). This would also require a count of the number of sections on the page to contextualize.

Right, this is responsive. They are initially collapsed on screens narrower than 720px.

I've added section_count and opened_section_count. It's a unique count. I think it tells us more about the "depth" of their interaction with the page.

  • The dt field that's recorded by default will give the time that the event is received by the EventLogging server. Since we'll be holding events until the page unload, we should have a client_dt or load_dt field that gives the actual time the page was loaded so we can do our sessionizing accurately.

Sure, I can add that. What format do you want the date time in?

  • The schema in its current form won't explicitly tell us if the user is on Special:Search. The fact that the page namespace will be -1 will be a strong signal, but is there any reason not to include an is_search_page variable?

DONE

Thank you for the quick responses, @SBisson!

In T238029#5679932, @Neil_P._Quinn_WMF wrote:
  • The dt field that's recorded by default will give the time that the event is received by the EventLogging server. Since we'll be holding events until the page unload, we should have a client_dt or load_dt field that gives the actual time the page was loaded so we can do our sessionizing accurately.

Sure, I can add that. What format do you want the date time in?

ISO-8601 (i.e. 2019-10-02T16:15:30Z), please.

In T238029#5682265, @Neil_P._Quinn_WMF wrote:

[...]
ISO-8601 (i.e. 2019-10-02T16:15:30Z), please.

Done

I've added section_count and opened_section_count. It's a unique count. I think it tells us more about the "depth" of their interaction with the page.

Nice! From the code, it looks like this count is the number of unique sections that the user ever opened, whether or not they're still open when the event is sent. Is that right? (I like that behavior.)

In T238029#5683218, @Neil_P._Quinn_WMF wrote:

I've added section_count and opened_section_count. It's a unique count. I think it tells us more about the "depth" of their interaction with the page.

Nice! From the code, it looks like this count is the number of unique sections that the user ever opened, whether or not they're still open when the event is sent. Is that right? (I like that behavior.)

That's right.

nshahquinn-wmf added a subscriber: AMuigai.

Thinking about sections made me think of another thing: our operating system filter will still catch Android and iOS tablets where the sections are big enough to be opened by default. The experience on these devices isn't really relevant to KaiOS, so I feel like we should use a screen-size filter to limit our data collection to phones only. @AMuigai, @SBisson what do you think?

I don't have any idea right now what exactly the cutoff would be, but if we decide we want to do this, I'll do some research.

@SBisson I think we're planning to use this same schema for KaiOS app events too; if so, we should probably change the os field to something like`platform` and have kaios-web and kaios-app as possible values.

Makes sense to me on both fronts @Neil_P._Quinn_WMF

I think the collection that is heavy on cookies and tracking should have been reviewed by our privacy engineer @JFishback_WMF .

@Neil_P._Quinn_WMF
<T238029+public+623660044037b780@phabricator.wikimedia.org> This is related
to the other ticket T242853 https://phabricator.wikimedia.org/T242853. I
wanted to see:

  1. If your query would show similar results or behaviour to the data we're

seeing in turnilo

  1. Determine if there are month on month fluctuations (given that I am

unable to confirm the numbers on turnilo due to the drop that we're yet to
make sense of).

Change 551259 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@master] Instrumentation for InukaPageView

https://gerrit.wikimedia.org/r/551259

SBisson added a subscriber: phuedx.

A very special thanks to @phuedx for reviewing and merging the patch!!

This task is now for @nshahquinn-wmf to test and resolve once the events are flowing in successfully and we are confident the data is usable.

Change 566391 had a related patch set uploaded (by Sbisson; owner: Sbisson):
[operations/mediawiki-config@master] Enable InukaPageView instrumentation in labs

https://gerrit.wikimedia.org/r/566391

Change 566391 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable InukaPageView instrumentation in labs

https://gerrit.wikimedia.org/r/566391

@nshahquinn-wmf This is enabled in betalabs. After hacking my GeoIP cookie to make it look like I'm in India, I was able to see the events being sent from the browser. Do you happen to know where the betalabs events end up?

Change 566572 had a related patch set uploaded (by Sbisson; owner: Sbisson):
[mediawiki/extensions/WikimediaEvents@master] InukaPageView: update schema version

https://gerrit.wikimedia.org/r/566572

Change 566572 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@master] InukaPageView: update schema version

https://gerrit.wikimedia.org/r/566572

Change 566579 had a related patch set uploaded (by Catrope; owner: Sbisson):
[mediawiki/extensions/WikimediaEvents@wmf/1.35.0-wmf.16] InukaPageView: update schema version

https://gerrit.wikimedia.org/r/566579

Change 566579 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@wmf/1.35.0-wmf.16] InukaPageView: update schema version

https://gerrit.wikimedia.org/r/566579

Mentioned in SAL (#wikimedia-operations) [2020-01-22T19:36:32Z] <catrope@deploy1001> Synchronized php-1.35.0-wmf.16/extensions/WikimediaEvents/: InukaPageView: update schema version (T238029) (duration: 01m 07s)

Change 566596 had a related patch set uploaded (by Sbisson; owner: Sbisson):
[mediawiki/extensions/WikimediaEvents@master] InukaPageView: update schema version

https://gerrit.wikimedia.org/r/566596

Change 566596 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@master] InukaPageView: update schema version

https://gerrit.wikimedia.org/r/566596

Change 566600 had a related patch set uploaded (by Sbisson; owner: Sbisson):
[mediawiki/extensions/WikimediaEvents@wmf/1.35.0-wmf.16] InukaPageView: update schema version

https://gerrit.wikimedia.org/r/566600

@nshahquinn-wmf I was able to test on betalabs. I discovered and fixed two issues: 1) there was a typo in the "client_type" field in the schema, 2) the referring_domain was mandatory but there are valid cases where the is just no referrer so I made it optional.

CentralNotice seem to be installed but window.Geo is not populated. I had to put a breakpoint at the beginning of our script so I can populated it before our script tries to read it. It is populated correctly in production so it should work.

It is populated alright. I deleted my cookies, refreshed, and it's there and working fine.

Change 566600 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@wmf/1.35.0-wmf.16] InukaPageView: update schema version

https://gerrit.wikimedia.org/r/566600

Mentioned in SAL (#wikimedia-operations) [2020-01-23T19:10:31Z] <urbanecm@deploy1001> Synchronized php-1.35.0-wmf.16/extensions/WikimediaMessages/extension.json: SWAT: 23a6f8e: InukaPageView: update schema version (T238029) (duration: 01m 05s)

Thanks for testing on Beta cluster, @SBisson! I see two server log entries here so I'm not sure when exactly all your code will be live; depending on the answer I will check the production data either this week or next (T244174).

Otherwise, I think we can close this!

Change 570381 had a related patch set uploaded (by Sbisson; owner: Sbisson):
[operations/mediawiki-config@master] Enable InukaPageView logging on production Wikipedias

https://gerrit.wikimedia.org/r/570381

Change 570381 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable InukaPageView logging on production Wikipedias

https://gerrit.wikimedia.org/r/570381

Mentioned in SAL (#wikimedia-operations) [2020-02-05T19:19:41Z] <jforrester@deploy1001> Synchronized wmf-config/InitialiseSettings.php: T238029 Enable InukaPageView logging on production Wikipedias (duration: 01m 07s)

This was enabled in production just now.

I'm seeing events flowing into the production database, so I think this is done! I'll do a deeper check next week, when more data is available (T244174).

SELECT COUNT(*) as events
FROM inukapageview
WHERE year > 0

events	
77019

A very special thanks to @phuedx for reviewing and merging the patch!!

You're most welcome. Shout outs to @Krinkle and @Nuria who also provided review and stellar discussion/collation of information around the lifetimes of JS-based cookies.