Update event logging used for Hovercards
Closed, Resolved; Public; 8 Estimated Story Points

Description

As a data analyst, I want to understand link interaction without and with HoverCards so that I can help inform decisions regarding tuning and rollout of HoverCards.

Acceptance criteria:

We're updating https://meta.wikimedia.org/wiki/Schema:Popups to capture all relevant EventLogging data for this task.

Some specs:

  • Sampling in for event logging should be done on a session-wide funnel basis (mw.user.sessionid()).

[x] Specify the sampling rate on a per-wiki basis. Use the default sampling rate otherwise.

  • Only sample in to log when UA is sendBeacon capable. Do not log when UA is not sendBeacon capable.
  • State machine is demarcated for determining session-, page-wide funnel and per-link-dwell subfunnels. A few gotchas: beware double clicks and clearly determine behavior for back and forward button navigation with respect to creation of page tokens.

[x] Only fire link-related events for links which would be eligible for a preview.
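
A minimal sketch of how the sampling specs above could fit together, assuming a deterministic hash of the session token so that the decision stays the same on every page of a session. The function names, the config shape (`defaultRate`, `perWiki`) and the use of an FNV-1a hash are illustrative assumptions, not what the extension actually does; in the browser the token would come from mw.user.sessionid() and `nav` would be `window.navigator`.

```javascript
// Hypothetical helper: map a session token to a number in [0, 1).
// Uses a 32-bit FNV-1a hash so the same token always yields the same value.
function tokenToUnitInterval(sessionToken) {
  let hash = 2166136261; // FNV-1a 32-bit offset basis
  for (let i = 0; i < sessionToken.length; i++) {
    hash ^= sessionToken.charCodeAt(i);
    hash = Math.imul(hash, 16777619) >>> 0; // FNV prime, kept as unsigned 32-bit
  }
  return hash / 4294967296; // divide by 2^32
}

// Hypothetical helper: decide whether this session is sampled in.
// config = { defaultRate: number, perWiki: { [wikiId]: number } } (assumed shape).
function isSampledIn(sessionToken, wikiId, config, nav) {
  // Do not log at all when the UA is not sendBeacon capable.
  if (!nav || typeof nav.sendBeacon !== 'function') {
    return false;
  }
  const hasOverride = Object.prototype.hasOwnProperty.call(config.perWiki, wikiId);
  const rate = hasOverride ? config.perWiki[wikiId] : config.defaultRate;
  return tokenToUnitInterval(sessionToken) < rate;
}
```

Hashing the token rather than calling Math.random() on each page is what keeps the whole session either in or out of the sample.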

Add these items to the event logging funnel:

  • Page loaded event w/ HoverCards disposition
  • Initial dwell 250ms or longer, but abandoned (and basic reason attached). (Non-abandoned is captured already.)

When logging events, in addition to existing event logging, do these things:

  • Record user edit count bucket and whether logged in / anonymous user (like https://meta.wikimedia.org/wiki/Schema:QuickSurveysResponses)
  • Hover bucket count (this value should also always be incremented even if the user isn't EL sampled; model using the approach from language switching event logging - 0, 1-4, 5-20, 21+) - increment only upon preview actually shown (we'll also need to update https://wikimediafoundation.org/wiki/Cookie_statement#3._What_types_of_cookies_does_Wikimedia_use.3F and https://wikimediafoundation.org/wiki/Privacy_policy/FAQ#Can_you_give_me_some_examples_of_types_of_cookies_and_how_you_use_local_storage.3F)
  • Session funnel token (mw.user.sessionid()) and mechanism to ensure funnel analysis across multiple pages (but not past browser restarts).
  • Page loaded token
  • Link interaction token
  • Destination namespaceid or equivalent
  • Source namespaceid or equivalent
  • Source Pageid
  • Amount of time between hover and the time it actually rendered (determined exclusively client side)
  • If errored out or timed out instead of rendering
  • What was clicked: destination|settings cog
  • HoverCards disabled by user events. The instrumentation ought to be set up such that it's possible to reconstruct how the dwell delay and the time between initiation of dwell and actual render play into hover length and disablement of the feature.
  • Timing of clickthroughs (or error; granted sometimes errors means EL won't even work, but it's conceivable there's a service outage for the specific endpoint)
  • Versioning (which Hovercards UX?) with flexibility for A/B/C tests
  • API endpoint: mwapi or restbase (signoff note: presently only mwapi is available, and is thus hardcoded; support for RestBase would need to add the appropriate support)
  • Evaluate the difficulty of excluding mobile touch UAs (because they don't have hover capabilities) from sending events, or alternatively estimate the error introduced by including them (signoff note: mobile devices in particular turned out to be relatively small portion of desktop domain traffic)
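
As a rough illustration of the hover bucket count item in the list above: the counter is incremented whenever a preview is actually shown, regardless of EL sampling, and only the bucket is logged. The storage key and function names are hypothetical, and `storage` stands in for anything with `getItem`/`setItem` (for example `window.localStorage`); the bucket boundaries are the ones given in the description (0, 1-4, 5-20, 21+).

```javascript
// Hypothetical storage key, for illustration only.
const PREVIEW_COUNT_KEY = 'hypothetical-preview-count';

// Increment the stored count each time a preview is actually shown.
// This runs whether or not the user is sampled in for event logging.
function recordPreviewShown(storage) {
  const current = parseInt(storage.getItem(PREVIEW_COUNT_KEY), 10) || 0;
  const next = current + 1;
  storage.setItem(PREVIEW_COUNT_KEY, String(next));
  return next;
}

// Map a raw count to the bucket label that would be logged.
function getPreviewCountBucket(count) {
  if (count === 0) {
    return '0';
  }
  if (count <= 4) {
    return '1-4';
  }
  if (count <= 20) {
    return '5-20';
  }
  return '21+';
}
```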

See also T88166 (task from 2015 with discussions that led to the present version of the schema)

Event Timeline

bmansurov subscribed.

Please review the patch above and someone else pick up this task. And don't forget to base your new patch off of the above patch.

Change 289065 merged by jenkins-bot:
Add properties that will be logged with each EL request

https://gerrit.wikimedia.org/r/289065

Jdlrobson updated the task description.

@dr0ptp4kt can you confirm that all the above checklist items are captured by the more simplified acceptance criteria of:

?

@Jdlrobson Yes. If you like, you can replace the stuff in the Description starting from "Add these items to the event logging funnel:" and below with that simplified criteria.

Following up on the exclusion of mobile touch UAs, the following has not been resolved yet:

...

@Tbayer, the rationale I recall is it's the notional aggregate data that will be of interest. Your question is a good one. As a rough proxy I think we could query Hive with an rlike operator reusing the VCL mobile detection applied to, for example NS 0 pageview on en.wikipedia.org and see if it amounts to much.

Has anyone done this in the meantime? If not, where can one find that VCL code (and is it straightforward to convert it to an HQL operator)?

Meanwhile, @jhobs, if you've already confined Hovercards EL to exclude touch only UAs (or if the code already does so), great - please let us know. If not, is the work to do so significant (cf. http://stackoverflow.com/questions/4817029/whats-the-best-way-to-detect-a-touch-screen-device-using-javascript) or is there an existing function for as much? I suppose one approach is copying the mobile detection UA regex (exists in MF PHP plus Varnish VCL) into the JS within the extension.

It seems this question hasn't been answered yet. Perhaps @bmansurov or @Jhernandez have thoughts on this?

I don't know of any existing mobile detection function. Are we worried that the desktop site will be shown to devices that don't have a hover capability?

I don't know of any existing mobile detection function.

We have one operating on the Varnishes, of course. That's what Adam's VCL remark was about (and my follow-up question on where to find the code for it).

Are we worried that the desktop site will be shown to devices that don't have a hover capability?

Yes, that's basically the concern. Among other things, if a substantial part of the test group actually aren't seeing Hovercards anyway because of device capability, we would be underestimating the effects that Hovercards have on reader behavior.

@bmansurov I'm getting the following errors on master:
load.php?debug=false&lang=en&modules=ext.echo.api%2Cinit|ext.eventLogging|ext.eventLogging.subscrib…:13 [Popups] Undeclared property "namespaceIdHover"
load.php?debug=false&lang=en&modules=ext.echo.api%2Cinit|ext.eventLogging|ext.eventLogging.subscrib…:13 [Popups] Undeclared property "editCountBucket"
load.php?debug=false&lang=en&modules=ext.echo.api%2Cinit|ext.eventLogging|ext.eventLogging.subscrib…:13 [Popups] Undeclared property "isAnon"
load.php?debug=false&lang=en&modules=ext.echo.api%2Cinit|ext.eventLogging|ext.eventLogging.subscrib…:13 [Popups] Undeclared property "pageIdSource"
load.php?debug=false&lang=en&modules=ext.echo.api%2Cinit|ext.eventLogging|ext.eventLogging.subscrib…:13 [Popups] Undeclared property "namespaceIdSource"

Stack trace (the same for each of the first four errors):

(anonymous function) @ load.php?debug=false&lang=en&modules=ext.echo.api%2Cinit|ext.eventLogging|ext.eventLogging.subscrib…:13
handler @ load.php?debug=false&lang=en&modules=jquery%2Cmediawiki&only=scripts&skin=vector&version=GiUT07Vz:155
fire @ load.php?debug=false&lang=en&modules=jquery%2Cmediawiki&only=scripts&skin=vector&version=GiUT07Vz:45
fire @ load.php?debug=false&lang=en&modules=jquery%2Cmediawiki&only=scripts&skin=vector&version=GiUT07Vz:45
self.fireWith @ load.php?debug=false&lang=en&modules=jquery%2Cmediawiki&only=scripts&skin=vector&version=GiUT07Vz:46
self.fire @ load.php?debug=false&lang=en&modules=jquery%2Cmediawiki&only=scripts&skin=vector&version=GiUT07Vz:46
mw.track @ load.php?debug=false&lang=en&modules=jquery%2Cmediawiki&only=scripts&skin=vector&version=GiUT07Vz:155
mw.popups.render.closePopup @ load.php?debug=false&lang=en&modules=ext.echo.api%2Cinit|ext.eventLogging|ext.eventLogging.subscrib…:39
(anonymous function) @ load.php?debug=false&lang=en&modules=ext.echo.api%2Cinit|ext.eventLogging|ext.eventLogging.subscrib…:40
fire @ load.php?debug=false&lang=en&modules=jquery%2Cmediawiki&only=scripts&skin=vector&version=GiUT07Vz:45
self.fireWith @ load.php?debug=false&lang=en&modules=jquery%2Cmediawiki&only=scripts&skin=vector&version=GiUT07Vz:46
deferred.(anonymous function) @ load.php?debug=false&lang=en&modules=jquery%2Cmediawiki&only=scripts&skin=vector&version=GiUT07Vz:47
(anonymous function) @ load.php?debug=false&lang=en&modules=ext.echo.api%2Cinit|ext.eventLogging|ext.eventLogging.subscrib…:40

Should we revert "Add properties that will be logged with each EL request" ?
This seems to be the culprit.
cc @phuedx

@Jdlrobson: @bmansurov is working on a fix for the bug so I think we can hold off for a few hours…

Change 289459 had a related patch set uploaded (by Bmansurov):
Do not send new schema values just yet

https://gerrit.wikimedia.org/r/289459

^ Fixes the issue Jon's reported.

Change 289459 abandoned by Jdlrobson:
Do not send new schema values just yet

Reason:
Thanks for the prompt update, Baha. I'm reverting (https://gerrit.wikimedia.org/r/289495) and then I'll re-revert so we don't lose, and can build on, the good work you've done.

https://gerrit.wikimedia.org/r/289459

Change 289553 had a related patch set uploaded (by Bmansurov):
WIP: Switch to Schema:Popups revid 15597282

https://gerrit.wikimedia.org/r/289553

Tilman, I apologize for the delayed response about the VCL.

The Varnish VCL is at https://github.com/wikimedia/operations-puppet/blob/production/templates/varnish/text-frontend.inc.vcl.erb#L21.

I was quarreling with Hive over CASE and IF statements today, so just went simple with the following queries.

Using 20160517 (but not using the WML Accept type condition)

  • On Hungarian desktop Wikipedia it looks like the mobilish UA pageviews are about 0.34% of total.
  • Globally on desktop Wikipedias it looks like the mobilish UA pageviews are about 0.7% of total.

This probably suggests mobile UAs would be a negligible factor in calculations with the event logging schema on this task.

Caveat: Again, the WML-bearing Accept header condition isn't factored in here, as I don't see it as part of Hive. So that could push the figure up somewhat, but I doubt by much. Anecdotally that was fading already and the bigger regex pattern captures some, if not most, clients that historically would have theoretically supported WML. WML has of course been disabled on the servers because of low prevalence, although that's a separate matter.

users accessing Hungarian desktop Wikipedia, mobilish

select count(1)
from webrequest
where
year = 2016
and month = 5
and day = 17
and normalized_host.project_class = 'wikipedia'
and normalized_host.project = 'hu'
and access_method = 'desktop'
and agent_type = 'user'
and is_pageview
and (
user_agent rlike '(?i)(mobi|240x240|240x320|320x320|alcatel|android|audiovox|bada|benq|blackberry|cdm-|compal-|docomo|ericsson|hiptop|htc[-_]|huawei|ipod|kddi-|kindle|meego|midp|mitsu|mmp\/|mot-|motor|ngm_|nintendo|opera.m|palm|panasonic|philips|phone|playstation|portalmmm|sagem-|samsung|sanyo|sec-|semc-browser|sendo|sharp|silk|softbank|symbian|teleca|up.browser|vodafone|webos)'
or
user_agent rlike '^(?i)(lge?|sie|nec|sgh|pg)-'
);

3547

users accessing Hungarian desktop Wikipedia, general

select count(1)
from webrequest
where
year = 2016
and month = 5
and day = 17
and normalized_host.project_class = 'wikipedia'
and normalized_host.project = 'hu'
and access_method = 'desktop'
and agent_type = 'user'
and is_pageview;

1054149


users accessing desktop Wikipedia, mobilish

select count(1)
from webrequest
where
year = 2016
and month = 5
and day = 17
and normalized_host.project_class = 'wikipedia'
and access_method = 'desktop'
and agent_type = 'user'
and is_pageview
and (
user_agent rlike '(?i)(mobi|240x240|240x320|320x320|alcatel|android|audiovox|bada|benq|blackberry|cdm-|compal-|docomo|ericsson|hiptop|htc[-_]|huawei|ipod|kddi-|kindle|meego|midp|mitsu|mmp\/|mot-|motor|ngm_|nintendo|opera.m|palm|panasonic|philips|phone|playstation|portalmmm|sagem-|samsung|sanyo|sec-|semc-browser|sendo|sharp|silk|softbank|symbian|teleca|up.browser|vodafone|webos)'
or
user_agent rlike '^(?i)(lge?|sie|nec|sgh|pg)-'
);

2019935


users accessing desktop Wikipedia, general

select count(1)
from webrequest
where
year = 2016
and month = 5
and day = 17
and normalized_host.project_class = 'wikipedia'
and access_method = 'desktop'
and agent_type = 'user'
and is_pageview;

308694234
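
For reference, the shares quoted above can be reproduced from the four query results. This is just the arithmetic on the counts returned by the queries; the object and function names are made up for illustration.

```javascript
// Counts copied from the four Hive query results above (2016-05-17).
const pageviewCounts = {
  huDesktop: { mobilish: 3547, total: 1054149 },
  allDesktop: { mobilish: 2019935, total: 308694234 }
};

// Share of "mobilish" UA pageviews as a percentage of all pageviews.
function sharePercent(part, total) {
  return (part / total) * 100;
}

const huShare = sharePercent(pageviewCounts.huDesktop.mobilish, pageviewCounts.huDesktop.total);
const allShare = sharePercent(pageviewCounts.allDesktop.mobilish, pageviewCounts.allDesktop.total);

console.log('Hungarian desktop Wikipedia: ' + huShare.toFixed(2) + '%');
console.log('All desktop Wikipedias: ' + allShare.toFixed(2) + '%');
```

That works out to roughly 0.34% for Hungarian desktop Wikipedia and roughly 0.65% globally, consistent with the "about 0.7%" figure.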

@dr0ptp4kt Great, thanks! I agree that at these levels, it is a reasonable simplification.

Thanks also for digging out these mobile detection regexes, I might reuse that for other questions.
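
If the idea of copying the mobile detection regex into the extension's JS were ever pursued, a sketch could look like the following. The two patterns are the ones from the Hive queries above, with the Java-style inline `(?i)` flag replaced by the JavaScript `i` flag; the constant and function names are hypothetical. Note that this detects "mobilish" user agent strings, not touch or hover capability as such.

```javascript
// Pattern matched anywhere in the UA string (copied from the Hive query above).
const MOBILISH_ANYWHERE = /(mobi|240x240|240x320|320x320|alcatel|android|audiovox|bada|benq|blackberry|cdm-|compal-|docomo|ericsson|hiptop|htc[-_]|huawei|ipod|kddi-|kindle|meego|midp|mitsu|mmp\/|mot-|motor|ngm_|nintendo|opera.m|palm|panasonic|philips|phone|playstation|portalmmm|sagem-|samsung|sanyo|sec-|semc-browser|sendo|sharp|silk|softbank|symbian|teleca|up.browser|vodafone|webos)/i;

// Pattern matched only at the start of the UA string.
const MOBILISH_PREFIX = /^(lge?|sie|nec|sgh|pg)-/i;

// Hypothetical helper: true when the UA string looks like a mobile device.
function isMobilishUA(userAgent) {
  return MOBILISH_ANYWHERE.test(userAgent) || MOBILISH_PREFIX.test(userAgent);
}
```

In the browser this would be called with `navigator.userAgent` before deciding whether to send events.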

@dr0ptp4kt @Tbayer I noticed the duration property has been removed in the new version of the schema. Is that intentional? I didn't find any conversation around it. The current schema is here.

As remarked in this edit summary, it was removed based on the assumption that it is redundant because duration = totalInteractionTime - perceivedWait. (That's still true, right?)

That rings a bell. @bmansurov, good for you? If it's a problem, please amend as needed (e.g., by reinstating it so you don't have to delete the old code that collected and sent duration), retaining the new totalInteractionTime and perceivedWait.

Thanks, both.

... duration = totalInteractionTime - perceivedWait. (That's still true, right?)

Yes

I think I'll delete the code that calculated duration. It was a hack anyway.
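
With duration dropped from the schema, it can still be recovered at analysis time from the two remaining fields. A minimal sketch, assuming both values are logged as numbers in the same unit; the function name and the `null` fallback for incomplete events are illustrative choices.

```javascript
// Hypothetical analysis-side helper: recover the removed "duration" value
// using duration = totalInteractionTime - perceivedWait.
function deriveDuration(event) {
  if (typeof event.totalInteractionTime !== 'number' ||
      typeof event.perceivedWait !== 'number') {
    return null; // one of the fields is missing, so duration cannot be derived
  }
  return event.totalInteractionTime - event.perceivedWait;
}
```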

@dr0ptp4kt @Tbayer Should we log "opened in a new/same tab/window" actions only when the user clicks on the hovercard or should we send those actions when the user clicks on the link itself too?

@bmansurov, yes, please log those on link clicks, too.

I'm working from the assumption that you'll likely bind to something like mw.popups.getAction for that link click action. Presently event.which === 3 (alternate/right click, usually right mouse) would implicitly be assumed to be "opened in same tab", which seems reasonable enough. Formally speaking, the user could take any sort of action from there. @bmansurov, @Tbayer please speak up if you think it's necessary to add yet another field to action such as alternateclick (and if so, in some follow-on task as a matter of protocol?).

@dr0ptp4kt yes, it looks like a right-click (two-finger click) is not being handled. This kind of click would open a context menu, so things get complicated at this stage. I don't know if we should care in such cases. I think it may be a good idea to record the right click, though.

@bmansurov, is that something you'd do in this task or something that should be done as a follow on task?

Probably as a follow on task because the current patch has become big already.
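
To make the click discussion above concrete, here is a rough sketch of how a click event might be mapped to an action label. This is not the actual mw.popups.getAction implementation; the function name is hypothetical, and the modifier-key mapping (Ctrl/Cmd for a new tab, Shift for a new window) is an assumption based on typical browser behavior. A right click (event.which === 3) falls through to "opened in same tab", as described above.

```javascript
// Hypothetical classifier for link and preview clicks.
// `event` is expected to look like a jQuery click event
// (which: 1 = left, 2 = middle, 3 = right, plus modifier key flags).
function classifyClick(event) {
  if (event.which === 2 || event.ctrlKey || event.metaKey) {
    return 'opened in new tab';
  }
  if (event.shiftKey) {
    return 'opened in new window';
  }
  // Plain left clicks and, implicitly, right clicks end up here.
  return 'opened in same tab';
}
```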

I think the patch is in a somewhat OK shape. Please review. I'll add tests in a follow up patch. I know it's backwards but given the time constraints I wanted to get the patch out first.

@dr0ptp4kt thanks, I was going to write something similar too :)

@Tbayer @JKatzWMF: @dr0ptp4kt and I had a conversation about not logging events after a click event is recorded. For example, a user may hover over a link and middle-click it, thus triggering "opened in a new tab", and then proceed to click on the settings cog icon. This last event won't be recorded, as that will simplify the analysis and recording. One exception to this rule is when the user disables Popups. By definition, it takes two clicks to disable Popups: clicking on the settings cog and then saving the disabled state.

Hopefully the reasoning is clear. Let me know if I can explain better.
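
The rule described above could be sketched as a small per-link-interaction gate. The action names 'tapped settings cog' and 'disabled' are placeholders chosen for this example rather than confirmed schema values, and `send` stands in for whatever actually logs the event.

```javascript
// Click-type actions; labels other than the "opened in ..." ones are hypothetical.
const CLICK_ACTIONS = [
  'opened in new tab',
  'opened in new window',
  'opened in same tab',
  'tapped settings cog'
];

// Create one logger per link interaction. After the first click action is
// recorded, later events are dropped, except for the "disabled" action,
// which by definition follows a click on the settings cog.
function createInteractionLogger(send) {
  let clickRecorded = false;
  return function log(action) {
    if (clickRecorded && action !== 'disabled') {
      return false; // suppressed
    }
    if (CLICK_ACTIONS.indexOf(action) !== -1) {
      clickRecorded = true;
    }
    send(action);
    return true;
  };
}
```

For example, a middle click followed by a click on the cog results in only the first action being sent, while a cog click followed by saving the disabled state sends both.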

Adding tests that cover every new functionality is proving to be very time consuming. For that reason manual testing is highly encouraged. Here is a spreadsheet that you can use to verify whether events are being logged correctly.

We can always keep adding tests iteratively afterwards.

Change 289553 merged by jenkins-bot:
Switch to Schema:Popups revid 15597282

https://gerrit.wikimedia.org/r/289553

Where can this be tested and how?

We'll have to set it up on staging with a sampling rate of 1 for testing. I'll post the link here when I'm done.

@dr0ptp4kt you can test the feature by going to http://reading-web-staging.wmflabs.org/w/index.php?title=Main_Page&mobileaction=toggle_view_desktop (and enabling Popups if not already enabled; you need to log in too). In the web console network tab, look for requests to event.gif, which carry the data that's being logged. Btw, there is a link to a test page at the bottom of the main page; hovering over that link should display a popup too.

Thanks @bmansurov. At a very quick glance, much of it seems to be working. One thing I noticed, though: when the Hovercards beta feature is enabled yet the user has used the cog to disable previews, pageLoaded events are firing as expected, but dwellButAbandoned and opened in {new tab|new window|same tab} events aren't being sent. Would you please look into that and patch?

Steps to reproduce:

  1. Log in at http://reading-web-staging.wmflabs.org
  2. Go to Preferences > Beta features and check the checkbox for Hovercards
  3. On the Main Page, dwell on the link "Test" > Settings cog > Disable previews > Save
  4. Notice the pageLoaded action is recorded, but subsequent events on the particular target (i.e., dwell abandonment post 250ms dwell, plus clickthroughs) are missing.

@dr0ptp4kt opened in ... actions are firing, only dwelledButAbandoned isn't.

Note to future self: I restarted FF and it started showing logging in the network tab for the opened in ... events. Chrome was also showing logging on those in the first case. @bmansurov is working on dwelledButAbandoned.

Change 290749 had a related patch set uploaded (by Bmansurov):
Send dwelledButAbandoned action for links when popups are turned off

https://gerrit.wikimedia.org/r/290749

Could we add subtasks for new work that's unplanned and comes out of this work? This task is already a behemoth and it's going to be hard to follow the narrative.

This was expected, but missing functionality. Other stuff, though, yes, agreed if there are any new specs.

@dr0ptp4kt Have a look, the dwellButAbandoned should now be sent properly when hovercards are disabled.

Change 290749 merged by jenkins-bot:
Send dwelledButAbandoned action for links when popups are not enabled

https://gerrit.wikimedia.org/r/290749

I'm only seeing dwelledButAbandoned when Hovercards are enabled (i.e., moving the mouse away from the link after 250ms but before the preview renders).

Although the pageLoaded and opened in... events are firing when Hovercards is disabled, I'm still not seeing dwelledButAbandoned events. 🐪

I'm trying with both Firefox and Chrome with DNT turned off from non-private, private, cache cleared, cache not cleared, no luck.

I've additionally tried dwelling on the link for various amounts of time, thinking maybe there was a timing thing I could uncover, but no luck there either.

@Jhernandez, @bmansurov, able to reproduce? You seeing the dwelledButAbandoned when Hovercards disabled? I gathered as much based on the submitted patch and comments, but was wondering if you could screencap your flow? I could hop on for a screenshare if that helps, although not sure how our schedules will line up today.

Following up from IRC: reading-web-staging just needed to be updated. Back to testing...

As far as I can tell, all the acceptance criteria are met, but I'm going to take one more pass at this tomorrow.