Capture hovercards fetches as previews in analytics
Closed, Resolved · Public · 2 Story Points

Description

As a data analyst, I want to be able to easily distinguish HoverCards hovers as preview-classified requests in Hive, so that I can ascertain their impact on user interaction with the site.

Acceptance criteria

  • API web requests triggered on a hover will include an HTTP header X-Analytics: preview=1 so that they can be classified in Hive

HoverCards to stable desktop web is tentatively slated for Q4 FY 2015-2016 (April-June, 2016), and this is a hard requirement of that work.

Event Timeline

Restricted Application added a subscriber: Aklapper. · Mar 9 2016, 11:54 PM
dr0ptp4kt updated the task description. · Mar 10 2016, 12:02 AM

(Compare also T128887 for link previews on Android)

Prtksxna added a subscriber: Prtksxna.

In the Android app, add x-analytics request header with value pageview=1 for both RESTBase and MediaWiki api.php endpoints.

Should hovercards do the same?

He7d3r added a subscriber: He7d3r. · Mar 10 2016, 10:07 AM
Nuria added a comment. · Mar 10 2016, 4:24 PM

In the Android app, add x-analytics request header with value pageview=1 for both RESTBase and MediaWiki api.php endpoints.

Should hovercards do the same?

That seems the simplest alternative. We can update docs when code review is in place.

Sorry I forgot to mention this earlier: One thing that requires investigation if we do header enrichment via JavaScript is whether all ResourceLoader-compatible UAs actually support such client side header enrichment. Does anyone know for a fact?

If we start the practice of enriching X-Analytics by the web client, particularly on such a visible feature, we should consider instituting some sort of JavaScript hook whereby anyone interested in enriching the header client side will do so in a non-destructive manner (i.e., it is additive, there's no possibility of unintentionally wiping someone else's key-value pair in the header, etc.). Unlike the app, where there's a small set of developers who have tacit knowledge of the header enrichment piece, on the web there's a relatively larger set of developers, and I foresee well-intentioned people doing stuff with the header without realizing they might be breaking someone else's X-Analytics client side enrichment.
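The additive hook described above could look roughly like the following. This is only a sketch: the class name, the notion of an "owner", and the second key are hypothetical, it is written in Python rather than the JavaScript a real hook would use, and it assumes the semicolon-separated key=value form that X-Analytics uses.

```python
class XAnalyticsRegistry:
    """Collects key-value pairs from independent contributors (hypothetical helper)."""

    def __init__(self) -> None:
        self._pairs: dict[str, str] = {}
        self._owners: dict[str, str] = {}

    def add(self, owner: str, key: str, value: str) -> None:
        # Refuse to overwrite a key that a different contributor already set.
        if key in self._pairs and self._owners[key] != owner:
            raise KeyError(f"{key!r} is already set by {self._owners[key]!r}")
        self._pairs[key] = value
        self._owners[key] = owner

    def header_value(self) -> str:
        # Serialize in insertion order as key=value pairs joined by ';'.
        return ";".join(f"{k}={v}" for k, v in self._pairs.items())


registry = XAnalyticsRegistry()
registry.add("hovercards", "preview", "1")
registry.add("other-feature", "example", "abc")  # hypothetical second contributor
print(registry.header_value())  # preview=1;example=abc
```

The point of the design is that contributors never touch the header string directly, so one feature cannot silently wipe another feature's pair.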

jhobs triaged this task as Medium priority. · Mar 10 2016, 6:02 PM
Milimetric moved this task from Incoming to Radar on the Analytics board. · Mar 21 2016, 4:18 PM
Jdlrobson updated the task description. · Mar 22 2016, 9:18 PM
Jdlrobson updated the task description.
Jdlrobson updated the task description.
Jdlrobson added a subscriber: Jdlrobson.

Can you update the description with what the header name and value needs to be?

@bearND do you have the code samples handy for header enrichment for previews?

@Jdlrobson, @dr0ptp4kt We just add a header like this:

x-analytics: preview=1

This is for both MW API and RESTBase API usage. See the links for actual implementation. Similarly you can find in the same files the ones for pageview=1. This should be promoted to production in the next couple of days.
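For illustration only, tagging a request this way might look like the sketch below. It is Python rather than the Android or JavaScript client code referred to above; the helper name and both URLs are assumptions chosen for the example, and nothing is actually sent over the network.

```python
from urllib.request import Request

# Header name and value as given in this task.
X_ANALYTICS_NAME = "X-Analytics"
X_ANALYTICS_VALUE = "preview=1"


def build_preview_request(url: str) -> Request:
    """Return a GET request tagged as a preview (hypothetical helper)."""
    req = Request(url, method="GET")
    req.add_header(X_ANALYTICS_NAME, X_ANALYTICS_VALUE)
    return req


# Illustrative URLs for the two back ends; only the header matters here.
mw = build_preview_request(
    "https://en.wikipedia.org/w/api.php?action=query&format=json&titles=Australia"
)
rb = build_preview_request(
    "https://en.wikipedia.org/api/rest_v1/page/summary/Australia"
)

# urllib stores the header name as "X-analytics"; HTTP header names are
# case-insensitive, so this does not change what the server sees.
print(mw.header_items())
print(rb.header_items())
```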

Thanks, @bearND.

@Jdlrobson, @Nuria can explain further, but essentially they're okay with the client sending the header.

The moment more than one key-value pair starts to be sent for this header, a more deluxe solution will be required to ensure different client side authors don't trump each other's client-initiated header enrichment.

Nuria added a comment. · Mar 23 2016, 7:19 PM

Looping in @Tbayer here so he knows what to expect. Please do not make any code changes quite yet, as we are meeting to talk about this in the next couple of days.

Nuria added a comment. · Mar 23 2016, 7:20 PM

The moment more than one key-value pair starts to be sent for this header, a more deluxe solution will be required to ensure different client side authors don't trump each other's client-initiated header enrichment.

The header should be formed correctly if we are using the x-analytics extension. It is supposed to hold many keys/values.

Nuria added a comment. · Mar 24 2016, 5:44 PM

The X-Analytics header contains values like: "WMF-Last-Access=10-Jan-2016;pageview=1"
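A value of that shape can be split into its pairs with a few lines. The sketch below assumes only what the example shows (pairs separated by ";", key and value separated by "="); the function name is made up for illustration.

```python
def parse_x_analytics(value: str) -> dict[str, str]:
    """Split an X-Analytics value into a dict of key-value pairs."""
    pairs: dict[str, str] = {}
    for part in value.split(";"):
        key, sep, val = part.strip().partition("=")
        # Skip empty or malformed fragments that have no '='.
        if key and sep:
            pairs[key] = val
    return pairs


parsed = parse_x_analytics("WMF-Last-Access=10-Jan-2016;pageview=1")
print(parsed)  # {'WMF-Last-Access': '10-Jan-2016', 'pageview': '1'}
```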

Nuria renamed this task from Capture hovers as previews in analytics to Capture hovercards fetches as previews in analytics. · Mar 24 2016, 7:18 PM
Nuria added a comment. · Mar 24 2016, 7:35 PM

Let me try to clarify here the difference between what we have been calling "preview" requests and content consumption (cc @Tbayer)

Preview requests are not considered pageviews because they are meant to represent "content-user-has-not seen". In this sense "prefetch" would be a better term to classify them under. An application might prefetch pages to have them ready for the user to consume, but until the user has hovered over a link the content is not "consumed".

This "consumption" has to be marked with a request to the server in some fashion; otherwise there is no way for the system to tell content that was viewed apart from content that was only prefetched.

It could be that hovercards are implemented such that the request for the content is made when the user hovers. In that case the UI will be less performant, but there is no difference between content consumed and content fetched. If that is the case I am not sure tagging requests for hovercards with "preview" is the best idea.

Hopefully this makes sense. I think the title of the task and the description of it are perhaps at odds and that is why this is a bit confusing. Sorry about that.

It could be that hovercards are implemented such that the request for the content is made when the user hovers. In that case the UI will be less performant, but there is no difference between content consumed and content fetched.

This is the way it works.

If that is the case I am not sure tagging requests for hovercards with "preview" is the best idea.

Would you please elaborate? The Android app uses "preview" for link previews, which behave analogously.

Nuria added a comment. · Mar 28 2016, 5:27 PM

Would you please elaborate? The Android app uses "preview" for link previews, which behave analogously.

Indeed. You are correct and I am mistaken. I thought Android was doing prefetches but it is not.

Then, let's tag (using the x-analytics syntax) requests for hovercards with "preview".

Per our last meeting in this regard: previews will not be counted as pageviews; we want to keep overall metrics (pageviews) separate from feature metrics. In the case of hovercards, given its implementation, this distinction is even more important. Having to fetch resources on hover means that this implementation can probably use some performance improvements, and thus the implementation (if broadly used) might change.
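As a rough illustration of that separation, a classifier over the X-Analytics value alone might look like this. It is a sketch, not the actual pageview definition used in Hive (which looks at much more than this header); the function name, the "other" bucket, and the rule that preview wins when both keys are present are assumptions.

```python
def classify(x_analytics: str) -> str:
    """Bucket a request as 'preview', 'pageview' or 'other' from its X-Analytics value."""
    pairs = dict(
        part.strip().split("=", 1)
        for part in x_analytics.split(";")
        if "=" in part
    )
    # Assumption: a preview tag takes precedence, so previews never count as pageviews.
    if pairs.get("preview") == "1":
        return "preview"
    if pairs.get("pageview") == "1":
        return "pageview"
    return "other"


print(classify("preview=1"))
print(classify("WMF-Last-Access=10-Jan-2016;pageview=1"))
```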

Would you please elaborate? The Android app uses "preview" for link previews, which behave analogously.

Indeed. You are correct and I am mistaken. I thought Android was doing prefetches but it is not.
Then, let's tag (using the x-analytics syntax) requests for hovercards with "preview".
Per our last meeting in this regard:

Which meeting?

Previews will not be counted as pageviews; we want to keep overall metrics (pageviews) separate from feature metrics. In the case of hovercards, given its implementation, this distinction is even more important. Having to fetch resources on hover means that this implementation can probably use some performance improvements, and thus the implementation (if broadly used) might change.

dr0ptp4kt updated the task description. · Apr 4 2016, 4:20 PM
dr0ptp4kt updated the task description. · Apr 4 2016, 4:25 PM
dr0ptp4kt set the point value for this task to 2. · Apr 4 2016, 4:27 PM

@GWicke: We're hoping to use the RESTBase summary service as the primary backing service for Page-Previews. Does RESTBase's logging pipeline match the standard – I hesitate to use the word – one, i.e. forwarding x-analytics headers etc?

Prtksxna removed a subscriber: Prtksxna. · Apr 5 2016, 5:26 AM
GWicke added a comment. · Edited · Apr 11 2016, 7:06 PM

@phuedx, all accesses to RESTBase end points traverse text varnishes, which is where logging happens the same way as for other requests. These logs capture both cache hits and misses, which is important for a very cacheable end point like summary.

@phuedx, all accesses to RESTBase end points traverse text varnishes, which is where logging happens the same way as for other requests. These logs capture both cache hits and misses, which is important for a very cacheable end point like summary.

That's what I was hoping you'd say. Thanks @GWicke.

bmansurov added a comment. · Edited · Apr 13 2016, 6:11 PM

How should we treat the user's DoNotTrack setting in this context? I suppose the header will be sent only when DNT is false? Or maybe not as the data being sent is not about the user.

Change 283253 had a related patch set uploaded (by Bmansurov):
Add X-Analytics request header when fetching popup data

https://gerrit.wikimedia.org/r/283253

How should we treat the user's DoNotTrack setting in this context? I suppose the header will be sent only when DNT is false? Or maybe not as the data being sent is not about the user.

Right, I don't think a special case is warranted here. If in some future state we were to do something like tag in funnel information, then, yes, most definitely we'd need to consider DNT. But we're not at that point just yet.

Good question!

Nuria added a comment. · Apr 14 2016, 4:08 AM

How should we treat the user's DoNotTrack setting in this context? I suppose the header will be sent only when DNT is false? Or maybe not as the data being sent is not about the user.

Headers in your AJAX request are going to be sent regardless of DNT, but as @dr0ptp4kt said, API requests do not take DNT into account (yet).

FYI: if DNT is set, EventLogging data is not sent.

Change 283253 merged by jenkins-bot:
Add X-Analytics request header when fetching popup data

https://gerrit.wikimedia.org/r/283253

bmansurov removed bmansurov as the assignee of this task. · Apr 14 2016, 12:07 PM
bmansurov added a subscriber: bmansurov.

@dr0ptp4kt over to you for sign off.

phuedx assigned this task to dr0ptp4kt. · Apr 15 2016, 10:30 AM

… per the above.

@dr0ptp4kt any page, https://en.wikipedia.org/wiki/John_the_bookmaker_controversy for example. Enable the hovercards beta feature and hover over a link.

Using Developer Tools (Network) in Chrome and Live HTTP Headers in Firefox on the desktop, it didn't seem to show up for some reason. Do you happen to see the same?

GET /w/api.php?action=query&format=json&prop=info%7Cextracts%7Cpageimages%7Crevisions&formatversion=2&redirects=true&exintro=true&exsentences=5&explaintext=true&piprop=thumbnail&pithumbsize=300&rvprop=timestamp&titles=Australia_national_cricket_team&smaxage=300&maxage=300&uselang=content HTTP/1.1
Host: en.wikipedia.org
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Firefox/45.0
Accept: application/json, text/javascript, */*; q=0.01
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
X-Requested-With: XMLHttpRequest
Referer: https://en.wikipedia.org/wiki/Main_Page
Cookie: ...
Connection: keep-alive

@mbinder, when a task is dragged from the backlog into a sprint because the sprint is idling, is it supposed to get the tag "Unplanned-Sprint-Work"?

bmansurov added a comment. · Edited · Apr 16 2016, 10:12 PM

Sorry, I should have given a link to beta labs. Not in production yet.

Confirmed on latest stable Firefox, Chrome, Opera, and Safari desktop UAs client side.

Confirmed in the Apache logs on deployment-mediawiki01.deployment-prep.eqiad.wmflabs in the beta cluster. Notice preview=1.

2016-04-18T19:27:05	299604	<removed>	proxy-server/200	252	GET	http://en.wikipedia.beta.wmflabs.org/w/api.php?action=query&format=json&prop=info%7Cextracts%7Cpageimages%7Crevisions&formatversion=2&redirects=true&exintro=true&exsentences=5&explaintext=true&piprop=thumbnail&pithumbsize=300&rvprop=timestamp&titles=Golden_Gate_Bridge&smaxage=300&maxage=300&uselang=content	-	application/json	http://en.wikipedia.beta.wmflabs.org/wiki/San_Francisco	<removed>, 127.0.0.1	Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Firefox/45.0	en-US,en;q=0.5	preview=1	-	<removed>
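A check like this could be scripted along the following lines. This is a sketch under stated assumptions: the field position (index 13, zero-based, tab-separated) is read off the single sample line above and is not taken from any documented log format, the function name is made up, and the sample below shortens the query string for readability.

```python
# Assumption: zero-based position of the X-Analytics value, inferred from the sample line.
X_ANALYTICS_FIELD = 13


def x_analytics_from_log_line(line: str) -> str:
    """Return the X-Analytics field of a tab-separated log line, or '-' if absent."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) > X_ANALYTICS_FIELD:
        return fields[X_ANALYTICS_FIELD]
    return "-"


# Same layout as the line above, with the query string shortened.
sample = "\t".join([
    "2016-04-18T19:27:05", "299604", "<removed>", "proxy-server/200", "252", "GET",
    "http://en.wikipedia.beta.wmflabs.org/w/api.php?action=query&titles=Golden_Gate_Bridge",
    "-", "application/json",
    "http://en.wikipedia.beta.wmflabs.org/wiki/San_Francisco",
    "<removed>, 127.0.0.1",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Firefox/45.0",
    "en-US,en;q=0.5", "preview=1", "-", "<removed>",
])
print(x_analytics_from_log_line(sample))  # preview=1
```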

The next step should be to verify in Hive once this starts rolling on the train. With the scheduled backup failover scenario running this week, I believe that means we'll need to wait until next week to confirm, though.

dr0ptp4kt updated the task description. · Apr 19 2016, 5:09 PM
Restricted Application added a subscriber: TerraCodes. · Apr 19 2016, 5:09 PM
dr0ptp4kt closed this task as Resolved. · Apr 19 2016, 8:36 PM

Signing off.