Track number of article placeholder clicks from search results
Closed, Resolved · Public · 5 Story Points

Description

The goal is to see how many of our clicks come through Special:Search rather than via direct links, search engines (in the future), etc.

We could either track clicks on the links via JS, or track them server-side with a ref=search GET parameter. Analyzing the referrer data that is already available might also work.

hoo created this task. Aug 14 2016, 9:57 PM
Restricted Application added a subscriber: Aklapper. Aug 14 2016, 9:57 PM
Addshore set the point value for this task to 5.
Lucie moved this task from Incoming to To Do Next on the ArticlePlaceholder board. Aug 15 2016, 1:17 PM

@hoo @Lucie I just took a quick look at the webrequest data. Getting the numbers from it will work fine and would give us daily data. An example entry is shown below.

{"http_method":"GET","uri_host":"cy.wikipedia.org","uri_path":"/wiki/Arbennig:AboutTopic/Q432087","uri_query":"","referer":"https://cy.wikipedia.org/w/index.php?search=chris+ryan&title=Arbennig:Search&go=Go&searchToken=eo5w649tpumdso7dk6y9l06o5","x_analytics":"ns=-1;special=AboutTopic;loggedIn=1;WMF-Last-Access=15-Aug-2016;https=1"}

As we already do, we can identify hits to the AboutTopic special page using the 'special' element of the x_analytics header, which is 'AboutTopic' for the page in all languages (alongside an ns of -1).
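As a rough illustration of that check, here is a minimal Python sketch that splits an x_analytics string into key/value pairs and tests the two fields mentioned above. The function names are hypothetical and not part of any existing WMF tooling; the real pipeline may expose the header as an already-parsed map instead.

```python
def parse_x_analytics(header: str) -> dict[str, str]:
    """Split a 'key=value;key=value' x_analytics string into a dict."""
    pairs = (item.split("=", 1) for item in header.split(";") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def is_about_topic_hit(header: str) -> bool:
    """True when the header marks a hit on the AboutTopic special page."""
    fields = parse_x_analytics(header)
    # 'special' is the canonical page name, so it is the same in every language;
    # ns=-1 is the Special namespace.
    return fields.get("ns") == "-1" and fields.get("special") == "AboutTopic"


# Header value taken from the example entry above.
example = "ns=-1;special=AboutTopic;loggedIn=1;WMF-Last-Access=15-Aug-2016;https=1"
print(is_about_topic_hit(example))
```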

Matching Special:Search from the referrer will be slightly harder: we cannot match on the namespace name in the string, as it differs between languages (for example 'Arbennig:' on cy.wikipedia.org) and no normalisation is done (we could of course come up with some crazy normalisation).
The parameters in the URL, however, will always be there, so we could match referers that include a search= param and maybe a searchToken, and also make sure that the referer hostname is the same as the uri hostname.

Thoughts?

hoo added a comment. Aug 15 2016, 6:00 PM

Requiring that the referrer has the same host name and contains [?&]search, and maybe a title with a colon, sounds fine to me and would probably be robust against changes in Special:Search.
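The heuristic discussed in the two comments above could be sketched in Python roughly as follows. This is only an illustration under assumptions: came_from_search and the require_title flag are hypothetical names, and the optional title check simply looks for a title= value containing a colon (raw or percent-encoded).

```python
import re
from urllib.parse import urlsplit

# Matches a 'search' parameter at a parameter boundary of the query string,
# the equivalent of [?&]search= on the full URL.
SEARCH_PARAM = re.compile(r"(?:^|&)search=")
# Matches a title parameter whose value contains a colon, e.g. title=Arbennig:Search.
TITLE_WITH_COLON = re.compile(r"(?:^|&)title=[^&]*(?::|%3A)", re.IGNORECASE)


def came_from_search(uri_host: str, referer: str, require_title: bool = False) -> bool:
    """Guess whether a request was referred by the wiki's own search page."""
    parts = urlsplit(referer)
    # The referer must be on the same host as the requested page.
    if (parts.hostname or "").lower() != uri_host.lower():
        return False
    if not SEARCH_PARAM.search(parts.query):
        return False
    if require_title and not TITLE_WITH_COLON.search(parts.query):
        return False
    return True


# Referer taken from the example webrequest entry above.
referer = ("https://cy.wikipedia.org/w/index.php?search=chris+ryan"
           "&title=Arbennig:Search&go=Go&searchToken=eo5w649tpumdso7dk6y9l06o5")
print(came_from_search("cy.wikipedia.org", referer, require_title=True))
```

Anchoring the match at a parameter boundary avoids counting parameters that merely end in "search", which a plain substring match on search= would also accept.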

So something like this looks pretty solid.

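SELECT * FROM wmf.webrequest
-- Restrict to a single day of the partitioned table
WHERE year = 2016
    AND month = 8
    AND day = 20
    AND webrequest_source = 'text'
    -- Hits on the AboutTopic special page, in any language
    AND x_analytics_map["ns"] = '-1'
    AND x_analytics_map["special"] = 'AboutTopic'
    AND normalized_host.project_class = 'wikipedia'
    -- Referer carries a search= parameter and is a wikipedia.org URL
    AND referer rlike '^.*search=.*$'
    AND referer rlike '^.*wikipedia\.org.*$'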

Change 305989 had a related patch set uploaded (by Addshore):
WikidataArticlePlaceholderMetrics also send search referral data

https://gerrit.wikimedia.org/r/305989

Addshore moved this task from Backlog to In Progress on the User-Addshore board. Aug 22 2016, 11:52 AM
Lucie moved this task from To Do Next to Review on the ArticlePlaceholder board. Aug 23 2016, 10:56 AM
thiemowmde triaged this task as Normal priority. Sep 5 2016, 2:53 PM

Change 305989 merged by jenkins-bot:
WikidataArticlePlaceholderMetrics also send search referral data

https://gerrit.wikimedia.org/r/305989

This is now just pending a deployment.

Addshore closed this task as Resolved. Dec 5 2016, 11:33 AM
Addshore moved this task from Needs Review to Done / Closed on the User-Addshore board.
Addshore moved this task from Done to Demoed on the WMDE-QWERTY-Team-Board board. Dec 6 2016, 3:12 PM
hoo moved this task from Review to Done on the ArticlePlaceholder board. Dec 13 2016, 10:07 AM