
Compute average search result placement
Open, Low, Public

Description

The gsc data in Druid/Superset needs a post-aggregation metric that computes the AVG position (currently we only have MIN and MAX, corresponding to best and worst placement, respectively). The use case: when looking at a specific wiki across multiple countries, or at multiple wikis in a specific country, the average position is far more useful than min/max.

For reference, the JSON spec for computing CTR – SUM(clicks)/SUM(impressions) – is:

{
    "type"   : "arithmetic",
    "name"   : "ctr",
    "fn"     : "/",
    "fields" : [
       { "type" : "fieldAccess", "name" : "sum__clicks", "fieldName" : "sum__clicks" },
       { "type" : "fieldAccess", "name" : "sum__impressions", "fieldName" : "sum__impressions" }
    ]
}

Remark: the way to do this is to add an aggregation that sums the position metric and then divide that sum by the number of rows, which is already available as

{ "type" : "fieldAccess", "name" : "count", "fieldName" : "count" }

Note: calculating an exact median is not possible in Druid (only approximate quantiles via sketch-based aggregators).

Event Timeline

mpopov triaged this task as Medium priority. Aug 8 2018, 1:19 PM
mpopov created this task.
mpopov moved this task from Triage to Next Up on the Product-Analytics board.
mpopov lowered the priority of this task from Medium to Low. Dec 10 2018, 2:51 PM
mpopov moved this task from Next Up to Backlog on the Product-Analytics board.

Updating status to reflect reality.

kzimmerman moved this task from Backlog to Icebox on the Product-Analytics board.
kzimmerman subscribed.

This is not required for the T172581 epic, and we may not do this. Moving to Unprioritized Backlog and removing Mikhail.