
Search results page: how many visitors are on mobile vs desktop
Closed, Resolved · Public · 2 Story Points

Description

Let's take a look at our event logging over the last few months to determine how many of the visitors who enter a query and end up on the search engine results page (SERP) are using a mobile device versus a desktop.

This doesn't need to be a full analysis paper - just a brief look at the data, with the results posted in this ticket.

This will probably be useful: https://meta.wikimedia.org/wiki/Schema:MobileWebSearch

The results of this review of the data will impact what we're doing with the new search results page and auto-suggestion on mobile.

Event Timeline

debt created this task. Oct 5 2016, 8:26 PM
Restricted Application added a subscriber: Aklapper. Oct 5 2016, 8:26 PM
debt triaged this task as Normal priority. Oct 5 2016, 8:27 PM
debt updated the task description.
Deskana added a subscriber: Deskana. Oct 5 2016, 9:25 PM

This may also be possible to infer from the request logs, given that it's obvious from the URL whether the person is on mobile or not.
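
As a rough illustration of the URL-based approach, the sketch below labels a request by its hostname. It assumes that mobile web hostnames carry an `m` label (for example `en.m.wikipedia.org`); the function name `classify_platform` and the sample hostnames are hypothetical and are not taken from the request logs.

```r
# Hypothetical helper: label a hostname as "mobile web" or "desktop".
# Assumption: mobile hostnames contain an "m" label, e.g. "en.m.wikipedia.org".
classify_platform <- function(host) {
  # Split each hostname into its dot-separated labels.
  labels <- strsplit(tolower(host), ".", fixed = TRUE)
  is_mobile <- vapply(labels, function(x) "m" %in% x, logical(1))
  ifelse(is_mobile, "mobile web", "desktop")
}

# Made-up sample hostnames, for illustration only.
hosts <- c("en.wikipedia.org", "en.m.wikipedia.org", "m.wikidata.org")
platforms <- classify_platform(hosts)
print(data.frame(host = hosts, platform = platforms))
```

Matching on a whole label rather than on the substring "m." avoids misclassifying hostnames that merely contain the letter m before a dot.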

mpopov claimed this task. Oct 7 2016, 10:35 PM
mpopov set the point value for this task to 2.
mpopov moved this task from Backlog to In progress on the Discovery-Analysis (Current work) board.

Running the following Hive query for comparing desktop vs mobile web SERPs for the past 2 or so weeks:

-- Daily SERP requests and approximate unique users, by platform.
SELECT
  date, platform,
  COUNT(1) AS n_serps,
  COUNT(DISTINCT(uuid)) AS n_users
FROM (
  SELECT
    TO_DATE(ts) AS date,
    access_method AS platform,
    -- Approximate user identity: IP + user agent + Accept-Language.
    CONCAT(client_ip, user_agent, accept_language) AS uuid
  FROM wmf.webrequest
  WHERE
    webrequest_source IN('text')
    -- Sep 21 through Oct 6, 2016.
    AND year = 2016 AND ((month = 10 AND day < 7) OR (month = 9 AND day > 20))
    AND http_status IN('200', '304')
    -- Search requests of the form index.php?search=...
    AND INSTR(uri_path, 'index.php') > 0
    AND INSTR(uri_query, '?search=') > 0
) AS webrequest_subset
GROUP BY date, platform;

Will post some numbers and charts when it's done :)
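
To make the two aggregates in the query above concrete, here is a minimal sketch of the same counting logic in base R on a toy data frame. Every row is made up for illustration (the IPs are documentation addresses and the user agents are placeholders); only the column names mirror the query.

```r
# Toy rows standing in for the inner subquery's output (all values are made up).
webrequest_subset <- data.frame(
  date = rep("2016-10-01", 4),
  platform = c("desktop", "desktop", "desktop", "mobile web"),
  client_ip = c("192.0.2.1", "192.0.2.1", "192.0.2.2", "198.51.100.7"),
  user_agent = c("UA-A", "UA-A", "UA-A", "UA-B"),
  accept_language = c("en", "en", "en", "de"),
  stringsAsFactors = FALSE
)

# Same identity as the query: concatenate IP, user agent, and Accept-Language.
webrequest_subset$uuid <- with(
  webrequest_subset,
  paste0(client_ip, user_agent, accept_language)
)

# COUNT(1) and COUNT(DISTINCT uuid) per date and platform.
n_serps <- aggregate(uuid ~ date + platform, data = webrequest_subset,
                     FUN = length)
n_users <- aggregate(uuid ~ date + platform, data = webrequest_subset,
                     FUN = function(x) length(unique(x)))
names(n_serps)[3] <- "n_serps"
names(n_users)[3] <- "n_users"

counts <- merge(n_serps, n_users, by = c("date", "platform"))
print(counts)
```

In the toy data, the first two desktop rows share an IP, user agent, and language, so they count as two SERPs from one user. The same collapsing applies to real visitors who share all three values, which is why the user counts are approximate.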

debt added a subscriber: Jdrewniak. Oct 11 2016, 4:31 PM

Summaries

| platform   | median SERPs a day | median users a day |
|:-----------|-------------------:|-------------------:|
| desktop    |              2.38M |              1.08M |
| mobile web |              1.15M |            666.03K |
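
For a quick sense of the split, the sketch below turns the medians from the table above into rough shares. It treats each platform's median as a stand-in for a typical day; medians are not additive, so these are approximations rather than exact daily proportions. The data frame and column names are hypothetical.

```r
# Medians as reported in the summary table above.
summaries <- data.frame(
  platform = c("desktop", "mobile web"),
  median_serps = c(2.38e6, 1.15e6),
  median_users = c(1.08e6, 666.03e3)
)

# Rough shares: each platform's median divided by the sum of both medians.
summaries$serp_share <- summaries$median_serps / sum(summaries$median_serps)
summaries$user_share <- summaries$median_users / sum(summaries$median_users)

# Median SERPs divided by median users, as a rough searches-per-user figure.
summaries$serps_per_user <- summaries$median_serps / summaries$median_users

print(summaries)
```

On these figures, mobile web accounts for roughly 33% of SERPs and roughly 38% of search users.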

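Figures

[Figure: Number of search engine result pages (SERPs) by platform]
[Figure: Number of unique search users by platform]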

Data & Code

library(tidyverse)

requests <- read_csv("search_desktop-vs-mobile.csv", col_types = "Dcii")

requests %>%
  group_by(platform) %>%
  summarize(`median SERPs a day` = polloi::compress(median(n_serps)),
            `median users a day` = polloi::compress(median(n_users))) %>%
  knitr::kable(format = "markdown", align = c("l", "r", "r"))

ggplot(requests, aes(x = date, y = n_serps/1e6, color = platform)) +
  geom_line() +
  geom_point() +
  scale_x_date(date_labels = "%a\n%d %b", date_breaks = "2 days") +
  theme_minimal() +
  labs(x = "Date", y = "SERPs (in millions)",
       title = "Number of search engine result pages (SERPs) by platform") +
  theme(legend.position = "bottom")

ggplot(requests, aes(x = date, y = n_users, color = platform)) +
  geom_line() +
  geom_point() +
  scale_y_continuous(labels = polloi::compress, breaks = seq(0, 1.2e6, 1e5)) +
  scale_x_date(date_labels = "%a\n%d %b", date_breaks = "2 days") +
  theme_minimal() +
  labs(x = "Date", y = "Users",
       title = "Number of unique search users by platform",
       subtitle = "Identifying a user by their user agent + IP address + Accept-Language tuple") +
  theme(legend.position = "bottom")

debt closed this task as Resolved. Oct 12 2016, 7:23 PM

Thanks! :)