
Search results page: how many visitors are on mobile vs desktop
Closed, Resolved | Public | 2 Estimated Story Points

Description

Let's take a look at our event logging over the last few months to determine how many of the visitors who enter a query and end up on the search engine results page (SERP) are using a mobile device versus a desktop.

This doesn't need to be a full analysis paper - just take a brief look at the data and post the results in this ticket.

This will probably be useful: https://meta.wikimedia.org/wiki/Schema:MobileWebSearch

The results of this review of the data will impact what we're doing with the new search results page and auto-suggestion on mobile.

Event Timeline

debt triaged this task as Medium priority. Oct 5 2016, 8:27 PM
debt updated the task description.

This may also be possible to infer from the request logs, given that it's obvious from the URL whether the person is on mobile or not.
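
A minimal sketch of that idea, assuming the Wikimedia convention that mobile sites are served from an `m` subdomain (for example `en.m.wikipedia.org`). The function name `platform_from_url` is hypothetical and not part of any existing tooling:

```python
from urllib.parse import urlsplit


def platform_from_url(url: str) -> str:
    """Guess the platform from the hostname (assumption: mobile hosts carry an 'm' label)."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Ignore the last two labels (e.g. "wikipedia", "org") and look for an "m" subdomain.
    return "mobile web" if "m" in labels[:-2] else "desktop"


print(platform_from_url("https://en.m.wikipedia.org/w/index.php?search=cats"))
print(platform_from_url("https://en.wikipedia.org/w/index.php?search=cats"))
```

This only separates mobile web from desktop; requests from the apps would need a different signal.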

mpopov set the point value for this task to 2.
mpopov moved this task from Backlog to In progress on the Discovery-Analysis (Current work) board.

Running the following Hive query to compare desktop vs mobile web SERPs over roughly the past two weeks (21 September through 6 October 2016):

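SELECT
  date, platform,
  COUNT(1) AS n_serps,              -- SERP requests per day and platform
  COUNT(DISTINCT(uuid)) AS n_users  -- approximate unique users (see uuid below)
FROM (
  SELECT
    TO_DATE(ts) AS date,
    access_method AS platform,
    -- crude user fingerprint: IP + user agent + Accept-Language
    CONCAT(client_ip, user_agent, accept_language) AS uuid
  FROM wmf.webrequest
  WHERE
    webrequest_source IN('text')
    -- 21 September through 6 October 2016
    AND year = 2016 AND ((month = 10 AND day < 7) OR (month = 9 AND day > 20))
    AND http_status IN('200', '304')
    -- search requests: index.php with search as the first query parameter
    AND INSTR(uri_path, 'index.php') > 0
    AND INSTR(uri_query, '?search=') > 0
) AS webrequest_subset
GROUP BY date, platform;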

Will post some numbers and charts when it's done :)
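
For anyone who wants to sanity-check the query logic without Hive access, here is a rough Python sketch of the same filter and aggregation on a few made-up rows. The sample rows, the ISO-style `ts` strings, and the helper names `is_serp` and `summarize` are all assumptions for illustration; only the field names come from the query above.

```python
from collections import defaultdict


def is_serp(row):
    """Mirror the WHERE clause: successful index.php request whose query starts with ?search=."""
    return (row["http_status"] in {"200", "304"}
            and "index.php" in row["uri_path"]
            and "?search=" in row["uri_query"])


def summarize(rows):
    """Return {(date, platform): (n_serps, n_users)} like the GROUP BY in the query."""
    serps = defaultdict(int)
    users = defaultdict(set)
    for row in rows:
        if not is_serp(row):
            continue
        key = (row["ts"][:10], row["access_method"])  # assumes ts looks like "2016-10-01T08:00:00"
        serps[key] += 1
        # Same crude fingerprint as the query: IP + user agent + Accept-Language.
        users[key].add(row["client_ip"] + row["user_agent"] + row["accept_language"])
    return {key: (serps[key], len(users[key])) for key in serps}


def make_row(access_method, ip, ua, lang, status, path, query):
    return {"ts": "2016-10-01T08:00:00", "access_method": access_method,
            "client_ip": ip, "user_agent": ua, "accept_language": lang,
            "http_status": status, "uri_path": path, "uri_query": query}


# Made-up sample requests (documentation IP ranges, fake user agents).
rows = [
    make_row("desktop", "192.0.2.1", "UA-1", "en", "200", "/w/index.php", "?search=cats"),
    make_row("desktop", "192.0.2.1", "UA-1", "en", "304", "/w/index.php", "?search=dogs"),
    make_row("mobile web", "198.51.100.7", "UA-2", "de", "200", "/w/index.php", "?search=katzen"),
    make_row("desktop", "192.0.2.1", "UA-1", "en", "200", "/wiki/Cat", ""),
    make_row("desktop", "192.0.2.9", "UA-3", "en", "404", "/w/index.php", "?search=x"),
]

result = summarize(rows)
print(result)
```

On this sample the two desktop searches come from the same fingerprint, so desktop counts as 2 SERPs from 1 user. Users who share an IP, browser, and language setting collapse into one "user", so `n_users` is an approximation.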

Summaries

| platform   | median SERPs a day | median users a day |
|:-----------|-------------------:|-------------------:|
| desktop    |              2.38M |              1.08M |
| mobile web |              1.15M |            666.03K |
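
To put the medians in terms of the original question, the platform shares can be worked out with a few lines of Python. This is back-of-the-envelope arithmetic on the rounded medians above, not a new analysis: summing the two medians is only an approximation of the daily total, and `share` is a hypothetical helper.

```python
# Rounded daily medians copied from the summary table.
medians = {
    "desktop": {"serps": 2.38e6, "users": 1.08e6},
    "mobile web": {"serps": 1.15e6, "users": 666.03e3},
}


def share(metric, platform):
    """Fraction of the combined (desktop + mobile web) median taken by one platform."""
    total = sum(values[metric] for values in medians.values())
    return medians[platform][metric] / total


for platform in medians:
    print(f"{platform}: {share('serps', platform):.1%} of SERPs, "
          f"{share('users', platform):.1%} of users")
# desktop: 67.4% of SERPs, 61.9% of users
# mobile web: 32.6% of SERPs, 38.1% of users
```

So by this rough measure mobile web accounts for about a third of SERPs and a bit under two fifths of search users.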

Figures

n_serps.png (400×1 px, 55 KB)

n_users.png (400×1 px, 62 KB)

Data & Code

library(tidyverse)

requests <- read_csv("search_desktop-vs-mobile.csv", col_types = "Dcii")

requests %>%
  group_by(platform) %>%
  summarize(`median SERPs a day` = polloi::compress(median(n_serps)),
            `median users a day` = polloi::compress(median(n_users))) %>%
  knitr::kable(format = "markdown", align = c("l", "r", "r"))

ggplot(requests, aes(x = date, y = n_serps/1e6, color = platform)) +
  geom_line() +
  geom_point() +
  scale_x_date(date_labels = "%a\n%d %b", date_breaks = "2 days") +
  theme_minimal() +
  labs(x = "Date", y = "SERPs (in millions)",
       title = "Number of search engine result pages (SERPs) by platform") +
  theme(legend.position = "bottom")

ggplot(requests, aes(x = date, y = n_users, color = platform)) +
  geom_line() +
  geom_point() +
  scale_y_continuous(labels = polloi::compress, breaks = seq(0, 1.2e6, 1e5)) +
  scale_x_date(date_labels = "%a\n%d %b", date_breaks = "2 days") +
  theme_minimal() +
  labs(x = "Date", y = "Users",
       title = "Number of unique search users by platform",
       subtitle = "Identifying a user by their user agent + IP address + Accept-Language tuple") +
  theme(legend.position = "bottom")
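
For readers without R or the `polloi` package, the summary step can be approximated in Python. This sketch assumes `polloi::compress` abbreviates large numbers with K/M suffixes to two decimals, as the table output suggests; the `compress` function below is a stand-in written for illustration, and the daily counts are made up.

```python
from statistics import median


def compress(x, digits=2):
    """Abbreviate a number with a K/M/B/T suffix (stand-in for polloi::compress)."""
    for threshold, suffix in ((1e12, "T"), (1e9, "B"), (1e6, "M"), (1e3, "K")):
        if abs(x) >= threshold:
            return f"{round(x / threshold, digits):g}{suffix}"
    return f"{round(x, digits):g}"


# Made-up daily SERP counts per platform, standing in for the CSV.
daily_serps = {
    "desktop": [1_900_000, 2_500_000, 2_250_000],
    "mobile web": [980_000, 1_050_000, 875_500],
}

summary = {platform: compress(median(counts)) for platform, counts in daily_serps.items()}
print(summary)
```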