Maps Dashboard - add notation for increased tile usage
Closed, Resolved · Public

Description

The maps tiles summary dashboard (http://discovery.wmflabs.org/maps/#tiles_summary) shows a large spike in usage that started on Nov 8, 2016. We should add an annotation for it.

The spike appears to coincide with a Pokemon maps site (pkget.com) starting to use our tiles, at about the same time that it appears to have been blocked from using tile.osm.org map tiles.

debt created this task. Jan 5 2017, 11:17 PM
Restricted Application added a subscriber: Aklapper. Jan 5 2017, 11:17 PM

For reference, https://github.com/openstreetmap/chef/commit/dece06b6 is the commit that blocked pkget.com from osm.org, and the timestamp on that is Tue Nov 8 11:52:42 2016 +0000.

mpopov added a comment. Jan 6 2017, 7:33 PM

Yup, that checks out:

library(magrittr)

query <- "SELECT
  CONCAT('2016/11/', day) AS date,
  CASE
    WHEN INSTR(referer, 'pkget.com') > 0 THEN 'pkget'
    WHEN INSTR(referer, 'pkmtracker.com') > 0 THEN 'pkmtracker'
    WHEN INSTR(referer, 'worldpokemap.com') > 0 THEN 'worldpokemap'
    ELSE 'other'
    END AS referrer,
  COUNT(1) AS tiles
FROM wmf.webrequest
WHERE
  webrequest_source = 'maps'
  AND year = 2016 AND month = 11 AND day > 6 AND day < 12
  AND http_status IN('200','304')
  AND uri_path RLIKE '^/([^/]+)/([0-9]{1,2})/(-?[0-9]+)/(-?[0-9]+)(@([0-9]\\.?[0-9]?)x)?\\.([a-z]+)$'
  AND uri_query <> '?loadtesting'
GROUP BY
  CONCAT('2016/11/', day),
  CASE
    WHEN INSTR(referer, 'pkget.com') > 0 THEN 'pkget'
    WHEN INSTR(referer, 'pkmtracker.com') > 0 THEN 'pkmtracker'
    WHEN INSTR(referer, 'worldpokemap.com') > 0 THEN 'worldpokemap'
    ELSE 'other'
    END
ORDER BY date ASC, tiles DESC
LIMIT 100;"

# Read the query results from a local CSV export.
x <- readr::read_csv("~/Downloads/query_result.csv")
x$date <- lubridate::ymd(x$date)
y <- x %>%
  dplyr::group_by(date) %>%
  dplyr::mutate(prop = tiles/sum(tiles)) %>%
  dplyr::ungroup() %>%
  dplyr::select(-tiles) %>%
  tidyr::spread(referrer, prop, fill = 0) %>%
  dplyr::mutate(
    other_y = (1-pkget)/2 + pkget,
    pkget_y = pkget/2
  ) %>%
  dplyr::select(-c(other, pkget)) %>%
  dplyr::rename(other = other_y, pkget = pkget_y) %>%
  tidyr::gather("referrer", "y", -date) %>%
  dplyr::left_join(x, by = c("date", "referrer"))

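# Blank out the literal label text "NANA tiles", which presumably appears
# when a date/referrer pair has no tile count (NA) after the join.
foo <- function(x) {
  return(sub("NANA tiles", "", x, fixed = TRUE))
}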

library(ggplot2)
ggplot(x, aes(x = date, y = tiles, fill = referrer)) +
  geom_bar(position = "fill", stat = "identity") +
  scale_y_continuous("Proportion of tiles served") +
  theme_minimal() +
  scale_fill_brewer(palette = "Set1") +
  labs(title = "Kartotherian usage by Pokemon Go-related site Pkget",
       subtitle = "Pkget was blocked from OpenStreetMap on November 8th") +
  theme(legend.position = "bottom") +
  geom_text(aes(x = date, y = y, label = foo(paste(polloi::compress(tiles), "tiles"))), data = y, color = "white")
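For anyone following the arithmetic, here is a minimal base-R sketch of the two steps the code above relies on: bucketing referers the way the CASE expression does, and placing each label at the midpoint of its segment in a bar scaled to 1. The referer strings, the counts they produce, and the helper name classify_referrer are all made up for illustration and are not taken from wmf.webrequest. The stacking order (alphabetically first referrer on top) is an assumption, chosen because it agrees with the other_y and pkget_y formulas above.

```r
# Hypothetical referer values; these are NOT real request data.
referers <- c(
  "https://pkget.com/",
  "https://pkget.com/map",
  "https://en.wikipedia.org/wiki/Paris",
  NA,
  "https://www.worldpokemap.com/"
)

# Hypothetical helper mirroring the CASE expression: the first matching
# substring wins, and anything else (including NA) becomes "other".
classify_referrer <- function(referer) {
  sites <- c(
    pkget = "pkget.com",
    pkmtracker = "pkmtracker.com",
    worldpokemap = "worldpokemap.com"
  )
  out <- rep("other", length(referer))
  # Apply in reverse so earlier entries in `sites` overwrite later ones.
  for (label in rev(names(sites))) {
    hit <- !is.na(referer) & grepl(sites[[label]], referer, fixed = TRUE)
    out[hit] <- label
  }
  out
}

labels <- classify_referrer(referers)

# Tile counts per referrer bucket (names come out in alphabetical order).
tiles <- vapply(split(labels, labels), length, integer(1))

# Share of all tiles per bucket, as position = "fill" would draw it.
prop <- tiles / sum(tiles)

# Assumed stacking: alphabetically first bucket on top, so read bottom-up.
bottom_up <- rev(prop)

# Midpoint of each segment = top edge of the segment minus half its height.
label_y <- cumsum(bottom_up) - bottom_up / 2

print(prop)
print(label_y)
```

With only two buckets this reduces to the formulas used above: pkget / 2 for the bottom segment and pkget + (1 - pkget) / 2 for the one on top.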

Change 330980 had a related patch set uploaded (by Bearloga):
Note pkget tile usage

https://gerrit.wikimedia.org/r/330980

Change 330980 merged by Bearloga:
Note pkget tile usage

https://gerrit.wikimedia.org/r/330980

Change 330981 had a related patch set uploaded (by Bearloga):
Add pkget tile usage note to Maps dash

https://gerrit.wikimedia.org/r/330981

Change 330981 merged by Bearloga:
Add pkget tile usage note to Maps dash

https://gerrit.wikimedia.org/r/330981

debt closed this task as "Resolved". Jan 9 2017, 7:08 PM
debt claimed this task.

Thanks, @mpopov and @Pnorman!