We need a way to measure the usage and impact of <graph> tags. A graph can be static (image only) or interactive (the user clicks a "play" button, at which point the page loads the JavaScript and all the needed data).
Request pattern:
^/api/rest_v1/page/graph/([^/]+)/([^/]+)/[^/]+/([^/]+)$
$1 = format (only png for now)
$2 = page name -- hopefully we can remove this later
$3 = graph id (the path segment between $2 and $3 is matched but not captured)
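The pattern can be checked against a sample path before it goes into a query. Below is a minimal Python sketch; the function name `parse_graph_path` and the example paths are made up for illustration and are not taken from real traffic.

```python
import re

# Same request pattern; the third path segment is matched but not captured.
GRAPH_PATH = re.compile(r'^/api/rest_v1/page/graph/([^/]+)/([^/]+)/[^/]+/([^/]+)$')


def parse_graph_path(uri_path):
    """Return (format, page name, graph id) for a graph request path, else None."""
    m = GRAPH_PATH.match(uri_path)
    return m.groups() if m else None


# Hypothetical paths, for illustration only.
print(parse_graph_path('/api/rest_v1/page/graph/png/Some_Page/12345/abcdef.png'))
print(parse_graph_path('/api/rest_v1/page/html/Some_Page'))
```

A path with too few segments, or one outside /page/graph/, yields None, which mirrors what the RLIKE filter would drop.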
Interesting information:
- good requests vs bad requests (200/304 vs others)
- per language/project usage (en.wikipedia, fr.wiktionary, etc.)
- "distinct users" count? (distinct IP + user agent is probably a good enough approximation)
- distinct graphs count (unique hash per language+project)
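To make the four metrics concrete, here is a rough Python sketch that computes them over a handful of made-up request records. The field names (`project`, `status`, `ip`, `user_agent`, `graph_id`), the `summarize` helper, and the data are all assumptions for illustration; the real numbers would have to come from the request logs.

```python
from collections import Counter

# Made-up request records, for illustration only.
requests = [
    {'project': 'en.wikipedia', 'status': '200', 'ip': '10.0.0.1', 'user_agent': 'A', 'graph_id': 'h1'},
    {'project': 'en.wikipedia', 'status': '304', 'ip': '10.0.0.1', 'user_agent': 'A', 'graph_id': 'h1'},
    {'project': 'en.wikipedia', 'status': '404', 'ip': '10.0.0.2', 'user_agent': 'B', 'graph_id': 'h2'},
    {'project': 'fr.wiktionary', 'status': '200', 'ip': '10.0.0.2', 'user_agent': 'B', 'graph_id': 'h1'},
]

GOOD_STATUSES = {'200', '304'}


def summarize(records):
    """Compute the four metrics listed above for a list of request records."""
    good_bad = Counter('good' if r['status'] in GOOD_STATUSES else 'bad' for r in records)
    per_project = Counter(r['project'] for r in records)
    # "Distinct users" approximated as distinct (ip, user agent) pairs.
    users = {(r['ip'], r['user_agent']) for r in records}
    # A graph is counted once per project, so the same id on two projects counts twice.
    graphs = {(r['project'], r['graph_id']) for r in records}
    return {
        'good': good_bad['good'],
        'bad': good_bad['bad'],
        'per_project': dict(per_project),
        'distinct_users': len(users),
        'distinct_graphs': len(graphs),
    }


print(summarize(requests))
```

The (ip, user agent) pair undercounts users behind a shared address with identical browsers and overcounts users who change networks, so it is only an approximation.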
Sample Hive query:
SELECT project_class, project, hash, COUNT(1) AS n
FROM (
    SELECT
        normalized_host.project_class AS project_class,
        normalized_host.project AS project,
        REGEXP_EXTRACT(uri_path, '^/api/rest_v1/page/graph/([^/]+)/([^/]+)/[^/]+/([^/]+)$', 3) AS hash
    FROM wmf.webrequest
    WHERE
        year = 2015 AND month = 12 AND day = 20 AND hour = 5
        AND webrequest_source = 'text'
        AND http_status IN ('200', '304')
        AND uri_path RLIKE '^/api/rest_v1/page/graph/([^/]+)/([^/]+)/[^/]+/([^/]+)$'
) prepared
GROUP BY project_class, project, hash;