
Add <graph> usage to Grafana
Open, Needs Triage, Public

Description

We need a way to view the usage and impact of <graph> tags. A graph can be either static (image only) or interactive (the user clicks a "play" button, at which point the page loads the JavaScript and all the needed data).

Request pattern:
^/api/rest_v1/page/graph/([^/]+)/([^/]+)/[^/]+/([^/]+)$
$1 = format (only png for now)
$2 = page name -- hopefully we can remove this later
$3 = graph id
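
As a rough illustration of how the three capture groups come out of the pattern above, here is a minimal Python sketch. The function name and the sample paths are made up for the example; only the regular expression is taken from the description.

```python
import re

# Same pattern as in the description. The segment between the page name
# and the graph id is matched but not captured.
GRAPH_PATH = re.compile(
    r"^/api/rest_v1/page/graph/([^/]+)/([^/]+)/[^/]+/([^/]+)$"
)


def parse_graph_path(uri_path):
    """Return (format, page_name, graph_id), or None if the path does not match."""
    m = GRAPH_PATH.match(uri_path)
    return m.groups() if m else None


if __name__ == "__main__":
    # Hypothetical path, for illustration only.
    print(parse_graph_path("/api/rest_v1/page/graph/png/Example/12345/abcdef"))
```

Paths with too few or too many segments return None, so the same helper can be used to filter graph requests out of a mixed stream.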

Interesting information:

  • good requests vs bad requests (200/304 vs others)
  • per language/project usage (en.wikipedia, fr.wiktionary, etc)
  • "distinct users" count? (distinct ip+user agent is a good enough approximation I guess)
  • distinct graphs count (unique hash per language+project)
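
To make the four metrics concrete, here is a small Python sketch that computes them over an in-memory list of request records. The record keys loosely mirror the columns used in the Hive query; `ip` and `user_agent` are assumed field names, and the (ip, user agent) pair is only the rough approximation of "distinct users" proposed in the list above.

```python
import re
from collections import Counter

GRAPH_PATH = re.compile(
    r"^/api/rest_v1/page/graph/([^/]+)/([^/]+)/[^/]+/([^/]+)$"
)


def summarize(requests):
    """Summarize graph requests from an iterable of dicts.

    Each dict is assumed to carry: project_class, project, uri_path,
    http_status, ip, user_agent (field names are assumptions).
    """
    good = bad = 0
    per_project = Counter()
    clients = set()
    graphs = set()
    for r in requests:
        m = GRAPH_PATH.match(r["uri_path"])
        if not m:
            continue  # not a graph request
        if r["http_status"] not in ("200", "304"):
            bad += 1
            continue
        good += 1
        site = (r["project"], r["project_class"])
        per_project[site] += 1
        # Approximation only: one (ip, user agent) pair per "user".
        clients.add((r["ip"], r["user_agent"]))
        # A graph is identified by its id within one language + project.
        graphs.add(site + (m.group(3),))
    return {
        "good": good,
        "bad": bad,
        "per_project": dict(per_project),
        "distinct_clients": len(clients),
        "distinct_graphs": len(graphs),
    }
```

Only good requests feed the per-project, client, and graph counts here; whether failed requests should count toward usage is a design choice left open.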

Sample hive query:

SELECT project_class, project, hash, COUNT(1) AS n
FROM (
    SELECT
        normalized_host.project_class AS project_class,
        normalized_host.project AS project,
        REGEXP_EXTRACT(uri_path, '^/api/rest_v1/page/graph/([^/]+)/([^/]+)/[^/]+/([^/]+)$', 3) AS hash
    FROM wmf.webrequest
    WHERE
        year=2015 AND month=12 AND day=20 AND hour=5
        AND webrequest_source = 'text'
        AND http_status IN('200','304')
        AND uri_path RLIKE '^/api/rest_v1/page/graph/([^/]+)/([^/]+)/[^/]+/([^/]+)$'
    ) prepared
GROUP BY project_class, project, hash;

Event Timeline

Yurik raised the priority of this task from to Needs Triage.
Yurik updated the task description.
Yurik added subscribers: Yurik, madhuvishy.
Yurik set Security to None.
Nuria added a subscriber: Nuria.
  • "distinct users" count? (distinct ip+user agent is a good enough approximation I guess)

This is not correct for the most part.

Rather than relying on a regex, can't you add something like 'graph pageview' to the X-Analytics header in the extension? Those requests would then come in already tagged, and we would just need to pull them out.

@Nuria, we'll need more general path matching in any case, so that we can keep tabs on which cached REST API endpoints get used.

I do have some code that turns a Swagger spec into a regexp matcher. Generating the matches from a spec avoids overlaps / over-counting & the need to manually maintain regexps in sync with specs.
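
The code mentioned above is not shown in this thread. Purely as an illustration of the idea, a minimal Python sketch of turning Swagger-style path templates into regex matchers might look like the following. The function names and the example templates (including the placeholder names) are hypothetical and not taken from the actual REST API spec.

```python
import re


def template_to_regex(template):
    """Compile a path template such as '/page/html/{title}' into a regex.

    Each {placeholder} becomes a capture group matching one path segment;
    everything else is matched literally.
    """
    parts = re.split(r"(\{[^/{}]+\})", template)
    pattern = "".join(
        "([^/]+)" if part.startswith("{") else re.escape(part)
        for part in parts
    )
    return re.compile("^" + pattern + "$")


def build_matcher(templates):
    """Return a function mapping a request path to the first matching template."""
    compiled = [(t, template_to_regex(t)) for t in templates]

    def match(path):
        for template, regex in compiled:
            if regex.match(path):
                return template
        return None  # no known endpoint

    return match


# Hypothetical templates, for illustration only.
match_endpoint = build_matcher([
    "/page/html/{title}",
    "/page/graph/{format}/{title}/{segment}/{graph_id}",
])
```

Because each path resolves to at most one template, a request is counted once, which is the over-counting concern raised above.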

Edit: Created T122245: REST API entry point web request statistics at the Varnish level for REST API request metrics.

So there are two cases: a static graph (an image generated by the Graphoid service and requested via an <img src='...'> tag), and a live graph (the user clicks on the static graph to make it interactive). For the static case, we will have to use the above regex. That will tell us how many "graph pageviews" people had. For the second case, an XHR request is made to get the graph specification via the graph API. I could add an X-Analytics: graph=play header to that request.
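
For the tagged case, picking out the requests on the analytics side could look roughly like the sketch below. It assumes the X-Analytics value is a semicolon-separated list of key=value pairs; that format, the helper names, and the sample header values are assumptions, not something confirmed in this thread.

```python
def parse_x_analytics(header):
    """Parse an X-Analytics value such as 'https=1;graph=play' into a dict.

    Assumes semicolon-separated key=value pairs; items without '=' are skipped.
    """
    pairs = {}
    for item in header.split(";"):
        key, sep, value = item.strip().partition("=")
        if sep and key:
            pairs[key] = value
    return pairs


def is_graph_play(header):
    """True if the request was tagged with graph=play."""
    return parse_x_analytics(header).get("graph") == "play"
```

With a tag like this, counting interactive loads becomes a simple filter on the parsed header rather than a match on the request path.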

@Ottomata suggested we use the statsv approach to report usage. I can still add some X-Analytics value to the XHR request for additional counting.

Python code to get eventlogging_GeoFeatures stats - https://gist.github.com/ottomata/f67e25f8c57b3c20cebd