Currently the Spark job sending RESTBase data to Graphite only counts /api/rest_v1/% patterns.
This task is to add the /w/api.php% pattern as another datapoint.
This metric would perfectly complement the REST equivalent in https://grafana.wikimedia.org/dashboard/db/api-summary?orgId=1, and as a result give us direct information on overall API use.
Check which webrequest_source partition the data lives in (and, in the meantime, check api/rest too):
    SELECT
        webrequest_source,
        (uri_path LIKE '/api/rest_v1/%') AS rest,
        (uri_path LIKE '/w/api.php%') AS action,
        COUNT(1) AS c
    FROM wmf.webrequest
    WHERE year = 2017 AND month = 11 AND day = 20 AND hour = 17
    GROUP BY webrequest_source, (uri_path LIKE '/api/rest_v1/%'), (uri_path LIKE '/w/api.php%')
    ORDER BY webrequest_source, rest, action
    LIMIT 1000;

    webrequest_source  rest   action  c
    misc               false  false   799279
    misc               false  true    3
    text               false  false   201987329
    text               false  true    33248371
    text               true   false   41854561
    upload             false  false   209598372
    upload             false  true    109
    upload             true   false   5
As expected, we should only consider the text partition: nearly all REST and action API requests land there.
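To make the classification concrete, here is a minimal Python sketch of the same prefix matching restricted to the text partition. It is not the actual RestbaseMetrics job code; the function names and the `restbase` / `mw_api` labels are illustrative assumptions, and the SQL `LIKE 'prefix%'` tests are expressed as `startswith` checks.

```python
from collections import Counter

# Prefixes equivalent to the LIKE patterns in the query above.
REST_PREFIX = "/api/rest_v1/"  # LIKE '/api/rest_v1/%'
ACTION_PREFIX = "/w/api.php"   # LIKE '/w/api.php%'


def classify(webrequest_source, uri_path):
    """Return a label for API requests on the text partition, else None.

    The labels are hypothetical names chosen for this sketch.
    """
    if webrequest_source != "text":
        return None
    if uri_path.startswith(REST_PREFIX):
        return "restbase"
    if uri_path.startswith(ACTION_PREFIX):
        return "mw_api"
    return None


def count_api_requests(rows):
    """Count REST and action API requests in (webrequest_source, uri_path) rows."""
    counts = Counter()
    for source, path in rows:
        label = classify(source, path)
        if label is not None:
            counts[label] += 1
    return counts


# Made-up sample rows, for illustration only.
sample_rows = [
    ("text", "/api/rest_v1/page/summary/Foo"),
    ("text", "/w/api.php"),
    ("text", "/wiki/Main_Page"),
    ("upload", "/w/api.php"),
]
print(count_api_requests(sample_rows))
```

Keeping the two labels separate mirrors the goal of reporting one datapoint per API rather than a combined count.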
s/varnish_request/varnish_requests/ but otherwise the proposal looks sensible to me, +1.
Change 392700 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery/source@master] Grow RestbaseMetrics spark job to count MW API
Change 392703 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery@master] Change restbase job to also count MW-API requests
I'd recommend against this. The MediaWiki. prefix is prepended by MediaWiki core to all its statsd messages. I would not expect it to be used by other metric writers. Doing so would make it more likely to cause conflicts, and would also make it more difficult to find or document the source of certain metrics. Perhaps use a prefix relating to the source of the data (e.g. Spark, Oozie, Refinery, or some such).
As far as I am aware, for those with access to the graphite box it is fairly trivial to move and merge metrics.
"analytics.varnish_requests.restbase and analytics.varnish_requests.mw_api" look the best to me.
I don't imagine it would be too hard to update these graphs.
If it is, I would vote for leaving the restbase stuff where it is for now but having the mw_api stuff @ "analytics.varnish_requests.mw_api" and keeping any new stuff there.
I'm a bit confused here. Is this task about adding a count for the action API, turning an existing restbase-only count into a restbase+action API count, or adding a restbase+action API count in addition to an existing restbase-only count?
I think the idea is to add a count for requests to action API on varnish level. So in the end we will have 2 separate metrics - one for action API and one for rest API.
@Pchelolo is correct, the idea is to have both restbase and mw-action-api hourly varnish-requests counts in graphite.
The last patch uses analytics.mw_api.varnish_request as the metric name for action-api.
Thanks @Krinkle for the review :)
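For readers unfamiliar with how an hourly count ends up under such a metric name, the sketch below formats one datapoint in Graphite's plaintext format (`<metric path> <value> <timestamp>`). It assumes the plaintext protocol purely for illustration; how the refinery job actually ships metrics is not shown in this thread. The helper name `graphite_line` is hypothetical, and the value is the text-partition action API count from the query above.

```python
from datetime import datetime, timezone


def graphite_line(metric, value, when):
    """Format one datapoint as a Graphite plaintext line (without newline)."""
    timestamp = int(when.timestamp())  # seconds since the Unix epoch
    return f"{metric} {value} {timestamp}"


# Hour covered by the query above: 2017-11-20 17:00 UTC.
hour = datetime(2017, 11, 20, 17, tzinfo=timezone.utc)
line = graphite_line("analytics.mw_api.varnish_request", 33248371, hour)
print(line)
```

A sender would append a newline to each such line before writing it to the Graphite listener.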
Change 392703 merged by Joal:
[analytics/refinery@master] Change restbase job to also count mw_api requests
Change 392700 merged by jenkins-bot:
[analytics/refinery/source@master] Grow RestbaseMetrics spark job to count MW API
I guess that has been done since I was able to add the Action API graph to the API summary dashboard: https://grafana.wikimedia.org/dashboard/db/api-summary?panelId=1&fullscreen&orgId=1
Should we resolve the ticket?
@Pchelolo: It has indeed happened.
The task has been moved to done on our kanban; we'll resolve it after we finalize the discussion :)
Thanks !