Create a metric for overall RESTBase request rates from Varnish logs {hawk} [13 pts]
Closed, Resolved · Public

Description

Right now we don't allow Varnishes to cache any content, but we plan to start allowing this soon. At that point, internal RESTBase metrics like http://grafana.wikimedia.org/#/dashboard/db/restbase?panelId=8&fullscreen will only show the cache misses. For our purposes it would be super useful to keep track of total requests matching /api/rest_v1/. This will let us track overall API usage, which is going to be our primary KPI for now.

Event Timeline

GWicke raised the priority of this task to Needs Triage.
GWicke updated the task description.
GWicke added projects: Analytics, RESTBase.
GWicke added a subscriber: GWicke.
Restricted Application added a subscriber: Aklapper. · Aug 19 2015, 1:54 AM
kevinator set Security to None.
kevinator triaged this task as High priority. · Aug 20 2015, 5:16 PM
kevinator moved this task from Incoming to Prioritized on the Analytics-Backlog board.
mforns renamed this task from Create a metric for overall RESTBase request rates from Varnish logs to Create a metric for overall RESTBase request rates from Varnish logs [8 pts]. · Aug 24 2015, 4:19 PM
mforns renamed this task from Create a metric for overall RESTBase request rates from Varnish logs [8 pts] to Create a metric for overall RESTBase request rates from Varnish logs {hawk} [13 pts].
mforns moved this task from Prioritized to Tasked on the Analytics-Backlog board.

Couple of questions:

  • What time granularity do the metrics need: hourly, daily, monthly, etc.?
  • We should check where /api/rest_v1/ requests show up in webrequest: only in mobile or text, or anywhere else as well.

  • What time granularity do the metrics need: hourly, daily, monthly, etc.?

Live graphs & selectable time scales similar to Grafana would be awesome, but my understanding is that the analytics infrastructure is better set up for batch analysis. Daily would be good enough to let us track the effect of activating new endpoints & clients. If hourly is possible without much extra effort, that would be great too.

  • We should check where /api/rest_v1/ requests show up in webrequest: only in mobile or text, or anywhere else as well.

RESTBase is behind the text Varnishes only at this point. I'm not 100% sure that all requests to those Varnishes are logged (if not, we might have to tweak the config), but if they are, they will show up under text.
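
For illustration, the count discussed above (requests under /api/rest_v1/ on the text caches, bucketed per hour) could be sketched roughly as below. This is not the code from any actual patch: the record layout (`webrequest_source`, `uri_path`, `dt`) and the sample rows are assumptions made up for the example, and a real job would run over the webrequest data in a batch framework rather than a Python list.

```python
from collections import Counter
from datetime import datetime

REST_PREFIX = "/api/rest_v1/"  # path prefix named in the task description


def hourly_restbase_counts(records):
    """Count text-cache requests under REST_PREFIX, keyed by the start of each hour."""
    counts = Counter()
    for rec in records:
        # Only the text caches front RESTBase, per the comment above.
        if rec["webrequest_source"] != "text":
            continue
        if not rec["uri_path"].startswith(REST_PREFIX):
            continue
        # Truncate the timestamp to the start of its hour.
        ts = datetime.strptime(rec["dt"], "%Y-%m-%dT%H:%M:%S")
        counts[ts.replace(minute=0, second=0)] += 1
    return counts


# Hypothetical sample records, invented for this sketch.
sample = [
    {"webrequest_source": "text", "uri_path": "/api/rest_v1/page/html/Foo", "dt": "2015-08-01T00:12:03"},
    {"webrequest_source": "text", "uri_path": "/api/rest_v1/page/html/Bar", "dt": "2015-08-01T00:48:59"},
    {"webrequest_source": "text", "uri_path": "/wiki/Foo", "dt": "2015-08-01T00:50:00"},
    {"webrequest_source": "mobile", "uri_path": "/api/rest_v1/page/html/Foo", "dt": "2015-08-01T01:05:00"},
    {"webrequest_source": "text", "uri_path": "/api/rest_v1/page/html/Foo", "dt": "2015-08-01T01:30:00"},
]

counts = hourly_restbase_counts(sample)
```

Daily totals, if preferred, can be obtained by summing the hourly buckets per date.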

Change 234453 had a related patch set uploaded (by Madhuvishy):
[WIP] Report RESTBase traffic metrics to Graphite

https://gerrit.wikimedia.org/r/234453

@madhuvishy: Did you see any RESTBase requests in the current request logs? I'm not 100% certain that our Varnish setup does indeed send those as well. If they are missing, we might have to tweak the Varnish config a bit.

@GWicke Yes! I'm actually plotting a graph in Graphite for today. I'll post it in a bit.

@madhuvishy: Oh, awesome! Thanks a bunch.

@GWicke - Check out graphite.wikimedia.org - test.restbase.requests. I plotted the first 12 hours of today. (Graph Options -> Line Mode -> Connected Line gives a clearer picture.)

If you haven't seen the patch: we calculate hourly request counts and send them directly to Graphite from Spark. We skip statsd because it won't accept a timestamp and is meant only for real-time stats. This seems to work fine.

Bypassing statsd means we won't get the derived metrics (count, min, max, etc.) that you otherwise see in Graphite. Let me know if this approach works for you. We also need to fix the name of the metric: once this is productionized it should go into the restbase namespace. You can pick a name if restbase.requests won't work.
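
As a rough sketch of what "sending directly to Graphite with a timestamp" means: Graphite's plaintext protocol takes one line per data point in the form `<metric path> <value> <epoch seconds>`, conventionally on port 2003. The snippet below is an illustration only, not the merged patch (which runs in Spark); the host name and the value 1234 are hypothetical, and the metric name is the test one mentioned above.

```python
import socket
from datetime import datetime, timezone


def graphite_line(metric, value, hour_utc):
    """Format one plaintext-protocol line: '<path> <value> <epoch seconds>\\n'."""
    # Treat the naive datetime as UTC so the epoch does not depend on the local timezone.
    epoch = int(hour_utc.replace(tzinfo=timezone.utc).timestamp())
    return f"{metric} {value} {epoch}\n"


def send_to_graphite(lines, host, port=2003):
    """Send pre-formatted lines over TCP; 2003 is Graphite's usual plaintext port."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall("".join(lines).encode("ascii"))


# Hypothetical data point: 1234 requests in the hour starting 2015-08-01 00:00 UTC.
line = graphite_line("test.restbase.requests", 1234, datetime(2015, 8, 1, 0))

# Not executed here; "graphite.example.org" is a placeholder host.
# send_to_graphite([line], "graphite.example.org")
```

Because each line carries its own timestamp, past hours can be backfilled, which is what statsd does not allow.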

Very nice! I just verified the request rates and found that our main third-party consumers (Kiwix and Googlebot) are indeed still hitting rest.wikimedia.org, which is served by two different Varnishes that don't report to the text logs. I'll tell them to switch.

Thanks again!

Change 234453 merged by Joal:
Report RESTBase traffic metrics to Graphite

https://gerrit.wikimedia.org/r/234453

This is done, and the job is scheduled in production. Graphite will be updated with hourly numbers starting from Aug 1 2015; they can be seen on graphite.wikimedia.org under restbase.requests.varnish_requests.