Display maps server-side usage metrics on maps dashboard
Closed, Resolved · Public · 2 Story Points

Description

In addition to the client-side events recorded in the logs, we need to know how much actual server usage the maps service consumes. For that, we need to visualize Varnish logs. The simplest metrics would be:

  • total tile requests
  • tile requests per style, e.g. "osm", "osm-intl", ...
  • tile requests per style per zoom, e.g. "osm-z10", "osm-z11", ...
  • counts per referer header (we could show per domain or even per wiki article if available, or per tool)

This will give us a clear picture of current usage and a better basis for planning ahead.
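The metrics listed above could be computed from request logs along these lines. This is only a minimal sketch, not the script that was eventually written for this task: the tile path layout (`/<style>/<zoom>/<x>/<y>.<ext>`, with an optional `@2x`-style scale suffix), the `summarize` function, and the `(url, referer)` input shape are all assumptions made for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed tile path layout: /<style>/<zoom>/<x>/<y>[@<scale>x].<ext>
TILE_PATH = re.compile(
    r"^/(?P<style>[A-Za-z0-9-]+)/(?P<zoom>\d+)/\d+/\d+(?:@[\d.]+x)?\.\w+$"
)


def summarize(requests):
    """Aggregate tile requests.

    requests: iterable of (url, referer) pairs; referer may be None,
    empty, or a placeholder such as "-".
    Returns (total, per_style, per_style_zoom, per_referer_domain).
    """
    total = 0
    per_style = Counter()
    per_style_zoom = Counter()
    per_referer_domain = Counter()
    for url, referer in requests:
        match = TILE_PATH.match(urlsplit(url).path)
        if match is None:
            continue  # not a tile request under the assumed layout
        total += 1
        style = match.group("style")
        per_style[style] += 1
        per_style_zoom[f"{style}-z{match.group('zoom')}"] += 1
        # Referers without a parseable host are grouped under "(none)".
        domain = urlsplit(referer).hostname if referer else None
        per_referer_domain[domain or "(none)"] += 1
    return total, per_style, per_style_zoom, per_referer_domain
```

Keying the per-zoom counter as `osm-z10` mirrors the naming used in the list above, and because styles are discovered from the paths, a new style would show up without any change to the code.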

Yurik created this task. Sep 11 2015, 9:44 PM
Yurik updated the task description.
Yurik raised the priority of this task to Needs Triage.
Yurik added projects: Maps-Sprint, Discovery.
Yurik added subscribers: Yurik, Ironholds, MaxSem, Tfinc.
Restricted Application added a subscriber: Aklapper. Sep 11 2015, 9:44 PM

Thanks @Yurik. Usually I would expect these kinds of metrics to be in Ganglia or Grafana, but @MaxSem tells me that we can't connect those tools to a Varnish stream. Let's work with @Ironholds to find out what they would need in order to surface these data points. This request would cover the server-side metrics and is separate from our existing client-side metric KPIs. Implementing these should not block launch.

Yurik moved this task from Backlog to Stalled/Waiting on the Maps-Sprint board. Sep 15 2015, 9:29 PM
Tfinc moved this task from Needs triage to Analysis on the Discovery board. Sep 15 2015, 9:29 PM

This data is available; we just need to write the retrieval scripts and dashboards.

Our existing approach is that "Maps-Sprint" contains the tasks the Maps team will be working on; if this is an analytics task (it appears to be) it should just be in the analytics backlog, and it should have a clear signoff on it. Yuri says "we need"; what does "we need" from an engineer's standpoint look like in terms of the team's entire needs? Tomasz, do we need this? How soon? How high-priority is it for the team?

@Ironholds We can talk about this in the sprint planning on Thursday.

@Ironholds yes, we'll need this to complete our picture of the maps pipeline.

@ksmith What's the preference when one team wants to keep track of an upstream task while another team takes it on? Keeping it on the Maps-Sprint board, but in the waiting section, lets the requesting team easily see all the tasks that are blocking them rather than having to go to each upstream sprint board to track dependencies.

Deskana renamed this task from Display maps Varnish usage on discovery dashboard to Display maps server-side usage metrics on maps dashboard. Sep 17 2015, 9:35 PM
Deskana set Security to None.
Deskana triaged this task as Normal priority. Sep 22 2015, 8:19 PM
Deskana moved this task from Analysis to On Sprint Board on the Discovery board. Sep 22 2015, 8:23 PM
mpopov claimed this task. Sep 22 2015, 11:58 PM

Working on a script for tile requests.

Yurik added a comment. Sep 23 2015, 4:40 AM

@mpopov, awesome, thanks! Will it be possible to do drill-downs? E.g. show total tile request counts, or switch to show tile requests per style (we currently have the /osm/ and /osm-intl/ styles, but there might be more). Just wondering if that is easy to do with the dashboard setup we have. Thanks!

@Yurik I'll probably just do osm and osm-intl separately from the start since the total is just their sum.

Yurik added a comment. Sep 23 2015, 4:57 PM

Does each style name have to be hard-coded, or does the dashboard support multiple unspecified series?
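To make the question concrete: one way a retrieval step could avoid hard-coding style names is to emit long-format rows and pivot them into one series per style found in the data. The sketch below assumes rows of `(date, style, count)`; the `pivot_by_style` helper is hypothetical and says nothing about what the actual dashboard setup supports.

```python
from collections import defaultdict


def pivot_by_style(rows):
    """Pivot long-format rows into one column per style found in the data.

    rows: iterable of (date, style, count) tuples (an assumed input shape).
    Returns (columns, table), where columns is the sorted list of style
    names and each table row is [date, count_for_style_1, ...].
    """
    counts = defaultdict(dict)
    styles = set()
    for date, style, count in rows:
        counts[date][style] = counts[date].get(style, 0) + count
        styles.add(style)
    columns = sorted(styles)
    # Styles missing on a given date are reported as 0 for that date.
    table = [
        [date] + [counts[date].get(style, 0) for style in columns]
        for date in sorted(counts)
    ]
    return columns, table
```

With this shape, a style that appears later simply becomes a new column, and the total for a date is the sum of that row's counts.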

What's the preference when one team wants to keep track of an upstream task while another team takes it on? Keeping it on the Maps-Sprint board, but in the waiting section, lets the requesting team easily see all the tasks that are blocking them rather than having to go to each upstream sprint board to track dependencies.

@Tfinc: A common practice within the foundation is for teams to have a "Radar" or "Watching" column, which contains stories that the team is not working on, but which they want to keep in view. If there are very few external tasks that the team wants to watch, it might not be worth setting up a Radar column. That's a judgment call.

Note that it should not be used for tasks where this team will be doing any work. (In that case, either the task should be split into one task per team, or it should be in "Stalled" while the other team is doing its part).

I'm not aware of any better practices.

Moving to the top of the backlog, as it's just waiting on us right now.

Change 241261 had a related patch set uploaded (by Bearloga):
Adds script for fetching server-side tile statistics.

https://gerrit.wikimedia.org/r/241261

mpopov moved this task from Needs review to Backlog on the Discovery-Analysis (Current work) board.
mpopov edited a custom field.

Deskana closed this task as Resolved.
Deskana moved this task from Stalled/Waiting to Done on the Maps-Sprint board.
Yurik added a project: Maps. Nov 7 2015, 7:34 AM