RESTBase dashboard annotations for deployments (and more)
Open, Medium, Public

Description

Grafana seems to support two different sources of annotations: Graphite events and Elasticsearch queries.

Graphite events seem to be working in our environment (see here for a test event I created), but since there does not seem to be any precedent, it's probably worth checking with SRE before committing to it. Additionally, creating events requires that we make a POST, which makes things a little more difficult, in that we need to provide credentials (but maybe SRE can provide alternatives). If we went this route, it would make the most sense to integrate this with ansible-deploy.
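
For reference, creating an event comes down to a JSON POST against Graphite's `/events/` endpoint. The following is only a minimal sketch: the helper names are made up for illustration, authentication is deliberately left out (how credentials would be supplied is exactly the open question for SRE), and the space-separated `tags` string is an assumption that mirrors the event output shown in this task; some Graphite versions may expect a JSON list instead.

```python
import json
import time
import urllib.request


def build_event(what, tags, data="", when=None):
    """Build the JSON body for a Graphite event (hypothetical helper)."""
    return {
        "what": what,
        # Assumption: tags are sent as one space-separated string.
        "tags": " ".join(tags),
        "data": data,
        "when": int(time.time()) if when is None else int(when),
    }


def post_event(base_url, event, timeout=5):
    """POST the event to <base_url>/events/. No credentials are sent here."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/events/",
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A deploy hook in ansible-deploy could then call something like `post_event(graphite_url, build_event("rolling deploy of restbase", ["restbase", "restbase-deploy"]))`, where `graphite_url` is whatever endpoint SRE approves.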

The other alternative would be to use Elasticsearch. Something seems wrong with our Grafana install, because it should be possible to enter an index name and a Lucene search query to source events from. If this were working, I wonder whether it would be possible to construct a query against @bd808's nifty ES-based SAL?

Event Timeline

Eevans raised the priority of this task from to Needs Triage.
Eevans updated the task description.
Eevans added a project: RESTBase.
Eevans subscribed.
Eevans renamed this task from RESTBase deploy counter to RESTBase dashboard annotations for deployments (and more). Oct 28 2015, 8:02 PM
Eevans triaged this task as Medium priority.
Eevans updated the task description.
Eevans set Security to None.
Eevans updated the task description.
Eevans added subscribers: bd808, ori, fgiunchedi and 2 others.

IIRC what scap does for deploy events is push a "1" to particular metrics under the "deploy." prefix in Graphite, and that's it; Grafana picks it up via a "Graphite target expression". Would that suit?
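
As a rough illustration of the counter approach, a single data point can be written with Carbon's plaintext protocol, which is one line of the form `<path> <value> <timestamp>`. This is a sketch only: the function names are invented here, the metric path is whatever you pass in, port 2003 is Carbon's default plaintext port, and scap may well send its counter some other way (e.g. via statsd).

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Format one data point in Carbon's plaintext protocol."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"


def send_metric(host, line, port=2003, timeout=5):
    """Send a formatted line to a Carbon plaintext listener."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))
```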

The docs suggest that an event-based annotation can enrich the hover with event data ("When you hover over an annotation you can get title, tags, and text information for the event."). This didn't seem to work for the one Graphite-based event I created, but if it could be made to work, it would be awesome.

Even for something as simple as a deploy, you could imagine including the change ID, and/or a simple comment.

And, I'd really like to go beyond just deploys, creating annotations for matching SAL entries, Icinga alerts, etc.

TL;DR: a deploy counter would be better than nothing, but ideally we'd be able to use events as well.

To summarize a discussion with @bd808 on IRC:

  • The data exists in an Elasticsearch instance in labs
  • timestamp, nick, message, and project (prod, releng, etc.) are indexed (annotations would search against terms matching message)
  • The labs ES instances, however, are not reachable from outside labs (Grafana cannot query them)
  • Requests proxied by the SAL webapp (https://tools.wmflabs.org/sal/production) are one option

To expand a bit on that last item, Grafana supports annotations from ES queries and from Graphite metric or event queries. Graphite's event query is a fairly simple REST interface:

curl -s "http://graphite.wikimedia.org/events/get_data?tags=restbase" | jq .
[
  {
    "data": "",
    "what": "rolling deploy of restbase-deploy 3b1f6488f2 to restbase cluster",
    "when": 1446060510,
    "id": 1,
    "tags": "restbase restbase-deploy"
  }
]

So one possible option is to implement an /events/get_data endpoint for SAL that returns results in the expected JSON format, and then configure SAL in Grafana as a Graphite datasource.
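
The translation such an endpoint would need is small. Here is a sketch of mapping SAL entries onto the Graphite event shape shown above, assuming each entry is a dict with the indexed fields (timestamp, nick, message, project) and that timestamps are UTC strings like `2015-10-28T19:28:30`. The function names, the timestamp format, and the choice to put project in `tags` and nick in `data` are all assumptions for illustration, not how SAL works today.

```python
import json
from datetime import datetime, timezone


def sal_to_graphite_event(entry, event_id):
    """Map one SAL entry (hypothetical dict shape) to a Graphite-style event."""
    when = datetime.strptime(entry["timestamp"], "%Y-%m-%dT%H:%M:%S")
    when = when.replace(tzinfo=timezone.utc)
    return {
        "id": event_id,
        "what": entry["message"],
        "when": int(when.timestamp()),
        "tags": entry["project"],
        "data": entry["nick"],
    }


def events_response(entries, term=None):
    """Build the JSON body for /events/get_data, filtering on message text."""
    if term is not None:
        # Annotations would search against terms matching the message.
        entries = [e for e in entries if term.lower() in e["message"].lower()]
    events = [sal_to_graphite_event(e, i) for i, e in enumerate(entries, start=1)]
    return json.dumps(events)
```

The webapp would call `events_response` with the entries it fetched from ES and the value of the `tags` query parameter, then return the result with a JSON content type.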

untagging observability due to staleness