Switchover to Grafana 6 planned for Monday Nov 25.
Grafana 6 adds a bunch of cool new stuff:
- "Explore", a UI for playing around with monitoring data and for exploring what metrics are in Prometheus. It looks ideal both for constructing queries for a new console and for doing incident response across multiple systems (and then linking to it in your postmortem). https://grafana.com/docs/features/explore/
- "ad hoc filtering for Prometheus", which looks like a big improvement on the template variable stuff we do on many consoles right now. https://grafana.com/docs/guides/whats-new-in-v6-1/
- better panel editor UI
Things to watch out for per https://grafana.com/docs/installation/upgrading/#upgrading-to-v6-0
- check for any text panels with embedded <script> tags
  - figure out what to do about the frontpage panel that has one; the puppet panel that iframes puppetdb is also broken -- maybe we should just disable sanitization? https://github.com/grafana/grafana/issues/15392
    - yes, let's just disable sanitization (disable_sanitize_html = true in the [panels] section)
- add cookie_secure = true to the [security] section
- review the new login/session handling and its settings https://grafana.com/docs/auth/overview/#login-and-short-lived-tokens
  - the defaults should do fine for us.
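For the <script> check above, a minimal sketch of one way to do it, assuming the dashboards have first been exported to JSON. The helper names (iter_panels, find_script_panels) are made up for illustration, and the field names (panels, rows, type, content, options.content) reflect the dashboard JSON layout as commonly seen; verify them against a real export before relying on the results.

```python
import re
from typing import Any, Dict, Iterator, List, Tuple

# Matches an opening <script tag, case-insensitively, tolerating "< script".
SCRIPT_TAG = re.compile(r"<\s*script\b", re.IGNORECASE)


def iter_panels(node: Any) -> Iterator[Dict[str, Any]]:
    """Yield every dict that appears as an element of a "panels" list.

    Walking the whole tree covers both the older rows[].panels[] layout
    and the newer top-level panels[] layout with nested row panels.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "panels" and isinstance(value, list):
                for panel in value:
                    if isinstance(panel, dict):
                        yield panel
            yield from iter_panels(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_panels(item)


def find_script_panels(dashboard: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (dashboard title, panel title) for text panels containing <script>."""
    dash_title = dashboard.get("title", "<untitled>")
    hits = []
    for panel in iter_panels(dashboard):
        if panel.get("type") != "text":
            continue
        # Assumption: content lives in "content", or in "options.content".
        content = panel.get("content")
        if not isinstance(content, str):
            options = panel.get("options")
            content = options.get("content", "") if isinstance(options, dict) else ""
        if isinstance(content, str) and SCRIPT_TAG.search(content):
            hits.append((dash_title, panel.get("title", "")))
    return hits


if __name__ == "__main__":
    # Hypothetical dashboard, for demonstration only.
    example = {
        "title": "Home",
        "panels": [
            {"type": "text", "title": "Welcome", "content": "<script>init()</script>"},
            {"type": "text", "title": "Notes", "content": "plain text"},
        ],
    }
    for dash, panel in find_script_panels(example):
        print(f"{dash}: text panel {panel!r} embeds a <script> tag")
```

Running find_script_panels over each exported dashboard gives a list of panels to fix, or to leave alone if sanitization ends up disabled.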
Plan is similar to last time:
- create a new Ganeti VM, grafana1002.eqiad.wmnet -- will try the new VM-creation automation for this, and probably also try buster
- point grafana-next.wikimedia.org to that host
- copy a snapshot of the database and ask the same groups as in T210416 to test
- pick a time to make -next the new normal (should need just a few minutes of read-only)
The Fundraising firewall should not be a concern this time, as they have their own Grafana now.
It's also worth testing the upgrade on grafana-labs, if its maintainers are interested.