Please take a look at grafana-beta.wikimedia.org and verify any dashboards you care about are working correctly. (Also note that any changes made on grafana-beta will not be preserved once the upgrade is completed.)
Here's 5.4.0 serving stats from prometheus about its own machine: https://grafana-beta.wikimedia.org/d/000000377/host-overview?refresh=5m&orgId=1&var-server=grafana1001&var-datasource=eqiad%20prometheus%2Fops&var-cluster=misc
5.x includes many UI upgrades (drag and drop rearranging of panels!) and new features (folders! stable URLs! teams!)
Notable new features between 4.6 and 5.4 are below. Recommendations for stuff to change about our configuration are bolded.
- Major UI changes, including drag-n-drop to move graphs around, and a new layout engine that claims easier sizing and placement of panels. Video and screenshots here
- Grafana now has a notion of 'teams', a group of users that can be used in dashboard ACLs, or for setting a default home dashboard. (Looks like automatic linking between Grafana teams and LDAP groups is locked behind Grafana Enterprise, though?)
- Dashboards can now be placed into a hierarchy of folders. We should figure out a hierarchy that makes some sense. (but I'm quite happy to call this part of T178690)
- Dashboard URLs are now stable across name changes. For the time being, old URLs will still work, as long as the dashboards they point to are not renamed -- but it's a good idea to update any links into Grafana to use the new URL scheme, as the old URLs are deprecated and support will be removed in some future release.
- Data sources can now be 'provisioned' -- specified by YAML config files, which we could puppetize. This makes them read-only in the UI, which is probably a good idea anyway.
- Similarly, dashboards can be provisioned. Dashboards configured from JSON are editable in the UI, but can't be saved -- instead the UI offers you a JSON dump which you can check back into source control. At first glance, seems not bad to me? Should we start moving 'core' SRE dashboards to JSON in Puppet?
- Heatmap UI and Prometheus histograms now work together
- Grafana has its own native annotations support, including an on-dashboard UI and an HTTP API for adding them
- Grafana-native alerts now support reminders (I believe the performance team wanted this)
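
To make the annotations HTTP API mentioned above a bit more concrete, here is a minimal Python sketch of how a deploy script might add an annotation. It assumes the documented `POST /api/annotations` endpoint with a bearer API key; the host name, the key, and the helper name `build_annotation_request` are placeholders invented for illustration, not anything we have set up. The function only builds the request, so nothing is sent unless you uncomment the last line.

```python
import json
import time
import urllib.request


def build_annotation_request(base_url, api_key, text, tags, when_ms=None):
    """Build (but do not send) a POST to Grafana's /api/annotations endpoint.

    Hypothetical helper for illustration; base_url and api_key are placeholders.
    """
    payload = {
        # Grafana expects the annotation time as epoch milliseconds.
        "time": when_ms if when_ms is not None else int(time.time() * 1000),
        "tags": list(tags),
        "text": text,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/annotations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_annotation_request(
        "https://grafana.example.org",  # hypothetical host
        "PLACEHOLDER_API_KEY",          # hypothetical API key
        "Deployed grafana 5.4.0",
        ["deploy", "grafana"],
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send the annotation:
    # urllib.request.urlopen(req)
```

Without a dashboard or panel ID in the payload, the annotation should be organization-wide and can be pulled into any dashboard by filtering on its tags.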
Current plan: no known issues with the upgrade. Will announce one last time at the SRE meeting on Monday, then proceed with the upgrade Monday afternoon (US Eastern time).