Several Grafana dashboards use scap logs to show when e.g. train runs happen, which is extremely useful for correlating metrics changes with MediaWiki changes. This happens via the Public Logs / Loki data source with a query like {channel="scap"} |~ "rebuilt and synchronized wikiversions files". This stopped working a while ago (weeks? months? I don't remember when I first noticed).
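For anyone who wants to check the query outside Grafana, here is a minimal sketch that builds a request URL for Loki's HTTP query_range endpoint using the same LogQL expression. The base URL is a placeholder and not the real Public Logs endpoint; the helper name and the timestamps are made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical base URL; substitute the actual Loki endpoint.
LOKI_BASE_URL = "https://loki.example.org"

# The LogQL expression quoted in the task description.
SCAP_QUERY = '{channel="scap"} |~ "rebuilt and synchronized wikiversions files"'


def build_query_range_url(base_url, query, start_ns, end_ns, limit=100):
    """Build a URL for Loki's /loki/api/v1/query_range endpoint.

    start_ns and end_ns are Unix timestamps in nanoseconds.
    """
    params = {
        "query": query,
        "start": str(start_ns),
        "end": str(end_ns),
        "limit": str(limit),
    }
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{urlencode(params)}"


# Example: an arbitrary one-hour window (illustrative timestamps only).
example_url = build_query_range_url(
    LOKI_BASE_URL,
    SCAP_QUERY,
    start_ns=1_760_000_000_000_000_000,
    end_ns=1_760_003_600_000_000_000,
)
```

If the resulting URL returns log lines when fetched, the query itself works and the problem is more likely in how the dashboard displays the annotation.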
Event Timeline
I see logs returned using the attached query in Loki, but the log volume panel in Explore complains:
Failed to load log volume for this query parse error at line 1, col 122: syntax error: unexpected IDENTIFIER
We are running an older version of Loki; perhaps some change has broken things on the Grafana side.
The original problem persists after upgrading Loki to 2.8.11, but the Explore panel is fixed in the new version.
Further digging revealed that this is an upstream Grafana bug: https://github.com/grafana/grafana/issues/110265
In short, the annotation toggles no longer work when they are default-off. If the dashboard has them enabled via the dashboard settings (Edit->Settings->Annotations->Deploys->Enabled ☑), the annotations reveal themselves.
We'll move forward with the Loki upgrade regardless, since all the work to rebuild the package has already been done.
Mentioned in SAL (#wikimedia-operations) [2025-10-09T16:33:30Z] <cwhite> upgrade grafana-loki on grafana hosts T406478
Thanks for investigating!
If there are no concerns around overloading Loki, just making these annotations enabled by default would be a fine workaround IMO. They can be a bit slow, but they don't block anything else while they are loading, and Prometheus is itself a bit slow anyway.
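As a rough sketch of that workaround, the snippet below flips the enable flag on a named annotation in a dashboard's JSON model, which stores annotations under annotations.list. The sample dashboard, the data source uid, and the helper name are invented for illustration; only the "Deploys" annotation name and the query come from this task.

```python
import copy


def enable_annotation(dashboard, name):
    """Return a copy of the dashboard JSON model with the named annotation enabled."""
    updated = copy.deepcopy(dashboard)
    for annotation in updated.get("annotations", {}).get("list", []):
        if annotation.get("name") == name:
            # Equivalent to ticking "Enabled" in the annotation's dashboard settings.
            annotation["enable"] = True
    return updated


# Hypothetical, heavily trimmed dashboard model; real dashboards carry many more fields.
dashboard = {
    "title": "Example dashboard",
    "annotations": {
        "list": [
            {
                "name": "Deploys",
                "datasource": {"type": "loki", "uid": "example-uid"},  # assumed uid
                "enable": False,
                "expr": '{channel="scap"} |~ "rebuilt and synchronized wikiversions files"',
            }
        ]
    },
}

patched = enable_annotation(dashboard, "Deploys")
```

The helper works on a copy, so the original model stays available for comparison before anything is saved back.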