With the new SGE-based grid, this grafana dashboard may need to be revisited:
https://grafana-labs.wikimedia.org/dashboard/db/tools-basic-alerts
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | fgiunchedi | T177195 Reduce technical debt in metrics monitoring
Resolved | | fgiunchedi | T177196 Port non-deprecated Diamond collectors to Prometheus
Resolved | | • GTirloni | T207591 tools-services: Migrate to Stretch
Resolved | | bd808 | T211684 Toolforge: Port sge.py stats to Prometheus
Resolved | | • Bstorm | T215845 Add monitoring for disabled grid nodes to the prometheus collector
Open | | None | T213567 Toolforge: refresh grafana dashboard
I don't know how to log in to grafana-labs. I will need to sort out my credentials first.
Also, let's relate this task to T215845: Add monitoring for disabled grid nodes to the prometheus collector.
On IRC:
Krenair> arturo, go to https://grafana-labs-admin.wikimedia.org and use the basic auth prompt with LDAP CN + pass
I started building some initial graphs about Toolforge here: https://grafana-labs.wikimedia.org/dashboard/db/arturo-toolforge-dashboard?orgId=1&from=now%2Fd&to=now
@JHedden I believe you added new Prometheus exporters for Toolforge? Could you give me a hint about them, so I can integrate them into my dashboard?