The service manifest monitor should send metrics to graphite, things like:
- Total service manifests read
- Total services restarted
- A counter for each individual service that was restarted
- Time taken for each run
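As a rough illustration of the list above, here is a minimal sketch of how such stats could be rendered for Graphite. It assumes Graphite's plaintext protocol (one `path value timestamp` line per metric), which is not necessarily what the actual patch uses. The function names and the stat names (`manifests_read`, `services_restarted`, `restarted.mytool`, `run_seconds`) are hypothetical.

```python
import socket
import time


def format_metrics(prefix, stats, timestamp):
    """Render a dict of stats as Graphite plaintext protocol lines."""
    lines = []
    for name, value in sorted(stats.items()):
        # One line per metric: "<path> <value> <unix timestamp>"
        lines.append('%s.%s %s %d' % (prefix, name, value, timestamp))
    return '\n'.join(lines) + '\n'


def send_metrics(host, port, payload):
    """Send an already formatted payload to a carbon plaintext listener."""
    sock = socket.create_connection((host, port), timeout=5)
    try:
        sock.sendall(payload.encode('utf-8'))
    finally:
        sock.close()


# Hypothetical stats collected during one monitor run.
run_stats = {
    'manifests_read': 120,       # total service manifests read
    'services_restarted': 3,     # total services restarted
    'restarted.mytool': 1,       # counter for one individual service
    'run_seconds': 4.2,          # time taken for this run
}
payload = format_metrics(
    'tools.tools-bastion.ServiceMonitor', run_stats, int(time.time()))
```

Formatting is kept separate from sending so the output can be inspected without a running Graphite instance. `send_metrics(host, 2003, payload)` would deliver it, assuming the listener is on the default plaintext port.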
Project | Branch | Lines +/- | Subject
---|---|---|---
operations/software/tools-manifest | master | +13 -0 | Send stats about webservices, manifests and errors!
Status | Assigned | Task
---|---|---
Resolved | None | T90534 Make toolforge reliable enough (tracking)
Resolved | yuvipanda | T90561 Replace bigbrother and ssh-cron-thingy with service manifests
Resolved | yuvipanda | T95210 Review and productionize webservice manifest monitor
Resolved | yuvipanda | T95256 Send metrics from service manifest monitor to graphite
Change 202318 had a related patch set uploaded (by Yuvipanda):
Send stats about webservices, manifests and errors!
I think more thought needs to go into how these stats are organized. Currently they're organized on a per-host basis, but organizing them on a per-tool basis would be a lot more interesting and useful.
(You can see the current set of stats by looking at tools.tools-bastion.ServiceMonitor.* in graphite.wmflabs.org.)
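To make the per-host versus per-tool distinction concrete, here is a small sketch of the two naming schemes. The per-host layout mirrors the `tools.tools-bastion.ServiceMonitor.*` path mentioned above. The per-tool layout, the function names, and the `restarts` stat are assumptions for illustration only, not a decided scheme.

```python
def per_host_metric(project, host, stat):
    """Metric path keyed by the host the monitor runs on (current layout)."""
    return '.'.join([project, host, 'ServiceMonitor', stat])


def per_tool_metric(project, tool, stat):
    """Metric path keyed by tool name (hypothetical alternative layout)."""
    # Graphite uses '.' as the path separator, so avoid dots inside a segment.
    safe_tool = tool.replace('.', '_')
    return '.'.join([project, 'ServiceMonitor', safe_tool, stat])


print(per_host_metric('tools', 'tools-bastion', 'restarts'))
print(per_tool_metric('tools', 'mytool', 'restarts'))
```

With a per-tool path, a query such as `tools.ServiceMonitor.*.restarts` would compare tools directly, regardless of which host ran the monitor.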
Change 202318 merged by jenkins-bot:
Send stats about webservices, manifests and errors!