
Graph reports status in Prometheus
Closed, Declined (Public)

Description

From T250053:

If possible though, I would like the ability to pull some data around what we're doing. For example, being able to go back and pull the number of Netbox errors per week, per month, per site, to see how we're trending: are we improving, what are the numbers, which users are creating the most errors, average time to fix, etc. I think the trending from this data would show a more complete story vs. the general perception that the Netbox reports are constantly in a failed state.

Something similar has been done for T243927, and report statuses are exposed in the API (e.g. https://netbox.wikimedia.org/api/extras/reports/accounting.Accounting/).

So graphing the number of failed tests per report (e.g. test_missing_assets_from_accounting failure = 28, test_field_match failure = 0) seems easy to do.
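As a rough illustration of the idea, the sketch below turns per-test failure counts into Prometheus text exposition lines. It assumes the report result has already been fetched from the API and reduced to a mapping of test name to counters. The function name, the metric name netbox_report_test_failures and the label names are hypothetical, and the exact JSON layout returned by Netbox is not verified here.

```python
def report_to_prometheus(report_name, results):
    """Render per-test failure counts in the Prometheus text exposition format.

    `results` is assumed to map test names to counters, for example
    {"test_field_match": {"failure": 0}}. This shape is an assumption.
    """
    lines = [
        # Hypothetical metric name, chosen for this sketch only.
        "# HELP netbox_report_test_failures Failures per Netbox report test.",
        "# TYPE netbox_report_test_failures gauge",
    ]
    for test_name, counts in sorted(results.items()):
        failures = int(counts.get("failure", 0))
        lines.append(
            f'netbox_report_test_failures{{report="{report_name}",'
            f'test="{test_name}"}} {failures}'
        )
    return "\n".join(lines) + "\n"


# Counts taken from the example in this task.
sample = {
    "test_missing_assets_from_accounting": {"failure": 28},
    "test_field_match": {"failure": 0},
}
print(report_to_prometheus("accounting.Accounting", sample), end="")
```

A gauge is used because the failure count of a report can go down as well as up between runs.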

Per-site graphs might be more difficult: we would need to fetch the site linked to each device in the log array (if any).
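A minimal sketch of the per-site aggregation, under stated assumptions: the log array is taken to have been reduced to (level, device name) pairs, and the device-to-site mapping is taken to have been built separately (for instance from the Netbox devices API). The function name and all sample device and site names are hypothetical.

```python
from collections import Counter


def failures_per_site(log_entries, site_of_device):
    """Count failure log entries per site.

    `log_entries` is assumed to be an iterable of (level, device_name) pairs,
    where device_name may be None when the entry is not tied to a device.
    `site_of_device` is assumed to map device names to site names.
    """
    counts = Counter()
    for level, device in log_entries:
        if level != "failure":
            continue
        # Entries without a known device or site go to an "unknown" bucket.
        counts[site_of_device.get(device, "unknown")] += 1
    return dict(counts)


# Hypothetical sample data, for illustration only.
sites = {"device-a": "site-1", "device-b": "site-2"}
log = [
    ("failure", "device-a"),
    ("failure", "device-b"),
    ("success", "device-a"),
    ("failure", None),
]
print(failures_per_site(log, sites))
```

The "unknown" bucket keeps the per-site totals consistent with the per-test totals when a log entry has no linked device.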