Online Fundraising currently uses a CiviCRM database replica on frdb1003 to confirm that donation processing is working during live campaigns. While this is effective, other development activity on that database often delays replication or even knocks it offline. This happened recently and delayed the start of a campaign (see T255544). We should be able to build a robust, user-friendly alternative by collecting metrics from a reliable database instance into Prometheus/Grafana.
Note that we already have a database scraper, mentioned in T176295, that could be retooled to collect query-based metrics for Prometheus.
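As a rough illustration of what "query-based metrics" could look like, here is a minimal Python sketch that runs a couple of SQL queries and renders the results in the Prometheus text exposition format. Everything specific in it is an assumption made for illustration: the `donations` table, the `received_at` column, and the metric names are hypothetical and not the real CiviCRM schema, and an in-memory SQLite database stands in for the actual database instance. It is not how the existing scraper works.

```python
import sqlite3

# Hypothetical metric definitions: (metric name, help text, SQL returning one number).
# Table and column names are placeholders, not the real CiviCRM schema.
QUERIES = [
    (
        "fundraising_recent_donations",
        "Donations recorded in the last 5 minutes",
        "SELECT COUNT(*) FROM donations WHERE received_at >= :cutoff",
    ),
    (
        "fundraising_seconds_since_last_donation",
        "Seconds elapsed since the most recent donation",
        "SELECT :now - MAX(received_at) FROM donations",
    ),
]


def collect(conn, now, window=300):
    """Run each query once and return (name, help, value) tuples."""
    params = {"now": now, "cutoff": now - window}
    samples = []
    for name, help_text, sql in QUERIES:
        value = conn.execute(sql, params).fetchone()[0]
        samples.append((name, help_text, float(value)))
    return samples


def render(samples):
    """Render samples in the Prometheus text exposition format."""
    lines = []
    for name, help_text, value in samples:
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def demo():
    """Build a throwaway database with three donations and render the metrics."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE donations (id INTEGER PRIMARY KEY, received_at INTEGER)"
    )
    now = 1_000_000  # fixed fake Unix timestamp so the demo is deterministic
    conn.executemany(
        "INSERT INTO donations (received_at) VALUES (?)",
        [(now - 10,), (now - 200,), (now - 900,)],
    )
    return render(collect(conn, now))


if __name__ == "__main__":
    print(demo(), end="")
```

The rendered text could be written to a file for a textfile-style collector or served over HTTP for Prometheus to scrape; which of these fits our setup is an open question. A Grafana panel on the "seconds since last donation" gauge would give roughly the same signal as checking the replica by hand during a campaign.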