This task needs to be broken down further, but for now here is what we want to capture:
- Classical observability/operational metrics (provided by service-runner)
- Types of notifications that are being spawned
- Enrollment & disenrollment
- Callbacks to notification API endpoints for retrieval
Sampling may be needed, as event volume can be very high.
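One possible way to sample is to decide per event from a hash of its ID, so the decision is deterministic and every consumer keeps or drops the same events. This is only a sketch: the function name, the choice of SHA-256, and the idea that each event carries a string ID are assumptions, not part of the task.

```python
import hashlib


def should_sample(event_id: str, rate: float) -> bool:
    """Return True for roughly `rate` of all event IDs, deterministically.

    `event_id` and `rate` are hypothetical parameters used for illustration.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    digest = hashlib.sha256(event_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the hash as an integer in [0, 2**64)
    # and keep the event if it falls below the scaled threshold.
    bucket = int.from_bytes(digest[:8], "big")
    return bucket < int(rate * 2**64)
```

If events are sampled like this, the sample rate should be recorded alongside them so that counts can be scaled back up during analysis.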
Consider user privacy when designing schema(s), and be aware of where events may be published.
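A simple way to keep identifying data out of published events is an allowlist applied just before an event leaves the service. The sketch below assumes events are plain dictionaries; the field names are hypothetical and would be replaced by whatever the real schema defines.

```python
# Hypothetical allowlist: only coarse, non-identifying fields are published.
ALLOWED_FIELDS = frozenset({"event_type", "platform", "timestamp"})


def to_public_event(raw: dict) -> dict:
    """Return a copy of `raw` containing only allowlisted fields."""
    return {key: value for key, value in raw.items() if key in ALLOWED_FIELDS}
```

With an allowlist rather than a blocklist, a field added later stays private until someone deliberately decides it is safe to publish.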
Metrics section from the RFC:
We will track the Four Golden Signals: latency, traffic, errors, and saturation.
Additionally, we will track product-oriented metrics both overall and per-platform, including:
- Subscription request rate (req/s)
- Subscription deletion request rate (req/s)
- Total subscription count
Metrics must be compatible with Prometheus. Alerts will be configured for request spikes and for error rates that exceed a reasonable threshold.