The rate of HTTP status 500 responses from Kask would be an excellent indicator of user-facing problems, and notifications should be based on it.
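To make the idea concrete, here is a minimal sketch of the check such a notification could perform. It compares two readings of cumulative request counters and flags the window when the share of HTTP 500 responses exceeds a threshold. The `Sample` shape, the counter fields, and the 1% default threshold are all assumptions for illustration; they are not taken from Kask's actual metrics or from any existing alert.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One reading of cumulative counters (hypothetical shape, not Kask's real metrics)."""
    timestamp: float  # seconds since some epoch
    total: int        # all HTTP responses served so far
    errors_500: int   # responses with HTTP status 500 so far


def error_ratio(older: Sample, newer: Sample) -> float:
    """Fraction of requests between the two samples that returned HTTP 500."""
    requests = newer.total - older.total
    if requests <= 0:
        # No traffic in the window, or the counter was reset: nothing to judge.
        return 0.0
    errors = max(newer.errors_500 - older.errors_500, 0)
    return errors / requests


def should_notify(older: Sample, newer: Sample, threshold: float = 0.01) -> bool:
    """True when the 500 ratio in the window exceeds the (assumed) threshold."""
    return error_ratio(older, newer) > threshold


if __name__ == "__main__":
    before = Sample(timestamp=0, total=1000, errors_500=2)
    after = Sample(timestamp=300, total=2000, errors_500=52)
    print(should_notify(before, after))
```

Using a ratio rather than a raw error count keeps the check meaningful across traffic levels; the right threshold and window length would need to be chosen from real traffic data.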
See also:
- https://wikitech.wikimedia.org/wiki/Incidents/2023-01-24_sessionstore_quorum_issues (incident which would have benefited)
- T320401: Alert on Kask error rate (related/similar)