We discovered that T175123 prevents tiles from being regenerated when updates are received from OSM. The effect is visible in Grafana, and it should be possible to put an alert on that graph.
Note: at the moment, tile generation is aggregated across all clusters, which limits how useful an alert on it can be, since it cannot tell which cluster is affected. It is probably possible to prefix metrics with the cluster name, but some investigation is needed.
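As a rough illustration of what per-cluster metric names could look like, here is a minimal Python sketch that builds a dotted, Graphite-style metric path containing the cluster name. The namespace `maps`, the cluster name `cluster_a` and the metric name `tiles.generated` are hypothetical placeholders, not the names actually used by the tile generation service.

```python
def cluster_metric(cluster: str, metric: str, namespace: str = "maps") -> str:
    """Build a dotted metric path of the form <namespace>.<cluster>.<metric>.

    All names here are hypothetical; the real metric layout still needs
    to be investigated.
    """
    # Strip stray leading/trailing dots so the joined path has no empty segments.
    parts = [p.strip(".") for p in (namespace, cluster, metric)]
    if not all(parts):
        raise ValueError("namespace, cluster and metric must all be non-empty")
    return ".".join(parts)


# Hypothetical usage: one metric path per cluster instead of a single aggregate.
print(cluster_metric("cluster_a", "tiles.generated"))
```

With one path per cluster, an alert can be defined on each cluster separately instead of on the aggregate.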
Note: tile generation runs at relatively low frequency (currently daily), so the alert needs to be tuned accordingly. These alerts also do not require a fast response: not generating tiles for a few days does not cause any major issue.
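The tuning described above can be sketched as a simple staleness check: alert only when the time since the last tile generation exceeds several expected intervals. This is a minimal sketch, not the actual alert definition. The function name `tiles_are_stale` and the grace factor of 3 are assumptions chosen for illustration; only the daily interval comes from the note above.

```python
from datetime import datetime, timedelta, timezone

# Tile generation currently runs daily.
EXPECTED_INTERVAL = timedelta(days=1)
# Hypothetical tolerance: alert only after three missed daily runs.
GRACE_FACTOR = 3


def tiles_are_stale(
    last_generated: datetime,
    now: datetime,
    expected_interval: timedelta = EXPECTED_INTERVAL,
    grace_factor: int = GRACE_FACTOR,
) -> bool:
    """Return True when no tiles have been generated for longer than
    expected_interval * grace_factor."""
    return now - last_generated > expected_interval * grace_factor


# Hypothetical usage with timezone-aware timestamps.
now = datetime.now(timezone.utc)
print(tiles_are_stale(now - timedelta(days=2), now))  # within tolerance
print(tiles_are_stale(now - timedelta(days=4), now))  # past tolerance
```

A generous grace factor fits the low urgency of these alerts: a single late or skipped daily run does not trigger anything, while an outage lasting several days does.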