Set up sufficient monitoring for toollabs
Right now we don't have any monitoring beyond basic host and disk space checks.

Event Timeline

Do uptime/downtime stats fall under this task?

No, this is primarily about getting notified when something in Tool Labs is down. Keeping historical information and generating stats from it would be nice for our monitoring in general, but AFAIK that doesn't exist for production either. Some monitoring tools do both at the same time, since certain kinds of anomaly detection require historical information.

