A single server going down shouldn't bring down all of Labs. Identify every service that is a single point of failure and set up a hot or warm spare for each.
- Puppetmaster / Nova Controller
- labnet1001
- Live / Cold migration for instances(?)
- labstore***
- labsdb***?
This covers only Wikimedia Labs itself (https://wikitech.wikimedia.org/wiki/Labs_labs_labs#Wikimedia_Labs), not toollabs or betacluster.
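
As a rough illustration of the "identify" step, the audit could be sketched as a small script over a service inventory. Everything here is an assumption made for the example: the inventory layout, the service names, the `host-a`-style host names, and both helper functions are hypothetical placeholders and do not describe the real state of Labs.

```python
# Hypothetical inventory: service name -> primary host and spare hosts.
# All values are placeholders for illustration only.
INVENTORY = {
    "puppetmaster": {"primary": "host-a", "spares": []},
    "nova-controller": {"primary": "host-a", "spares": ["host-b"]},
    "network-node": {"primary": "host-c", "spares": []},
}


def services_without_spare(inventory):
    """Return the sorted names of services with no spare on a separate host."""
    missing = []
    for name, hosts in inventory.items():
        # A spare on the same machine as the primary does not help
        # when that machine goes down, so it is not counted.
        independent = [h for h in hosts["spares"] if h != hosts["primary"]]
        if not independent:
            missing.append(name)
    return sorted(missing)


def services_lost_if_down(inventory, host):
    """Return the sorted names of services lost if `host` fails."""
    lost = []
    for name, hosts in inventory.items():
        survivors = [h for h in [hosts["primary"], *hosts["spares"]] if h != host]
        if hosts["primary"] == host and not survivors:
            lost.append(name)
    return sorted(lost)


if __name__ == "__main__":
    print("No independent spare:", services_without_spare(INVENTORY))
    print("Lost if host-a fails:", services_lost_if_down(INVENTORY, "host-a"))
```

Running a check like this against a real inventory would give a concrete list of services that still need a spare, and show which single hosts take more than one service with them.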