Eliminate SPOFs in Labs infrastructure
Closed, Resolved (Public)

Description

One server going down shouldn't bring down all of Labs. Identify every service that is a single point of failure and provide a hot/warm spare for each.

  1. Puppetmaster / Nova Controller
  2. labnet1001
  3. Live / Cold migration for instances(?)
  4. labstore***
  5. labsdb***?

This covers only https://wikitech.wikimedia.org/wiki/Labs_labs_labs#Wikimedia_Labs, not toollabs or betacluster.

Event Timeline

yuvipanda set the priority of this task to Needs Triage.
yuvipanda updated the task description.
yuvipanda added a project: Cloud-Services.
yuvipanda added subscribers: Aklapper, yuvipanda.

This is about Labs infrastructure - beta does not count.

yuvipanda set Security to None.
Andrew claimed this task.
Phabricator_maintenance renamed this task from Eliminate SPOFs in Labs infrastructure (Tracking) to Eliminate SPOFs in Labs infrastructure. (Aug 13 2016, 9:17 PM)