Labs team reliability goal for Q1 2015/16
Closed, Resolved · Public

Description

Tracking task for Labs team reliability goal for Q1 2015/16.

  1. Meet or exceed 99.5% uptime for each Labs infrastructure service
  2. Remove all Labs support host SPOFs, using redundancy or hot spares
  3. Finish NFS migration to RAID10 storage, and implement NFS sharding
  4. Audit Labs projects on NFS dependencies and support migration to alternatives where appropriate
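As a rough illustration of what goal 1 implies, the downtime allowance behind a 99.5% target can be worked out directly. This is a minimal sketch, not the team's actual measurement method: the 92-day quarter length and the function names `downtime_budget_hours` and `measured_uptime` are assumptions made for the example.

```python
def downtime_budget_hours(uptime_target: float, period_days: float) -> float:
    """Return the maximum downtime, in hours, that still meets the target."""
    if not 0.0 <= uptime_target <= 1.0:
        raise ValueError("uptime_target must be a fraction between 0 and 1")
    return (1.0 - uptime_target) * period_days * 24.0


def measured_uptime(downtime_hours: float, period_days: float) -> float:
    """Return the uptime fraction achieved given the observed downtime."""
    return 1.0 - downtime_hours / (period_days * 24.0)


# Assumption: a 92-day quarter; adjust to the real reporting period.
QUARTER_DAYS = 92

budget = downtime_budget_hours(0.995, QUARTER_DAYS)
print(f"Downtime budget per service: {budget:.2f} hours")

# Hypothetical service with 6 hours of outage over the quarter.
print(f"Uptime with 6 h of outage: {measured_uptime(6.0, QUARTER_DAYS):.4%}")
```

Under these assumptions each service could be down for roughly 11 hours over the quarter and still meet the goal, which is why a single long outage on a non-redundant host (goal 2) can use up the whole allowance.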

Event Timeline

yuvipanda updated the task description.
yuvipanda raised the priority of this task to Needs Triage.
yuvipanda added a project: Cloud-Services.
yuvipanda added a subscriber: yuvipanda.
Restricted Application added a subscriber: Aklapper. Jul 13 2015, 6:14 PM

Not sure about the NFS sharding one - @mark / @coren is that just the tools / others / maps being on different arrays? Or is there more to that? :)

coren added a comment. Jul 21 2015, 7:18 PM

@yuvipanda: It's keeping the filesystems reasonably small (and operations on them more parallelizable) by splitting along project lines, yes. So right now we've spun off tools and maps, with everything else together, but locating outliers and splitting them off on an ongoing basis is conceptually part of this.

chasemp triaged this task as High priority. Nov 30 2015, 4:25 PM
chasemp set Security to None.
yuvipanda closed this task as Resolved. Dec 16 2015, 10:33 PM
yuvipanda claimed this task.

I guess?