We have a lot of redundancy in place now, but it is useless if the redundant instances sit on the same underlying host. OpenStack rules may be a solution for this later; for now, put an icinga check in place that alerts when instances of the same kind are not spread across enough hosts.
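A minimal sketch of what such a check could look like, not the code from the merged patch. It assumes the caller already has a mapping from instance name to underlying host (how that is obtained from OpenStack is left out), and the class names, instance-name prefixes, and the `check_spread` helper are all hypothetical. The return codes follow the usual Nagios/Icinga plugin convention (0 for OK, 2 for CRITICAL).

```python
from collections import defaultdict

# Standard Nagios/Icinga plugin exit codes.
OK = 0
CRITICAL = 2


def check_spread(placements, class_prefixes, min_hosts=2):
    """Return (exit_code, message) for an instance-distribution check.

    placements:     dict of instance name -> underlying host name (assumed input).
    class_prefixes: dict of class name -> tuple of instance-name prefixes
                    (hypothetical grouping, for illustration only).
    min_hosts:      minimum number of distinct hosts a class must occupy.
    """
    hosts_by_class = defaultdict(set)
    count_by_class = defaultdict(int)

    for instance, host in placements.items():
        for cls, prefixes in class_prefixes.items():
            if instance.startswith(prefixes):
                hosts_by_class[cls].add(host)
                count_by_class[cls] += 1

    # A class fails when it has enough instances to be spread out,
    # but they occupy fewer than min_hosts distinct hosts.
    failed = sorted(
        cls
        for cls, count in count_by_class.items()
        if count >= min_hosts and len(hosts_by_class[cls]) < min_hosts
    )

    if failed:
        return CRITICAL, "CRITICAL: %s class instances not spread out enough" % ", ".join(failed)
    return OK, "OK: all classes are spread out enough"


# Example with made-up instance and host names.
example_placements = {
    "tools-master": "labvirt1001",
    "tools-shadow": "labvirt1001",   # same host as tools-master
    "tools-exec-01": "labvirt1001",
    "tools-exec-02": "labvirt1002",
}
example_classes = {
    "master": ("tools-master", "tools-shadow"),
    "exec": ("tools-exec-",),
}
code, message = check_spread(example_placements, example_classes)
print(code, message)
```

A real plugin would print the message and exit with the returned code so that icinga can pick up the state.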
Description
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
openstack: Add a check to see if Tool Labs instances are spread enough | operations/puppet | production | +158 -0
Status | Assigned | Task
---|---|---
Resolved | None | T90534 Make toolforge reliable enough (tracking)
Declined | None | T91068 Set up a schedule for doing failover exercises for toollabs
Resolved | Andrew | T90542 Make sure that toollabs can function fully even with one virt* host fully down
Resolved | bd808 | T91072 Move toollabs instances around to minimize damage from a single downed virt* host
Resolved | yuvipanda | T101635 Write an icinga check to ensure that toollabs instances are appropriately distributed across labvirt** hosts
Event Timeline
Change 216661 had a related patch set uploaded (by Yuvipanda):
openstack: Add a check to see if Tool Labs instances are spread enough
Change 216661 merged by Yuvipanda:
openstack: Add a check to see if Tool Labs instances are spread enough
PROBLEM - Tool Labs instance distribution on virt1000 is CRITICAL master class instances not spread out enough
done!