As a first step, move tools-shadow off the host it shares with tools-master, to reduce the risk if that one host goes down.
Description
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | Andrew | T143349 Deprecate precise instances in Labs by 2017-03-31
Resolved | | yuvipanda | T94790 Phase out precise instances from Tool Labs
Resolved | | coren | T94791 Move tools-master and tools-shadow to trusty
Resolved | | Andrew | T103390 Labs: Move tools-shadow off the same host as tool-master
Event Timeline
Testing, in the versionmismatch project, how gridengine fares when the shadow and master run mismatched versions.
Gridengine does not seem to suffer from the version mismatch (6.2u5-4 vs 6.2u5-7.3) and the configuration ports without difficulty.
Therefore, the plan:
- Rebuild tools-shadow (precise) as tools-shadow-01 (trusty) in Tools, allow it to configure and stabilize, then switch masters to it.
- After a testing period, remove tools-shadow and create tools-master-01.
- Switch to tools-master-01.
- Remove tools-master once everything has been shown to work well.
If the roles of master and shadow are basically identical and they auto-discover who's in charge, couldn't we name them tools-master-01, tools-master-02, etc. like other services?
There is a subtle difference, at least structurally, in that the gridengine configuration itself makes the distinction. That is, while the shadows can take over the master role, there is one designated server that does not run the monitoring daemon and which is considered the canonical master.
For tools-redis-01 & Co., @yuvipanda used a scheme where the determination of master and server is done by setting $active_redis accordingly. So if something like this is possible with our gridengine setup, I think that would be very useful. However, that's not a blocker for this task.
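For illustration, a scheme of that kind could look roughly like the sketch below: all hosts carry neutral names and a single setting names the active one. This is a minimal Python sketch, not the actual Puppet code or gridengine configuration; `assign_roles` and its `active` parameter are hypothetical, and the host names are the ones suggested earlier in this thread.

```python
def assign_roles(hosts, active):
    """Return {host: role}; the host named by `active` is master, the rest are shadows.

    Hypothetical helper for illustration only.
    """
    if active not in hosts:
        raise ValueError(f"active host {active!r} is not in the host list")
    return {host: ("master" if host == active else "shadow") for host in hosts}


# Switching the master is then a one-setting change.
roles = assign_roles(["tools-master-01", "tools-master-02"], active="tools-master-02")
```

Whether gridengine's own notion of a canonical master allows this is exactly the open question raised above.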
We are seeing these again in Icinga, and a comment there from last time linked to this ticket:
https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=labcontrol1001&service=Tool+Labs+instance+distribution
https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=labcontrol1002&service=Tool+Labs+instance+distribution
I've just checked and tools-shadow is on labvirt1008 and tools-master is on labvirt1004.
@yuvipanda: It's not clear to me why the test is failing - do you have any insight?
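For reference, the manual check above amounts to confirming that the two instances do not share a virtualization host. A minimal sketch of that logic, assuming a plain instance-to-host mapping; this is not the actual Icinga check, and `shared_hosts` is a hypothetical helper.

```python
from collections import defaultdict


def shared_hosts(placement, group):
    """Return {host: [instances]} for hosts carrying more than one instance of `group`.

    `placement` maps instance name -> virtualization host (assumed input format).
    """
    by_host = defaultdict(list)
    for instance in group:
        by_host[placement[instance]].append(instance)
    return {host: sorted(names) for host, names in by_host.items() if len(names) > 1}


# Placement as observed by hand in the comment above.
placement = {"tools-master": "labvirt1004", "tools-shadow": "labvirt1008"}
conflicts = shared_hosts(placement, ["tools-master", "tools-shadow"])
```

With the placement observed above, `conflicts` is empty, which is why the failing test is surprising.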