BlazeGraph Finalization: Zookeeper
Closed, Resolved (Public)

Description

Figure out how we'll be using zookeeper. We should answer these questions:
Run it on the box running BlazeGraph?
On three VMs (assuming we have the virtualization infrastructure)?
Share with analytics?
Why did _joe_ make sad noises when he heard it was required?

Event Timeline

Manybubbles assigned this task to Joe.
Manybubbles raised the priority of this task from to Needs Triage.
Manybubbles updated the task description.
Manybubbles set Security to None.

First of all, sorry I did not get back to you earlier.

I don't like the idea of having a complex tool like zookeeper running just to ensure HA. This is actually pretty bad for me, but still not a blocker per se.

We certainly don't want to share a Zookeeper cluster with analytics: we have different needs and usage patterns, and we don't need to entangle our work with theirs.

We also don't want something as important as Zookeeper on VMs, IMO, at least not until our virtualization infrastructure is a bit more tested (right now we're just starting to build it in codfw).

So I'd say that as long as we plan from the start to have 1 master and N slaves for BlazeGraph with N>1 (a wise choice anyway), we can co-host Zookeeper on the same machines, provided this doesn't starve system resources in some way. I'll have to investigate that, but I guess it's a pretty common usage pattern.

In HA, our usage of ZK is a little different from that of systems like Hadoop, SolrCloud, etc. We only use ZK for the leader/follower elections, and we use Apache River for the distributed state transfer, i.e. writes/updates are not sent through ZK. As such, it's likely that you won't see issues with co-hosting ZK on the same machines. Since our usage is fairly light, we are also testing internally the effects of running an embedded ZK.
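
To illustrate why election-only usage is light, here is a minimal in-memory sketch of the commonly documented ZooKeeper leader-election recipe (each participant creates an ephemeral sequential node, and the lowest sequence number leads). It is not BlazeGraph's actual implementation and uses no real ZooKeeper client: the `FakeEnsemble` class and the node names `bg1` to `bg3` are hypothetical, invented only for this example.

```python
class FakeEnsemble:
    """In-memory stand-in for a ZooKeeper election path (hypothetical, not a real client)."""

    def __init__(self):
        self._next_seq = 0
        self._members = {}  # sequence number -> node name

    def join(self, name):
        """Register a node, analogous to creating an ephemeral sequential znode."""
        seq = self._next_seq
        self._next_seq += 1
        self._members[seq] = name
        return seq

    def leave(self, seq):
        """Drop a node, analogous to its session expiring and the znode vanishing."""
        self._members.pop(seq, None)

    def leader(self):
        """The member holding the lowest sequence number leads; None if nobody joined."""
        if not self._members:
            return None
        return self._members[min(self._members)]


ensemble = FakeEnsemble()
# Hypothetical node names for three co-hosted BlazeGraph/ZK machines.
seats = {name: ensemble.join(name) for name in ("bg1", "bg2", "bg3")}

first_leader = ensemble.leader()   # the first node to join leads
ensemble.leave(seats["bg1"])       # simulate the leader failing
second_leader = ensemble.leader()  # the next-lowest sequence number takes over
```

Only membership changes touch the election state here; no writes or updates flow through it, which matches the light usage described above.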

@Beebs.systap to be more explicit, it's highly probable we won't use ZK as our distributed, consistent KV store of choice internally, so maintaining a separate ZK cluster for BlazeGraph HA only would be too much of a hassle, hence my desire to co-host it. I also thought that, if this raises any concerns, we could use containers to segregate the two programs and prevent one from interfering with the other.

@Joe: We believe that is a viable approach.

Resolving. We expect to cohost zookeeper on 3ish BlazeGraph nodes, assuming we use BlazeGraph's HA at all.
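
For reference, a small sketch of the majority-quorum arithmetic behind the "3ish" figure. ZooKeeper needs a strict majority of the ensemble to be up, so the function names below are just illustrative helpers, not part of any ZooKeeper or BlazeGraph API.

```python
def quorum_size(ensemble_size):
    """Smallest strict majority of an ensemble with this many nodes."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many nodes can be down while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (1, 2, 3, 5):
    print(f"{n} node(s): quorum {quorum_size(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Under this arithmetic, two nodes tolerate no more failures than one, while three nodes survive the loss of any single machine, which is consistent with the 1 master plus N>1 slaves layout discussed above.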