
Choose a consistent, distributed k/v storage for configuration management/discovery
Closed, ResolvedPublic

Description

We strongly need a consistent key-value store to use in cluster discovery/coordination and configuration.

The potential candidates that, in my opinion, currently offer substantial adoption, a healthy development community, and feature-completeness are:

  • etcd
  • Consul
  • ZooKeeper

The chosen product should at least:

  • Guarantee read availability during a node failure, with consistency and recovery after a failure that are easy to understand and manage
  • Have decent write performance, and excellent read performance with very low latency under all operating conditions
  • Allow clients to watch a key or a whole tree of keys for changes (see the sketch below)
  • Allow easy backups
  • Work cross-datacenter (even with some limitations)
  • Have clean client libraries in most of the languages we use at the WMF
  • Allow (force?) encrypted connections from clients

Bonuses:

  • Easy to query from the CLI
  • Provide some level of authentication/authorization (grants)
  • Packaged in Debian
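
To make the watch and encrypted-connection requirements more concrete, here is a minimal sketch of the client-side interaction we are after, assuming etcd and the python-etcd client library; the hostname and key names are invented purely for illustration:

    # Minimal sketch, assuming etcd and python-etcd; hostname and keys are hypothetical.
    import etcd

    # TLS-enabled client connection (the "encrypted connections" requirement).
    client = etcd.Client(host='conf1001.example.wmnet', port=2379, protocol='https')

    # Write and read back a configuration value.
    client.write('/conf/pybal/appservers/mw1018', '{"pooled": true, "weight": 10}')
    print(client.read('/conf/pybal/appservers/mw1018').value)

    # Block until anything under the tree changes (long-poll style watch).
    change = client.read('/conf/pybal/appservers', recursive=True, wait=True)
    print(change.key, change.value)

The same read/write/watch operations map to one-line etcdctl invocations, which would also cover the "easy to query from the CLI" bonus.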

Event Timeline

Joe raised the priority of this task from to Needs Triage.
Joe updated the task description.
Joe added a project: acl*sre-team.
Joe subscribed.

Out of curiosity, since analytics already uses zookeeper for hive/kafka, maybe it should be given a try first and other solutions looked at if zookeeper does not match our needs. That would be one less technology introduced to the cluster. 0.02€

Andrew triaged this task as Medium priority. Apr 11 2015, 9:27 PM
Andrew set Security to None.

The ZK needs of analytics are completely different from the ones we have here, or I would surely have followed that path, @hashar.

But given that analytics is basically doing apt-get install zookeeper and leaving it to be managed and interacted with from Hadoop, it would be like choosing Apache to serve Wikipedia because we use it to serve Gerrit.

@Joe thank you for the explanation.

Probably one thing we should also think about is integrating config management with rolling deployments, where a subset of nodes might need different (new) configuration directives. Example: a service's v1 is deployed using configuration cA on 10 nodes, but the new version of the service, v2, uses config cB. The differences between cA and cB might be of various kinds (added/removed/changed keys, a new format, etc.), but it is easily conceivable that feeding cA to v2 or cB to v1 might bring the service down (or cause it to malfunction). In a rolling-deployment scenario, we usually have only a subset of machines using the new version (until it is confirmed to work) at any given point in time:

  • 3 machines running v2 using cB
  • 7 machines running v1 using cA

We thus need a way to provide both configs, or to ensure that each machine is using the correct one. Current approaches to config management include:

  • putting the configuration directly with the code to be deployed
    • good: factors in config version changes, can be changed at will by devs
    • bad: need one config per environment where everything is practically hard-coded, and, well, can be changed at will by devs :)
  • putting the configuration in ops/puppet
    • good: less hard-coded config directives, can be adapted dynamically based on the environment (prod, labs, beta, staging, etc), better config supervision as opsens need to +2 it
    • bad: once merged, the config is installed on all concerned machines regardless of their state (in terms of cA-vs-cB needs), and, well, opsens need to +2 it even for ultra-small changes

It follows that neither is entirely acceptable. With that in mind, I am not sure which approach should be taken wrt the discovery mechanism. How should it be fed the config - should it read it from puppet, or should the service provide it on start-up? How can we ensure rolling deploys will work with it in place?

I realise breaking config changes are evil and ideally should not happen at all. I'm kind of more thinking out loud here and fishing for other people's thoughts on this.
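
One way a k/v store could accommodate this, sketched below under the same assumptions as the earlier sketch (etcd with python-etcd, and an entirely hypothetical service name and key layout), is to namespace the configuration tree by deployed service version, so that v1 and v2 machines read, and watch, different subtrees during a rolling deploy:

    # Hypothetical layout: /conf/<service>/<deployed version>/<key>, so that nodes
    # running v1 and v2 consume different config subtrees during a rolling deploy.
    import etcd

    client = etcd.Client(host='conf1001.example.wmnet', port=2379, protocol='https')

    # Both config versions live in the store at the same time.
    client.write('/conf/someservice/v1/workers', '24')
    client.write('/conf/someservice/v2/workers', '32')
    client.write('/conf/someservice/v2/new_feature_flag', 'true')

    # Each node reads (and watches) only the subtree matching the version it runs;
    # the deployed version would come from the deploy tooling on that node.
    deployed_version = 'v2'
    subtree = '/conf/someservice/%s' % deployed_version
    for node in client.read(subtree, recursive=True).leaves:
        print(node.key, node.value)

Once the rolling deploy has completed and v1 is gone, its subtree can simply be deleted, which avoids ever feeding cA to v2 or cB to v1.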

@marko I think everything you state here is something that will be enabled by this software, once integrated with our tools (salt, puppet, pybal, etc.).

The specifics will need to be ironed out for sure, but probably not in the ticket about the configuration store :)

In T95656#1221186, @Joe wrote:

@marko I think everything you state here is something that will be enabled by this software, once integrated with our tools (salt, puppet, pybal, etc.).

Good, thnx.

The specifics will need to be ironed out for sure, but probably not in the ticket about the configuration store :)

Probably, but I was just putting it out there for consideration (better safe than sorry).

Out of curiosity, since analytics already uses zookeeper for hive/kafka, maybe it should be given a try first and other solutions looked at if zookeeper does not match our needs. That would be one less technology introduced to the cluster. 0.02€

Just adding my own $0.02 here since this keeps coming up in related IRC/email conversations: I don't think analytics' use of ZK is much of an argument here, either. What we're looking to do here is a very specialized thing that will be deeply integrated with some of our front-line / outage-sensitive infrastructure, and the requirements are completely different in terms of interfaces, data size/schema, geographic/replication issues, fault/isolation tolerance, etc.

I don't think analytics' use of ZK is much of an argument here, either.

+1. We have 3 ZK servers that are used by Kafka for leader election and for occasional non-production consumer offset management. ZooKeeper works great, but that is because Kafka has been coded to work with it. I have a hunch that it would be a pain to use for these other opsy things.

Since no one really complained about my evaluation, we'll go on with etcd for now.

I was really just wondering about pre-existing usage of ZooKeeper. @Joe promptly addressed it at T95656#1220342 :-]

Welcome etcd!