
Create a service location / discovery system for locating local/master resources easily across all WMF applications
Closed, ResolvedPublic

Description

As we work on making the switchover process as easy as possible, it appears pretty clear to me that everything is going to be easier if we implement something like https://gerrit.wikimedia.org/r/#/c/266509 to work across all applications, and not just for MediaWiki.

I'm unsure whether this should be a blocker, but it would definitely make things way easier for ops when doing the switchover, and would standardize definitions across different applications.

In short, we need some sort of simple, even rudimentary, discovery system, and possibly to integrate it into our applications.

Simple forms of this could be:

  1. Add DNS records for all services in the various DCs, plus a set for the master
  2. Create a JSON/YAML file containing all the definitions and distribute it via puppet (a rough sketch of such a file follows below)
  3. Add a series of entries to conftool and store the data in etcd. The data can then be polled directly by applications, or used to generate a JSON file via confd
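
As a very rough sketch of what the file from option 2 (or the confd-generated one from option 3) could look like to a consumer, assuming a hypothetical path and purely illustrative service names and URLs:

// Hypothetical shape of a puppet- or confd-distributed discovery file;
// none of the names below are an agreed format, they only illustrate the idea.
interface ResourceUrls {
  local: string;   // resource in the same DC as this host
  master: string;  // resource in the DC currently holding the master role
}

// e.g. the parsed contents of /etc/discovery/services.json on a codfw host
const discovery: Record<string, ResourceUrls> = {
  'mediawiki-api': {
    local: 'https://api.svc.codfw.wmnet/w/api.php',
    master: 'https://api.svc.eqiad.wmnet/w/api.php',
  },
};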

Event Timeline

Joe raised the priority of this task from to Medium.
Joe updated the task description.
Joe added subscribers: faidon, Aklapper, Joe.
ori added a subscriber: aaron.

Some comments:

Network level switching

While the way we use LVS does not easily let us switch backends across DCs, there might be other means of switching traffic at the network level. A big plus of switching at the network level would be atomic switching, and the guarantee that all users are switched at once. However, it sounds like network-level switching isn't feasible short-term.

etcd watches

In theory, etcd watches could offer low-latency updates of configuration variables like the MW Action API URL. Depending on how this is set up, however, this could result in many dozens to hundreds of watches. Is this doable in etcd?
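
For reference, a minimal sketch of what a single etcd (v2 API) watch looks like from a node service, using long-polling over HTTP; the host, port and key path are assumptions for illustration only:

import * as http from 'http';

// Long-poll a single etcd key and re-arm the watch after every change.
// A production watcher would also track modifiedIndex / waitIndex so that
// changes happening between requests are not missed.
function watchKey(key: string, onChange: (value: string) => void): void {
  const req = http.get(
    { host: 'conf1001.eqiad.wmnet', port: 2379, path: `/v2/keys${key}?wait=true` },
    (res) => {
      let body = '';
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => {
        onChange(JSON.parse(body).node.value);
        watchKey(key, onChange);
      });
    }
  );
  req.on('error', (err) => console.error('watch error:', err));
}

watchKey('/discovery/mediawiki-api', (url) => console.log('new API URL:', url));

Each watched key holds one such long-poll connection open, which is where the "many dozens to hundreds of watches" concern comes from.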

DNS vs. etcd polling

An alternative is DNS or etcd polling. Neither offers instant switching, but DNS in particular has the advantage of being very non-invasive.

A concern with DNS is buggy clients ignoring TTLs, which would be exposed by relying on it for switching. Those bugs are worth fixing in any case though, and tend to be more common with specialized / less critical libraries like metric reporters. To establish whether DNS is feasible or not, we need to explicitly test switching / TTL handling for the MW Action API accesses.

Config updates & rolling restart

A manual and relatively slow method would be to deploy a new config to all services. This involves rolling restarts per service.

I looked a bit into DNS as an option. By default, Node.js uses the c-ares library for asynchronous DNS resolution, and does not cache the results at all. There are add-on libraries that can add caching, but we aren't using any of those at the moment. It might actually make sense to start using one of them, with a very short TTL of a couple of seconds.
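
For example, with the dnscache npm module (one of the add-on caching libraries mentioned above; the TTL value below is only an assumption, not a decided setting) this would look roughly like:

// dnscache monkey-patches node's built-in dns module, so subsequent lookups
// made by the http / request modules go through the cache as well.
const dnscache = require('dnscache');

dnscache({
  enable: true,
  ttl: 5,          // keep entries for a few seconds only
  cachesize: 1000,
});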

From looking at /etc/resolv.conf, our servers in eqiad don't seem to be set up with a local caching DNS resolver, so all of those requests should hit the configured DNS servers.

This leaves libraries. The request library we are currently using for HTTP requests does not perform any caching of its own either (it's a wrapper around node's http module).

Overall, it looks like services should pick up a DNS change for the MW API almost instantly. @Joe, is there a good way to test updating a DNS entry in production, to make sure that no DNS caching intervenes? I can set up a small node script performing periodic http requests, printing the response to check how quickly the HTTP target follows the DNS change.
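
Something along these lines is what I have in mind for that test script (the URL and polling interval are placeholders; in the real test it would point at the MW Action API endpoint being switched):

import * as https from 'https';

const url = 'https://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&format=json';

// Request the API every couple of seconds and log which address answered,
// so a DNS switch shows up as a change in the remote address.
setInterval(() => {
  const req = https.get(url, (res) => {
    res.resume(); // drain the body; we only care about where it came from
    console.log(new Date().toISOString(), res.socket.remoteAddress, res.statusCode);
  });
  req.on('error', (err) => console.error(err.message));
}, 2000);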

I have now verified that there is no DNS caching at all for HTTP requests in Parsoid, and thus, more generally, in node services. A packet dump clearly shows many repeat DNS requests for en.wikipedia.org when parsing a large article in Parsoid. With the dnscache module added, those disappear.

So, unless there is some transparent and broken caching between the hosts and the configured central DNS servers, switching the MediaWiki API IP used by node services should work reliably and quickly using DNS.

@Joe, I think we can move ahead with DNS, especially for node services. For the first iteration, I'm guessing the plan is to manually update DNS records?

Longer term, an etcd-backed DNS and possibly HTTP service would be helpful. Examples would be SkyDNS or Consul. What is your view on this?

I'd really prefer it if we avoided a DNS-based solution. If it's too late for using etcd, I'd honestly prefer local reconfiguration (as MediaWiki does) instead of DNS for our interim solution.

Currently, local reconfiguration requires service deploys and restarts, so it would be slower and more complex than DNS or other service discovery mechanisms.

Could you elaborate on why you would prefer to avoid DNS, considering that we verified that caching is not an issue for node services?

> Currently, local reconfiguration requires service deploys and restarts, so it would be slower and more complex than DNS or other service discovery mechanisms.
>
> Could you elaborate on why you would prefer to avoid DNS, considering that we verified that caching is not an issue for node services?

Well, one reason you're already mentioning yourself: node services aren't the only services we need to support. And it may not even necessarily be the case for *all* node services in the future, either.

In my opinion, we should create a system that satisfies the following needs:

  1. Be easily integrated with puppet
  2. Be exposed in a simple on-disk form on all systems (env variables could be a decent and universal-enough format, but I'd lean towards JSON)
  3. Be queryable via DNS, preferably as SRV records
  4. Be watchable for changes by applications that are designed to take advantage of that (see the sketch below)
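
As a minimal sketch of need 4, assuming the on-disk JSON form from need 2 (the file path and its shape are hypothetical):

import { watch, readFileSync } from 'fs';

const DISCOVERY_FILE = '/etc/discovery/services.json';

// Load the discovery data once at startup...
let services = JSON.parse(readFileSync(DISCOVERY_FILE, 'utf8'));

// ...and re-read it whenever confd / puppet rewrites the file.
watch(DISCOVERY_FILE, () => {
  try {
    services = JSON.parse(readFileSync(DISCOVERY_FILE, 'utf8'));
    console.log('discovery data reloaded');
  } catch (err) {
    // keep the previous data if the file is caught mid-rewrite
    console.error('failed to reload discovery data:', err);
  }
});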

I think we can easily add support for such a scheme to conftool/etcd.

What should this discovery system expose?

I think in general we'd need to distinguish between local and master resources, so that any client can request either one.

So, for each of those (where a distinction exists) we should expose:

  1. The hostname / IP of the resource
  2. A fully parsed URL as a data structure / separate env variables (as in Python's urlparse format or something similar)
  3. A full URL for the service

All of this could easily be translated into the above-mentioned formats.
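
As a sketch only (the field names below follow the urlparse-style split and are my own, not a settled schema), the per-resource data might translate to something like:

// What the discovery system would expose per resource, in JSON form.
interface DiscoveredResource {
  host: string;    // hostname or IP of the resource
  scheme: string;  // parsed-URL components, urlparse-style
  port: number;
  path: string;
  url: string;     // the full URL, pre-assembled for simple consumers
}

// One entry per service, with a master variant only where the
// local/master distinction actually exists.
interface DiscoveredService {
  local: DiscoveredResource;
  master?: DiscoveredResource;
}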

I think DNS resolution shouldn't be relied upon in general, because it makes the discovery-backed DNS a single point of failure.

Over the weekend I hacked on conftool a bit, and I think it would not be that hard to integrate such functionality into conftool.

So, to flesh out my ideas a bit more:

  • Services will be held in conftool in the form ../discovery/global/<service_name> with the following data:
datacenters: [eqiad,codfw]
route: local|eqiad|codfw

where route indicates whether requests should by default be directed to the local DC or to the master

  • Local services will be records in the form .../discovery/svc/<service_name>/<datacenter>
  • Content will be in the form:
# discovery/svc/graphite/eqiad
scheme: "https"
host: "graphite1001.eqiad.wmnet"
port: "9090"
path: "/collect"
...

This can easily be translated into on-disk data, either in JSON form or as environment variables. It would be enough for applications to read this file for each request (or watch it for changes) and just re-read the dictionary, which will be in the form:

# example for a server in codfw
restbase:  { 'default': https://restbase.svc.codfw.wmnet:7231/v1/api }
...
mediawiki_api: {
  default: https://api.svc.eqiad.wmnet:80/w/api.php
  readonly: https://api.svc.codfw.wmnet:80/w/api.php

}

Such a structure would be easy to interpret both for simple applications (which will just use the default URL) and for complex active-active applications, which will be able to know both the local and the master resource URLs and react accordingly.
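
For illustration, this is roughly how a consumer could pick a URL from such a dictionary (field and service names follow the example above and are not final):

interface ServiceUrls {
  'default': string;    // where requests go unless the app knows better
  'readonly'?: string;  // local read-only copy, when one is advertised
}

// Simple applications just take 'default'; active-active ones can send
// read requests to the read-only (local) endpoint when it is present.
function urlFor(service: ServiceUrls, isReadRequest: boolean): string {
  if (isReadRequest && service['readonly'] !== undefined) {
    return service['readonly'];
  }
  return service['default'];
}

// e.g. on a codfw host, urlFor(discovery['mediawiki_api'], true)
// would return 'https://api.svc.codfw.wmnet:80/w/api.php'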

This data could also easily be plugged into SkyDNS or pdns, in a simple form like:

dig +short -t SRV restbase.svc.discovery
;; ANSWER SECTION:
restbase.svc.discovery 3600 IN SRV 10 0 7231   restbase.svc.eqiad.wmnet

or as simple A records:

dig +short eqiad.restbase.svc.discovery
restbase.svc.eqiad.wmnet
...

I will work on a stub implementation of this.

Joe removed Joe as the assignee of this task. Oct 5 2016, 7:56 AM
Joe added a project: User-Joe.

For all use cases I can think of in services, regular A / CNAME records would actually be easier to consume. We generally use known, fixed ports for specific services, so there is no need for port discovery. With regular A / CNAME records, regular DNS lookups would just work, without a need to change any logic in services.
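
To illustrate the point (hostnames are taken from the examples above or are otherwise illustrative): with an A / CNAME record, existing code that already takes a hostname, such as https.get('https://restbase.svc.eqiad.wmnet:7231/...'), keeps working unchanged, because resolution happens as part of the request. SRV records, on the other hand, would need an explicit extra lookup in every client:

import * as dns from 'dns';

// Resolve host and port from the SRV record before making the request.
dns.resolveSrv('restbase.svc.discovery', (err, records) => {
  if (err) { throw err; }
  const { name, port } = records[0];
  console.log(`would connect to ${name}:${port}`);
});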