
DNS: dynamically generate entries for service discovery
Closed, Resolved (Public)

Description

As stated in the parent task (T149617), we are going to automatically generate the DNS configuration for the configured services, so that they can be discovered by querying DNS directly. The proposed solution follows.

When querying services that are present in multiple datacenters, the response will be generated according to the following schema:

| Service capability | Query | Default response | Response if default is DOWN |
| --- | --- | --- | --- |
| active/active | RO endpoint | Local DC endpoint IP | Remote DC endpoint IP |
| active/active | RW endpoint | Local DC endpoint IP | Remote DC endpoint IP |
| active/passive | RO endpoint | Local DC endpoint IP | Remote DC endpoint IP |
| active/passive | RW endpoint | Active (from etcd) DC endpoint IP | Local failover IP |
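
The table above can be read as a small decision function. The following is a minimal sketch of that reading, not the actual gdnsd configuration: the function name, the DC names `dc-a`/`dc-b`, and the IPs (taken from documentation ranges) are all assumptions made for illustration.

```python
from typing import Dict, Optional, Set

# Hypothetical sample data: DC names and IPs are placeholders.
ENDPOINTS = {"dc-a": "192.0.2.10", "dc-b": "198.51.100.10"}
FAILOVER = {"dc-a": "192.0.2.53", "dc-b": "198.51.100.53"}


def pick_answer(
    capability: str,            # "active/active" or "active/passive"
    query: str,                 # "ro" or "rw"
    local_dc: str,              # DC closest to the client
    endpoints: Dict[str, str],  # DC name -> service endpoint IP
    failover: Dict[str, str],   # DC name -> per-DC failover IP
    up: Set[str],               # DCs whose endpoint is currently UP
    active_dc: Optional[str] = None,  # active DC (from etcd), active/passive only
) -> str:
    """Return the IP the schema table prescribes for one query."""
    if capability == "active/passive" and query == "rw":
        # Only the active DC may answer RW queries; otherwise fall back to
        # the local failover IP so the client still gets a valid answer.
        if active_dc in up:
            return endpoints[active_dc]
        return failover[local_dc]
    # Every other row: local DC first, then any remote DC that is UP.
    if local_dc in up:
        return endpoints[local_dc]
    for dc in sorted(endpoints):
        if dc != local_dc and dc in up:
            return endpoints[dc]
    # No endpoint available at all: still return a valid IP.
    return failover[local_dc]
```

For example, `pick_answer("active/passive", "rw", "dc-a", ENDPOINTS, FAILOVER, up={"dc-a"}, active_dc="dc-b")` falls through to the failover IP of `dc-a`, because the active DC is not UP.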

The DNS configuration will be generated with all the endpoints. For an active/passive service, when the RW endpoint is queried, only the active endpoint will be in an UP state and the other(s) will be in a DOWN state.

For active/passive services, the mechanism that updates the state file monitored by gdnsd must ensure that only one endpoint is UP at any given time.
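
One way to meet this requirement is to render the whole state in memory, check the invariant, and then swap the file in with a single atomic rename, so a reader never sees a half-written file or two UP endpoints. The sketch below assumes this approach. The function names and the `resource/dc => STATE` line format are illustrative assumptions, not confirmed gdnsd syntax or the mechanism actually deployed.

```python
import os
import tempfile
from typing import Iterable, List


def render_state(resource: str, dcs: Iterable[str], active_dc: str) -> List[str]:
    """Build one state line per DC, marking only the active DC as UP."""
    dcs = list(dcs)
    if active_dc not in dcs:
        raise ValueError(f"unknown active DC: {active_dc}")
    # Hypothetical line format, used here only for illustration.
    return [f"{resource}/{dc} => {'UP' if dc == active_dc else 'DOWN'}" for dc in dcs]


def write_state_atomically(path: str, lines: List[str]) -> None:
    """Refuse to write unless exactly one endpoint is UP, then rename into place."""
    ups = sum(line.endswith("=> UP") for line in lines)
    if ups != 1:
        raise ValueError(f"expected exactly one UP endpoint, found {ups}")
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the target directory so the rename
    # stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write("\n".join(lines) + "\n")
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp, path)  # atomic rename on POSIX filesystems
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Writing the complete file in one step, rather than flipping one endpoint DOWN and another UP in two separate writes, avoids any intermediate state in which zero or two endpoints are UP.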

As a fallback when no endpoint is available, a valid IP will still be returned. This avoids problems with clients that misbehave when they receive no DNS answer, and with DNS negative caching.
The failover IP will simply answer every request with a 503. There will be one failover IP per DC, so that the response always uses the local one.
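
As an illustration of what could sit behind a failover IP, here is a minimal sketch of an HTTP responder that answers every request with a 503, using only the Python standard library. The class and function names are hypothetical, and the task does not say what software actually serves the failover IPs.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class FailoverHandler(BaseHTTPRequestHandler):
    """Answer every request with 503 Service Unavailable."""

    def _unavailable(self):
        body = b"Service Unavailable\n"
        self.send_response(503)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if self.command != "HEAD":
            self.wfile.write(body)

    # Route the common HTTP methods to the same 503 response.
    do_GET = do_HEAD = do_POST = do_PUT = do_DELETE = do_PATCH = do_OPTIONS = _unavailable

    def log_message(self, format, *args):
        # Keep the sketch quiet; a real deployment would want logging.
        pass


def make_failover_server(host: str = "127.0.0.1", port: int = 0) -> HTTPServer:
    """Create the server; port 0 lets the OS choose a free port."""
    return HTTPServer((host, port), FailoverHandler)
```

Calling `make_failover_server().serve_forever()` would start the responder. An explicit 503 lets clients fail fast with a clear status instead of timing out or caching a negative DNS answer.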

Details

Related Gerrit Patches:
[operations/puppet : production] restbase: use the dns discovery host for citoid
[operations/dns : master] add first discovery records + mock lint data
[operations/dns : master] linting: remove config-geo-test
[operations/puppet : production] authdns lint support for full puppetized config
[operations/puppet : production] DNS: service discovery
[operations/dns : master] geo config structure changes for discovery
[operations/puppet : production] authdns: re-structure prep for discovery

Event Timeline

Volans created this task. Jan 24 2017, 5:16 AM

Change 331789 had a related patch set uploaded (by Volans):
[WIP] DNS: service discovery

https://gerrit.wikimedia.org/r/331789

Krinkle moved this task from Inbox to Radar on the Performance-Team board. Jan 26 2017, 10:28 PM
BBlack added a comment (edited). Jan 30 2017, 5:23 PM

We should probably divorce the RO/RW distinction from the core design here. Not all services will have an RW/RO distinction (I would expect most not to), and where one exists it is something we will try to eliminate with better (active/active) design over time. If a specific service needs a split into "active/passive RW + active/active RO", we can solve that by calling it two separate services at this level: foo-rw and foo-ro, with different active/passive rules and distinct failover.

> If a specific service needs a split into "active/passive RW + active/active RO", we can solve that by calling it two separate services at this level: foo-rw and foo-ro, with different active/passive rules and distinct failover.

+1 to using names throughout. Plain A/CNAME lookups are a lot easier to use, and encoding the ro/rw bit in the name calls out the distinction (where needed) clearly.
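
The foo-rw / foo-ro split suggested in the comments above could be modeled roughly as follows. This is only a sketch of the naming idea: the `DiscoveryRecord` type, the `split_service` helper, and its parameters are hypothetical and do not reflect the structure of the actual puppet or DNS configuration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DiscoveryRecord:
    name: str            # DNS label, e.g. "foo-rw"
    active_active: bool  # True: any UP DC may answer; False: only the active DC


def split_service(base: str, needs_rw_ro_split: bool, active_active: bool) -> List[DiscoveryRecord]:
    """Expand one logical service into the discovery records it needs."""
    if not needs_rw_ro_split:
        # Most services: a single name with a single failover rule.
        return [DiscoveryRecord(base, active_active)]
    # Split case: treat RW and RO as two independent services, each with
    # its own rule (active/passive for RW, active/active for RO).
    return [
        DiscoveryRecord(f"{base}-rw", active_active=False),
        DiscoveryRecord(f"{base}-ro", active_active=True),
    ]
```

With this shape, the RO/RW distinction lives only in the record names, and the core logic needs to know just one thing per record: whether it is active/active or active/passive.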

Change 340154 had a related patch set uploaded (by BBlack; owner: BBlack):
geo config structure changes for svc discovery

https://gerrit.wikimedia.org/r/340154

Change 340156 had a related patch set uploaded (by BBlack; owner: BBlack):
authdns: re-structure prep for discovery

https://gerrit.wikimedia.org/r/340156

Addshore removed a subscriber: Addshore. Feb 28 2017, 5:53 PM
Krinkle removed a subscriber: Krinkle. Feb 28 2017, 8:24 PM

Change 340156 merged by BBlack:
authdns: re-structure prep for discovery

https://gerrit.wikimedia.org/r/340156

Change 340154 merged by BBlack:
geo config structure changes for discovery

https://gerrit.wikimedia.org/r/340154

Change 331789 merged by BBlack:
[operations/puppet] DNS: service discovery

https://gerrit.wikimedia.org/r/331789

Change 341564 had a related patch set uploaded (by bblack):
[operations/puppet] authdns lint support for full puppetized config

https://gerrit.wikimedia.org/r/341564

Change 341573 had a related patch set uploaded (by bblack):
[operations/dns] linting: remove config-geo-test

https://gerrit.wikimedia.org/r/341573

Change 341574 had a related patch set uploaded (by bblack):
[operations/dns] add first discovery records

https://gerrit.wikimedia.org/r/341574

Change 341564 merged by BBlack:
[operations/puppet] authdns lint support for full puppetized config

https://gerrit.wikimedia.org/r/341564

Change 341573 abandoned by BBlack:
linting: remove config-geo-test

https://gerrit.wikimedia.org/r/341573

Change 341574 merged by BBlack:
[operations/dns] add first discovery records + mock lint data

https://gerrit.wikimedia.org/r/341574

Change 343926 had a related patch set uploaded (by Giuseppe Lavagetto):
[operations/puppet] restbase: use the dns discovery host for citoid

https://gerrit.wikimedia.org/r/343926

Change 343926 merged by Giuseppe Lavagetto:
[operations/puppet] restbase: use the dns discovery host for citoid

https://gerrit.wikimedia.org/r/343926

Joe closed this task as Resolved. Apr 3 2017, 6:38 AM